Haskell function overloading - haskell

I have to overload function ordering. The argument of the function is a number and they have to be ordered by their value for the argument 0. This is what I tried:
Prelude>instance (Num a, Ord b) => Ord (a -> b) where f > g = f 0 > g 0
but it generates the error
Could not deduce (Eq (a -> b))
arising from the superclasses of an instance declaration
from the context (Num a, Ord b)
bound by the instance declaration at <interactive>:117:10-39
In the instance declaration for `Ord (a -> b)'
I would also like to instantiate the Ord class for lists. Lists ordering will be given by the comparison between first element of each list. For example [1,2] < [2,3] because 1 < 2.
instance Ord a => Ord [a] where
(h1:_) <= (h2:_) = h1 <= h2
This also generates the next error:
Ambiguous occurrence `<='
It could refer to either `Main.<=',
defined at C:\Users\user-name\Desktop\test.hs:2:8
or `Prelude.<=',
imported from `Prelude' at C:\Users\user-name\Desktop\test.hs:1:1
(and originally defined in `GHC.Classes')
I think I may have not understood very well the function overloading in Haskell. Maybe someone could explain me what I do wrong.

Don't do this. You're defining orphan instances here (instances that live neither in the module of the data nor the class they belong to); such instances are “invisible” in imports, which is generally a pain for larger project maintenance. (Sometimes you won't get around orphan instances, in particular when combining data and classes from unrelated packages; but functions, lists and Ord are all in the base library. If an instance isn't defined there, you can be pretty sure that there's a good reason!
If you do define a class instance, look at the class definition! Ord is defined thus:
class (Eq a) => Ord a where
...
that means, any instance of Ord must be an instance of Eq too. This is also what the compiler is telling you. So, if you do define this instance, you also need to add
instance (Num a, Eq b) => Eq (a -> b) where f==g = f 0 == g 0
This is the only instance that is compatible with your Ord. Frankly, it's just wrong though, since most functions that are deemed equal by this instance are not equal!
Haskell does not have overloading as such, at all. If you define a function <= in your module Main, this is a completely unrelated function from the standard <= aka Prelude.<=. You can define such a new function, but to use it you will need to disambiguate. The following should work:
instance Ord a => Ord [a] where
(h1:_) <= (h2:_) = h1 Main.<= h2
again though, this is not a good idea – you simply shouldn't define Main.<= in the first place, but put its definition right in the Ord instance.

Related

Instance signatures: constraints on methods

With InstanceSignatures, I can give a signature for a method within an instance decl.
The type signature in the instance declaration must be more polymorphic than (or the same as) the one in the class declaration, instantiated with the instance type.
I can see from this answer that the type part of the sig can't be more specific; but why couldn't you add extra constraints? After all you can put constraints on the instance decl that will make the instance more specific than "the class declaration, instantiated with the instance type".
Addit: (In response to the first couple of comments.) To explain "you can put constraints on the instance decl ... instance more specific": the OVERLAPPABLE instance head here alleges it provides addNat for all types a, b, c. But it doesn't: it only provides if a is of the form SNat a', c is of the form SNat c', etc. I'm asking why I can't similarly restrict the types at the method level?
For example [adapted from Hughes 1999] (more realistically this could be a BST/Rose tree, etc):
data AscList a = AscList [a] -- ascending list
instance Functor AscList where -- constructor class
fmap f (AscList xs) = AscList $ sort $ fmap f xs
-- :: Ord b => (a -> b) -> AscList a -> AscList b -- inferred
Without an instance signature GHC complains no instance for (Ord b). With the signature GHC complains the instantiated sig from the class is more polymorphic than the one given.
This (excellent) answer explains the machinery in the instance dictionary. I can see the entry in the dictionary for the method has a type set from the number of constraints for the method (if any) in the class decl. There's no room to put an extra dictionary/parameter for the method.
That seems merely that the implementation didn't foresee a need. Is there a deeper/more theory-based reason against?
(BTW I'm not ever-so convinced by Hughes' approach: that would want both Ord a => WFT(AscList a) and Ord b => WFT(AscList b). But there's no need to require the incoming (AscList a) is ascending. Other constructor classes for AscList perhaps don't need an Ord constraint anywhere.)
If you allow adding constraints to instance methods that aren't already in the class method, a polymorphic function using type classes might no longer typecheck when you specialize it.
For example, consider your AscList instance above, and the following usage of Functor:
data T f = T (f (IO ())) -- f applied to a non-Ord
example :: Functor f => T f -> T f
example (T t) = T (fmap (\x -> x >> print 33) t)
The type of example says you can instantiate f with any functor, such as f = AscList.
But that makes no sense because it would try to sort an AscList (IO ()).
The only way to tell that something goes wrong when we specialize example here is to read its definition, to find whether uses of fmap are still valid, and that goes against modularity.
The type class constraints in a type signature do not and should not record how their methods are used. Conversely, that means that instances do not get to add constraints to their method implementations.

In Haskell, express constraints on class header or class methods?

I've recently realized this thing:
On the one hand:
Constraints specified on a class header, must be specified again on an instance of that class, but any use of that class as a constraint somewhere else does not need to reimport the class constraints. They are implicitly satisfied.
class (Ord a) => ClassA a where methodA :: a -> Bool -- i decided to put constraint (Ord a) in the class header
instance (Ord a) => ClassA a where methodA x = x <= x -- compiler forces me to add (Ord a) => in the front
class OtherClassA a where otherMethodA :: a -> Bool
instance (ClassA a) => OtherClassA a where otherMethodA x = x <= x && methodA x -- i don't need to specify (Ord a) so it must be brought implicitly in context
On the other hand:
A constraint specified in a class method, does not need to be specified again on an instance of that class, but any use of that class as a constraint somewhere else, needs to reimport specific constraints for method used.
class ClassB a where methodB :: (Ord a) => a -> Bool -- i decided to put constraint (Ord a) in the method
instance ClassB a where methodB x = x <= x -- i don't need to specify (Ord a) so it must be implicitly in context
class OtherClassB a where otherMethodB :: a -> Bool
instance (ClassB a, Ord a) => OtherClassB a where otherMethodB = methodB -- compiler forced me to add (Ord a)
What is the motivation for those behaviors? Would it not be preferrable to be explicit about constraints at all times?
More concretely when I have a set of conditions I know all methods in a type class should satisfy, should I write the conditions, in the type class header, or in front of each method?
Should I write constraints in a type class definition at all?
Short Answer
Here's my general advice on constraints in class declarations and instance definitions. See below for a longer explanation and a detailed description of your examples.
If you have classes with a logical relationship such that it is logically impossible for a type to belong to class Base without belonging to class Super, use a constraint in the class declaration, like so:
class Super a => Base a where ...
Some example:
-- all Applicatives are necessarily Functors
class Functor f => Applicative f where ...
-- All orderable types can also be tested for equality
class Eq f => Ord f where ...
-- Every HTMLDocument also supports Document methods
class Document doc => HTMLDocument doc where ...
Avoid writing instances that apply to all types, with or without constraints. With a few exceptions, these usually point to a design flaw:
-- don't do this
instance SomeClass1 a
-- or this
instance (Eq a) => SomeClass1 a
Instances for higher-order types make sense though, and use whatever constraints are necessary for the instance to compile:
instance (Ord a, Ord b) => Ord (a, b) where
compare (x1,x2) (y1,y2) = case compare x1 x2 of
LT -> LT
GT -> GT
EQ -> compare x2 y2
Don't use constraints on class methods, except where a class should support different subsets of methods on different types, depending on the constraints available.
Long Answer
Constraints in class declarations and instance definitions have different meanings and different purposes. A constraint in a class declaration like:
class (Big a) => Small a
defines Big as a "superclass" of Small and represents a type-level claim of a logical necessity: any type of class Small is necessarily also a type of class Big. Having such a constraint improves type correctness (since any attempt to define a Small instance for a type a that doesn't also have a Big instance -- a logical inconsistency -- will be rejected by the compiler) and convenience, since a Small a constraint will automatically make available the Big class interface in addition to the Small interface.
As a concrete example, in modern Haskell, Functor is a superclass of Applicative which is a superclass of Monad. All Monads are Applicatives and all Applicatives are Functors, so this superclass relationship reflects a logical relationship between these collections of types, and also provides the convenience of being able to use the monad (do-notation, >>=, and return), applicative (pure and <*>) and functor (fmap or <$>) interfaces using only a Monad m constraint.
A consequence of this superclass relationship is that any Monad instance must also be accompanied by an Applicative and Functor instance to provide evidence to the compiler that the necessary superclass constraints are satisfied.
In contrast, a constraint in an instance definition introduces a dependency of the specific, defined instance on another instance. Most commonly, I see this used to define instances for classes of high-order types, like the Ord instance for lists:
instance Ord a => Ord [a] where ...
That is, an Ord [a] instance can be defined for any type a using lexicographic ordering of the list, provided that the type a itself can be ordered. The constraint here does not (and indeed could not) apply to all Ord types. Rather, the instance definition provides an instance for all lists by introducing a dependency on an instance for the element type -- it says that an Ord [a] instance is available for any type a that has an Ord a instance available.
Your examples are somewhat unusual, as one doesn't normally define an instance:
instance SomeClass a where ...
that applies to all types a, with or without additional constraints.
Nonetheless, what's happening is that:
class (Ord a) => ClassA a
is introducing a logical type-level fact, namely that all types of class ClassA are also of class Ord. Then, you are presenting an instance of ClassA applicable to all types:
instance ClassA a
But, this presents a problem to the compiler. Your class declaration has stated that is it logically necessary that all types of ClassA also belong to class Ord, and the compiler requires proof of the Ord a constraint for any instance ClassA a that you define. By writing instance ClassA a, you are making the bold claim that all types are of ClassA, but the compiler has no evidence that all classes have the necessary Ord a instances. For this reason, you must write:
instance (Ord a) => ClassA a
which, in words, says "all types a have an instance of ClassA provided an Ord a instance is also available". The compiler accepts this as proof that you are only defining instances for the those types a that have the requisite Ord a instance.
When you go on to define:
class OtherClassA a where
otherMethodA :: a -> Bool
instance (ClassA a) => OtherClassA a where
otherMethodA x = x <= x && methodA x
since OtherClassA has no superclass, there is no logical necessity that types of this class are also of class Ord, and the compiler will require no proof of this. In your instance definition however, you define an instance applicable to all types whose implementation requires Ord a, as well as ClassA a. Fortunately, you've provided a ClassA a constraint, and because Ord is a superclass of ClassA, it is a logical necessity that any a with a ClassA a constraint also has an Ord a constraint, so the compiler is satisfied that a has both required instances.
When you write:
class ClassB a where
methodB :: (Ord a) => a -> Bool
you are doing something unusual, and the compiler tries to warn by refusing to compile this unless you enable the extension ConstrainedClassMethods. What this definition says is that there is no logical necessity that types of class ClassB also be of class Ord, so you're free to define instances that lack the require instance. For example:
instance ClassB (Int -> Int) where
methodB _ = False
which defines an instance for functions Int -> Int (and this type has no Ord instance). However, any attempt to use methodB on such a type will demand an Ord instance:
> methodB (*(2::Int))
... • No instance for (Ord (Int -> Int)) ...
This can be useful if there are several methods and only some of them require constraints. The GHC manual gives the following example:
class Seq s a where
fromList :: [a] -> s a
elem :: Eq a => a -> s a -> Bool
You can define sequences Seq s a with no logical necessity that the elements a be comparable. But, without Eq a, you're only allowed to use a subset of the methods. If you try to use a method that needs Eq a with a type a that doesn't have such an instance, you'll get an error.
Anyway, your instance:
instance ClassB a where
methodB x = x <= x
defines an instance for all types (without requiring any evidence of Ord a, since there's no logical necessity here), but you can only use methodB on the subset of types with an Ord instance.
In your final example:
class OtherClassB a where
otherMethodB :: a -> Bool
there's no logical necessity that a type of class OtherClassB also be a type of class Ord, and there's no requirement that otherMethodB only be used with types having an Ord a instance. You could, if you wanted, define the instance:
instance OtherClassB a where
otherMethodB _ = False
and it would compile fine. However, by defining the instance:
instance OtherClassB a where
otherMethodB = methodB
you are providing an instance for all types whose implementation uses methodB and so requires ClassB. If you modify this to read:
instance (ClassB a) => OtherClassB a where
otherMethodB = methodB
the compiler still isn't satified. The specific method methodB requires an Ord a instance, but since Ord is not a superclass of ClassB, there is no logical necessity that the constraint ClassB a implies Ord a, so you must provide additionl evidence to the compiler that an Ord a instance is available. By writing:
instance (ClassB a, Ord a) => OtherClassB a where
otherMethodB = methodB
you are providing an instance that requires ClassB a (to run methodB) and Ord a (because methodB has it as an additional requirement), so you need to tell the compiler that this instance applies to all types a provided both ClassB a and Ord a instances are available. The compiler is satisfied with this.
You Don't Need Type Classes to Delay Concrete Types
From your examples and follow-up comments, it sounds like you're (mis)using type classes to support a particular style of programming that avoids committing to concrete types until absolutely necessary.
(As an aside, I used to think this style was a good idea, but I've gradually come around to thinking it's mostly pointless. Haskell's type system makes refactoring so easy that there's little risk to committing to concrete types, and concrete programs tend to be mostly easier to read and write than abstract programs. However, many people have used this style of programming profitably, and I can think of at least one high-quality library (lens) that takes it to absolute extremes very effectively. So, no judgement!)
Anyway, this style of programming is generally better supported by writing top-level polymorphic functions and placing the needed constraints on the functions. There is usually no need (and no point) in defining new type classes. This was what #duplode was saying in the comments. You can replace:
class (Ord a) => ClassA where method :: a -> Bool
instance (Ord a) => ClassA where methodA x = x <= x
with the simpler top-level function definition:
methodA :: (Ord a) => a -> Bool
methodA x = x <= x
because the class and instance serve no purpose. The main point of type classes is to provide ad hoc polymorphism, to allow you to have a single function (methodA) that has different implementations for different types. If there's only one implementation for all types, that's just a plain old parametric polymorphic function, and no type class is needed.
Nothing changes if there are multiple methods, and usually nothing changes if there are multiple constraints. If your philosophy is that data types should be characterized only by the properties they satisfy rather than by what they are, then the flip side of that is that functions should be typed to demand of their argument types only the properties that they need. If they demand more than they need, they are prematurely committing to a more concrete type than necessary.
So, a class for, say, an orderable numeric key type with a printable representation:
class (Ord a, Num a, Show a) => Key a where
firstKey :: a
nextKey :: a -> a
sortKeys :: [a] -> [a]
keyLength :: a -> Int
and a single instance:
instance (Ord a, Num a, Show a) => Key a where
firstKey = 1
nextKey x = x + 1
sortKeys xs = sort xs
keyLength k = length (show k)
is more idiomatically written as a set of functions that constrain the type only based on the properties they require:
firstKey :: (Num key) => key
firstKey = 1
nextKey :: (Num key) => key -> key
nextKey = (+1)
sortKeys :: (Ord key) => [key] -> [key]
sortKeys = sort
keyLength :: (Show key) => key -> Int
keyLength = length . show
On the other hand, if you find it helpful to have a formal "name" for an abstract type and prefer the compiler's help in enforcing use of this type instead of just using type variables like "key" with evocative names, I guess you can use type classes for this purpose. However, your type classes probably shouldn't have any methods. You want to write:
class (Ord a, Num a, Show a) => Key a
and then a bunch of top-level functions that use the type class.
firstKey :: (Key k) => k
firstKey = 1
nextKey :: (Key k) => k -> k
nextKey = (+1)
sortKeys :: (Key k) => [k] -> [k]
sortKeys = sort
keyLength :: (Show k) => k -> Int
keyLength = length . show
Your entire program can be written this way, and no instances are actually needed until you get down to choosing your concrete types and documenting them all in one place. For example, in your Main.hs program, you could commit to an Int key by giving an instance for a concrete type and using it:
instance Key Int
main = print (nextKey firstKey :: Int)
This concrete instance also avoids the need for extensions like undecidable instances and warning about fragile bindings.

Why is context reduction necessary?

I've just read this paper ("Type classes: an exploration of the design space" by Peyton Jones & Jones), which explains some challenges with the early typeclass system of Haskell, and how to improve it.
Many of the issues that they raise are related to context reduction which is a way to reduce the set of constraints over instance and function declarations by following the "reverse entailment" relationship.
e.g. if you have somewhere instance (Ord a, Ord b) => Ord (a, b) ... then within contexts, Ord (a, b) gets reduced to {Ord a, Ord b} (reduction does not always shrink the number of constrains).
I did not understand from the paper why this reduction was necessary.
Well, I gathered it was used to perform some form of type checking. When you have your reduced set of constraint, you can check that there exist some instance that can satisfy them, otherwise it's an error. I'm not too sure what the added value of that is, since you would notice the problem at the use site, but okay.
But even if you have to do that check, why use the result of reduction inside inferred types? The paper points out it leads to unintuitive inferred types.
The paper is quite ancient (1997) but as far as I can tell, context reduction is still an ongoing concern. The Haskell 2010 spec does mention the inference behaviour I explain above (link).
So, why do it this way?
I don't know if this is The Reason, necessarily, but it might be considered A Reason: in early Haskell, type signatures were only permitted to have "simple" constraints, namely, a type class name applied to a type variable. Thus, for example, all of these were okay:
Ord a => a -> a -> Bool
Eq a => a -> a -> Bool
Graph gr => gr n e -> [n]
But none of these:
Ord (Tree a) => Tree a -> Tree a -> Bool
Eq (a -> b) => (a -> b) -> (a -> b) -> Bool
Graph Gr => Gr n e -> [n]
I think there was a feeling then -- and still today, as well -- that allowing the compiler to infer a type which one couldn't write manually would be a bit unfortunate. Context reduction was a way of turning the above signatures either into ones that could be written by hand as well or an informative error. For example, since one might reasonably have
instance Ord a => Ord (Tree a)
in scope, we could turn the illegal signature Ord (Tree a) => ... into the legal signature Ord a => .... On the other hand, if we don't have any instance of Eq for functions in scope, one would report an error about the type which was inferred to require Eq (a -> b) in its context.
This has a couple of other benefits:
Intuitively pleasing. Many of the context reduction rules do not change whether the type is legal, but do reflect things humans would do when writing the type. I'm thinking here of the de-duplication and subsumption rules that let you turn, e.g. (Eq a, Eq a, Ord a) into just Ord a -- a transformation one definitely would want to do for readability.
This can frequently catch stupid errors; rather than inferring a type like Eq (Integer -> Integer) => Bool which can't be satisfied in a law-abiding way, one can report an error like Perhaps you did not apply a function to enough arguments?. Much friendlier!
It becomes the compiler's job to pinpoint what went wrong. Instead of inferring a complicated context like Eq (Tree (Grizwump a, [Flagle (Gr n e) (Gr n' e') c])) and complaining that the context is not satisfiable, it instead is forced to reduce this to the constituent constraints; it will instead complain that we couldn't determine Eq (Grizwump a) from the existing context -- a much more precise and actionable error.
I think this is indeed desirable in a dictionary passing implementation. In such an implementation, a "dictionary", that is, a tuple or record of functions is passed as implicit argument for every type class constraint in the type of the applied function.
Now, the question is simply when and how those dictionaries are created. Observe that for simple types like Int by necessity all dictionaries for whatever type class Int is an instance of will be a constant.
Not so in the case of parameterized types like lists, Maybe or tuples. It is clear that to show a tuple, for instance, the Show instances of the actual tuple elements need to be known. Hence such a polymorphic dictionary cannot be a constant.
It appears that the principle guiding the dictionary passing is such that only dictionaries for types that appear as type variables in the type of the applied function are passed. Or, to put it differently: no redundant information is replicated.
Consider this function:
f :: (Show a, Show b) => (a,b) -> Int
f ab = length (show ab)
The information that a tuple of show-able components is also showable, thus a constraint like Show (a,b) needs not to appear when we already know (Show a, Show b).
An alternative implementation would be possible, though, where the caller .would be responsible to create and pass dictionaries. This could work without context reduction, such that the type of f would look like:
f :: Show (a,b) => (a,b) -> Int
But this would mean that the code to create the tuple dictionary would have to be repeated on every call site. And it is easy to come up with examples where the number of necessary constraints actually increases, like in:
g :: (Show (a,a), Show(b,b), Show (a,b), Show (b, a)) => a -> b -> Int
g a b = maximum (map length [show (a,a), show (a,b), show (b,a), show(b,b)])
It is instructive to implement a type class/instance system with actual records that are explicitly passed. For example:
data Show' a = Show' { show' :: a -> String }
showInt :: Show' Int
showInt = Show' { show' = intshow } where
intshow :: Int -> String
intshow = show
Once you do this you will probably easily recognize the need for "context reduction".

What does the => symbol mean in Haskell?

I'm new to Haskell and, in general, to functional programming, and I'm a bit uncomfortable with its syntax.
In the following code what does the => denote? And also (Num a, Ord a)?
loop :: (Num a, Ord a) => a -> (t -> t) -> t -> t
This is a typeclass constraint; (Num a, Ord a) => ... means that loop works with any type a that is an instance of the Num and Ord typeclasses, corresponding to numeric types and ordered types respectively. Basically, you can think of loop as having the type on the right hand side of the =>, except that a is required to be an instance of Num and Ord.
You can think of typeclasses as basically similar to OOP interfaces (but they're not the same thing!) — they encapsulate a set of definitions which any instance must support, and generic code can be written using these definitions. For instance, Num includes numeric operations like addition and multiplication, while Ord includes less than, greater than, and so on.
For more information on typeclasses, see this introduction from Learn You a Haskell.
=> separates two parts of a type signature:
On the left, typeclass constraints
On the right, the actual type
So you can think of (Num a, Ord a) => a -> (t -> t) -> t -> t as meaning "the type is a -> (t -> t) -> t -> t and also there must be a Num instance for a and an Ord instance for a".
For more on typeclasses see http://www.learnyouahaskell.com/types-and-typeclasses
One way to think about it is that Ord a and Num a are additional inputs to the function. They are a special kind of input though: dictionaries. When you use this function with a particular type a, there must also be dictionaries available for the Ord and Num operations on the type a as well.
Any function that makes use of a function with dictionary inputs must also have the same dictionary inputs.
foo :: (Num a, Ord a) => a -> t
foo x = loop x someFunc someT
However, you do not have to explicitly pass these dictionaries around. Haskell will take care of that for you, assuming there is a dictionary available. You can create a dictionary with a typeclass instance.
instance Num MyType with
x + y = ...
x - y = ...
...
This creates a dictionary for the Num operations on MyType, therefore MyType can be used anywhere that Num a is a required input (assuming it satisfies the other requirements, of course).
On the left hand side of the => you declare constraints for the types that are used on the right.
In the example you give, it means that a is constrained to being an instance of both the Ord type class and the Num type class.

Why do I have to specify typeclass in function if it was declared in data definition?

If I have an ADT with specified typeclass restrictions I still have to specify the same typeclass for each function using this data type. What the reason for this and how can I reduce unnecessary typing?
E.g.:
data Eq a => C a = V a
g :: C a -> Bool
g (V a) = a == a
I got:
test.hs:32:13:
No instance for (Eq a)
arising from a use of `=='
In the expression: a == a
In an equation for `g': g (V a) = a == a
Failed, modules loaded: none.
While:
g :: Eq a => C a -> Bool
Works fine, but if I have a long chain of functions it becomes a burden to specify a typeclass everytime:
f :: Eq a => C a -> Bool
f a = g a
It's generally considered a bad idea to put a typeclass restriction on your ADT. Instead, leave it off and code normally using (==) wherever you have to. Your Eq a dependency will percolate up some of your functions and not others.
Because the Haskell Report says so, basically. It's generally regarded as somewhat silly. Quoth the GHC User Guide:
All this behaviour contrasts with Haskell 98's peculiar treatment of contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). In Haskell 98 the definition
data Eq a => Set' a = MkSet' [a]
gives MkSet' the same type as MkSet above. But instead of making available an (Eq a) constraint, pattern-matching on MkSet' requires an (Eq a) constraint! GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, GHC's behaviour is much more useful, as well as much more intuitive.
Putting contexts on regular data definitions is discouraged and may (will?) be removed from the language at some point. Either put the context only on the function (which is what actually needs it, anyhow), or use GADT-style syntax to get the behavior you expected.

Resources