According to Harper (https://existentialtype.wordpress.com/2011/04/16/modules-matter-most/), type classes simply do not offer the same level of abstraction that modules offer, and I'm having a hard time figuring out exactly why. There are no examples in that link, so it's hard for me to see the key differences. There are also papers on how to translate between modules and type classes (http://www.cse.unsw.edu.au/~chak/papers/modules-classes.pdf), but they don't say much about how the two differ from the programmer's perspective (they only show that there isn't something one can do that the other can't emulate).
Specifically, in the first link:
The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference.
I don't understand either. Can a type implement a type class in more than one way in ML? How would you have the integers ordered by divisibility, for example, without creating a new type? In Haskell, you would have to do something like use data and write an Ord instance for the new type to offer an alternative ordering.
And for the second one, aren't the two distinct in Haskell?
Specifying "when such a specification should be used during type inference" can be done by something like this:
blah :: BlahType b => ...
where BlahType is the class being used during the type inference and NOT the implementing class. Whereas, "how a type implements a type class" is done using instance.
Can someone explain what the link is really trying to say? I'm just not quite understanding why modules would be less restrictive than type classes.
To understand what the article is saying, take a moment to consider the Monoid typeclass in Haskell. A monoid is any type, T, which has a function mappend :: T -> T -> T and identity element mempty :: T for which the following holds.
a `mappend` (b `mappend` c) == (a `mappend` b) `mappend` c
a `mappend` mempty == mempty `mappend` a == a
There are many Haskell types which fit this definition. One example that springs immediately to mind is the integers, for which we can define the following.
instance Monoid Integer where
  mappend = (+)
  mempty = 0
You can confirm that all of the requirements hold.
a + (b + c) == (a + b) + c
a + 0 == 0 + a == a
Indeed, those conditions hold for all numbers under addition, so we can define the following as well.
instance Num a => Monoid a where
  mappend = (+)
  mempty = 0
So now, in GHCi, we can do the following.
> mappend 3 5
8
> mempty
0
Particularly observant readers (or those with a background in mathematics) will probably have noticed by now that we can also define a Monoid instance for numbers under multiplication.
instance Num a => Monoid a where
  mappend = (*)
  mempty = 1
a * (b * c) == (a * b) * c
a * 1 == 1 * a == a
But now the compiler encounters a problem. Which definition of mappend should it use for numbers? Does mappend 3 5 equal 8 or 15? There is no way for it to decide. This is why Haskell does not allow multiple instances of a single typeclass for the same type. However, the issue still stands. Which Monoid instance of Num should we use? Both are perfectly valid and make sense in certain circumstances. The solution is to use neither. If you look up Monoid on Hackage, you will see that there is no Monoid instance of Num, or Integer, Int, Float, or Double for that matter. Instead, there are Monoid instances of Sum and Product. Sum and Product are defined as follows.
newtype Sum a = Sum { getSum :: a }
newtype Product a = Product { getProduct :: a }
instance Num a => Monoid (Sum a) where
  mappend (Sum a) (Sum b) = Sum $ a + b
  mempty = Sum 0
instance Num a => Monoid (Product a) where
  mappend (Product a) (Product b) = Product $ a * b
  mempty = Product 1
Now, if you want to use a number as a Monoid you have to wrap it in either a Sum or Product type. Which type you use determines which Monoid instance is used. This is the essence of what the article was trying to describe. There is no mechanism built into Haskell's typeclass system which allows you to choose between multiple instances. Instead you have to jump through hoops by wrapping and unwrapping values in skeleton types. Now whether or not you consider this a problem is a large part of what determines whether you prefer Haskell or ML.
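As a small sketch of the wrapping and unwrapping described above, the following uses the Sum and Product newtypes from Data.Monoid. The helper names total and multiplied are made up for illustration.

```haskell
import Data.Monoid (Sum(..), Product(..))

-- Add a list of numbers: wrap each one in Sum, combine, then unwrap.
total :: Num a => [a] -> a
total = getSum . mconcat . map Sum

-- Multiply the same list by choosing the Product wrapper instead.
multiplied :: Num a => [a] -> a
multiplied = getProduct . mconcat . map Product
```

The only difference between the two helpers is the wrapper, which is exactly how the choice of instance gets expressed.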
ML gets around this by allowing multiple "instances" of the same class and type to be defined in different modules. Then, which module you import determines which "instance" you use. (Strictly speaking, ML doesn't have classes and instances, but it does have signatures and structures, which can act almost the same. For a more in-depth comparison, read this paper.)
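One way to get a feel for the ML approach without leaving Haskell is to pass the "instance" around as an ordinary record. This is only a rough analogy to ML structures, and MonoidDict, addition, multiplication and combineAll are illustrative names rather than anything from a library.

```haskell
-- A record of operations plays the role of an ML structure.
data MonoidDict a = MonoidDict
  { unit    :: a
  , combine :: a -> a -> a
  }

-- Two different "instances" for the same type can coexist.
addition :: Num a => MonoidDict a
addition = MonoidDict { unit = 0, combine = (+) }

multiplication :: Num a => MonoidDict a
multiplication = MonoidDict { unit = 1, combine = (*) }

-- The caller picks the "instance" by passing it explicitly.
combineAll :: MonoidDict a -> [a] -> a
combineAll d = foldr (combine d) (unit d)
```

Here no wrapper type is needed, at the cost of naming the dictionary at every call site instead of letting type inference find it.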
I'm playing around with rewriting simple functions in different ways and I clearly misunderstand some core concepts. Is there a better way to work with limited types like these?
mlength :: Monoid m => m -> Int
mlength mempty = 0
mlength (l <> r) = mlength l + mlength r
It fails compilation with the following error:
Parse error in pattern: l <> r
I can see that my usage of <> is misguided because there are multiple correct matches for l and r. Even though it looks like it doesn't matter which value is assigned, a value still has to be assigned in the end. Maybe there's a way for me to assert this decision for specific Monoid instances?
"ab" == "" <> "ab"
"ab" == "a" <> "b"
"ab" == "ab" <> ""
A monoid, in the general case, has no notion of length. Take for instance Sum Int, which is Int equipped with addition for its monoidal operation. We have
Sum 3 <> Sum 4 = Sum 7 = Sum (-100) <> Sum 7 <> Sum (100)
What should be its "length"? There is no real notion of length here, since the underlying type is Int, which is not a list-like type.
Another example: Endo Int which is Int -> Int equipped with composition. E.g.
Endo (\x -> x+1) <> Endo (\x -> x*2) = Endo (\x -> 2*x+1)
Again, no meaningful "length" can be defined here.
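The two examples above can be tried directly with the types from Data.Monoid. The names sevenA, sevenB and doubleThenInc are made up for this sketch.

```haskell
import Data.Monoid (Sum(..), Endo(..))

-- Two different ways of building Sum 7; the result keeps no trace of how it was built.
sevenA, sevenB :: Sum Int
sevenA = Sum 3 <> Sum 4
sevenB = Sum (-100) <> Sum 7 <> Sum 100

-- Composition of endofunctions: the right-hand function runs first.
doubleThenInc :: Endo Int
doubleThenInc = Endo (\x -> x + 1) <> Endo (\x -> x * 2)
```

Since sevenA and sevenB are equal values, any "length" function would have to return the same answer for a two-part and a three-part combination.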
You can browse Data.Monoid and see other examples where there is no notion of "length".
Const a is also a (boring) monoid with no length.
Now, it is true that lists [a] form a monoid (the free monoid over a), and length can indeed be defined there. Still, this is only a particular case, which does not generalize.
The Semigroup and Monoid interfaces provide a means to build up values, (<>). They don't, however, give us a way to break down or otherwise extract information from values. That being so, a length generalised beyond some specific type requires a different abstraction.
As discussed in the comments to chi's answer, while Data.Foldable offers a generalised length :: Foldable t => t a -> Int, it isn't quite what you were aiming at -- in particular, the connection between Foldable and Monoid is that foldable structures can be converted to lists/the free monoid, and not that foldables themselves are necessarily monoids.
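To make the connection to lists concrete, here is a minimal sketch of a length that goes through toList. The name lengthViaList is hypothetical; the real Foldable class already provides length.

```haskell
import Data.Foldable (toList)

-- Count the elements a structure yields when flattened to a list.
lengthViaList :: Foldable t => t a -> Int
lengthViaList = length . toList
```

Note that the constraint is Foldable t, not Monoid: nothing here requires the structure itself to support (<>).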
One other possibility, which is somewhat obscure but closer to the spirit of your question, is the Factorial class from the monoid-subclasses package, a subclass of Semigroup. It is built around factors :: Factorial m => m -> [m], which splits a value into irreducible factors, undoing what sconcat or mconcat do. A generalised length :: Factorial m => m -> Int can then be defined as the length of the list of factors. In any case, note that we still end up needing a further abstraction on top of Semigroup/Monoid.
I am a mathematician who works a lot with category theory, and I've been using Haskell for a while to perform certain computations etc., but I am definitely not a programmer. I really love Haskell and want to become much more fluent in it, and the type system is something that I find especially great to have in place when writing programs.
However, I've recently been trying to implement category theoretic things, and am running into problems concerning the fact that you seemingly can't have class method laws in Haskell. In case my terminology here is wrong, what I mean is that I can write
class Monoid c where
  id :: c -> c
  m :: c -> c -> c
but I can't write some law along the lines of
m (m x y) z == m x $ m y z
From what I gather, this is due to the lack of dependent types in Haskell, but I'm not sure how exactly this is the case (having now read a bit about dependent types). It also seems that the convention is just to include laws like this in comments and hope that you don't accidentally cook up some instance that doesn't satisfy them.
How should I change my approach to Haskell to deal with this problem? Is there a nice mathematical/type-theoretic solution (for example, require the existence of an associator that is an isomorphism (though then the question is, how do we encode isomorphisms without a law?)); is there some 'hack' (using extensions such as DataKinds); should I be drastic and switch to using something like Idris instead; or is the best response to just change the way I think about using Haskell (i.e. accept that these laws can't be implemented in a Haskelly way)?
(bonus) How exactly does the lack of laws come from not supporting dependent types?
You want to require that:
m (m x y) z = m x (m y z) -- (1)
But to require this you need a way to check it. So you, or your compiler (or proof assistant), need to construct a proof of this. And the question is, what type is a proof of (1)?
One could imagine some Proof type but then maybe you could just construct a proof that 0 = 0 instead of a proof of (1) and both would have type Proof. So you’d need a more general type. I can’t decide how to break up the rest of the question so I’ll go for a super brief explanation of the Curry-Howard isomorphism followed by an explanation of how to prove two things are equal and then how dependent types are relevant.
The Curry-Howard isomorphism says that propositions are isomorphic to types and proofs are isomorphic to programs: a type corresponds to a proposition and a proof of that proposition corresponds to a program constructing a value inhabiting that type. Ignoring how many propositions might be expressed as types, an example would be that the type A * B (written (A, B) in Haskell) corresponds to the proposition “A and B,” while the type A + B (written Either A B in Haskell) corresponds to the proposition “A or B.” Finally the type A -> B corresponds to “A implies B,” as a proof of this is a program which takes evidence of A and gives you evidence of B. One should note that there isn’t a way to express not A but one could imagine adding a type Not A with builtins of type Either a (Not a) for the law of the excluded middle as well as Not (Not a) -> a, and a * Not a -> Void (where Void is a type which cannot be inhabited and therefore corresponds to false), but then one can’t really run these programs to get constructivist proofs.
Now we will ignore some realities of Haskell and imagine that there aren’t ways round these rules (in particular undefined :: a says everything is true, and unsafeCoerce :: a -> b says that anything implies anything else, or just other functions that don’t return where their existence does not imply the corresponding proof).
So we know how to combine propositions but what might a proposition be? Well one could be to say that two types are equal. In Haskell this corresponds to the GADT
data Eq a b where Refl :: Eq c c
Where this constructor corresponds to the reflexive property of equality.
[side note: if you’re still interested so far, you may be interested to look up Voevodsky’s univalent foundations, depending on how much the idea of “Homotopy type theory” interests you]
So can we prove something now? How about the transitive property of equality:
trans :: Eq a b -> Eq b c -> Eq a c
trans x y =
  case x of
    Refl -> -- by this match being successful, the compiler now knows that a = b
      case y of
        Refl -> -- and now b = c and so the compiler knows a = c
          Refl -- the compiler knows that this is of type Eq d d, and as it knows a = c, this typechecks as Eq a c
This feels like one hasn’t really proven anything (especially as this mainly relies on the compiler knowing the transitive and symmetric properties), but one gets a similar feeling when proving simple things in logic as well.
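For anyone who wants to load this into GHC, here is a self-contained sketch of the same idea. The type is named Equal here only to avoid clashing with the Prelude's Eq class, and sym and castWith are additions for illustration (base offers a similar type in Data.Type.Equality).

```haskell
{-# LANGUAGE GADTs #-}

-- Evidence that two types are the same type.
data Equal a b where
  Refl :: Equal c c

-- Symmetry: matching on Refl tells the compiler that a and b coincide.
sym :: Equal a b -> Equal b a
sym Refl = Refl

-- Transitivity, written with two pattern matches.
trans :: Equal a b -> Equal b c -> Equal a c
trans Refl Refl = Refl

-- A proof is also a program: it lets us move a value from one type to the other.
castWith :: Equal a b -> a -> b
castWith Refl x = x
```

castWith is where the proof earns its keep: without the Refl match, returning x at type b would be rejected.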
So now how might you prove the original proposition (1)? Well let’s imagine we want a type c to be a monoid then we should also prove that $\forall x,y,z:c, m (m x y) z = m x (m y z).$ So we need a way to express m (m x y) z as a type. Strictly speaking this isn’t dependent types (this can be done with DataKinds to promote values and type families instead of functions). But you do need dependent types to have types depend on values. Specifically if you have a type Nat of natural numbers and a type family Vec :: Nat -> * (* is the kind (read type) of all types) of fixed length vectors, you could define a dependently typed function mkVec :: (n::Nat) -> Vec n. Observe how the type of the output depends on the value of the input.
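A rough sketch of the Vec idea in today's GHC, using DataKinds to promote Nat to the type level. It tracks the length in the type, but it is not the dependently typed mkVec described above, since the index is a type rather than a runtime value. The names VNil, VCons, vhead and vtoList are illustrative.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

-- Natural numbers; DataKinds also promotes Z and S to the type level.
data Nat = Z | S Nat

-- A vector whose length is recorded in its type.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- A total head: the type rules out the empty vector.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- Forget the length index.
vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs
```

Applying vhead to VNil is a type error rather than a runtime failure, which is the kind of guarantee the laws would need.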
So your law needs to have functions promoted to type level (skipping the questions about how one defines type equality and value equality), as well as dependent types (made up syntax):
class Monoid c where
  e :: c
  (*) :: c -> c -> c
  idl :: (x::c) -> Eq x (e * x)
  idr :: (x::c) -> Eq x (x * e)
  assoc :: (x::c) -> (y::c) -> (z::c) -> Eq ((x * y) * z) (x * (y * z))
Observe how types tend to become large with dependent types and proofs. In a language missing typeclasses one could put such values into a record.
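In plain Haskell, where the laws cannot be stated as types, one fallback is to spot-check them on sample values. The sketch below is evidence rather than proof, and lawsHoldOn is a made-up helper that uses the standard Monoid class instead of the made-up syntax above.

```haskell
-- Check the monoid laws on a finite list of sample values.
-- Passing says nothing about values outside the samples.
lawsHoldOn :: (Eq a, Monoid a) => [a] -> Bool
lawsHoldOn xs = leftId && rightId && assoc
  where
    leftId  = and [ mempty <> x == x | x <- xs ]
    rightId = and [ x <> mempty == x | x <- xs ]
    assoc   = and [ (x <> y) <> z == x <> (y <> z) | x <- xs, y <- xs, z <- xs ]
```

For example, lawsHoldOn ["", "a", "bc"] checks the String instance on three samples.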
A final note on the theory of dependent types and how they relate to the Curry-Howard isomorphism.
Dependent types can be considered an answer to the question: what types correspond to the propositions $\forall x\in S\quad P(x)$ and $\exists y\in T\quad Q(y)?$
The answer is that you create new ways to make types: the dependent product and the dependent sum (coproduct). The dependent product expresses “for all values $x$ of type $S,$ there is a value of type $P(x).$” A normal product would be a dependent product with $S=2,$ a type inhabited by two values. A dependent product might be written (x:S) -> P x. A dependent sum says “some value $y$ of type $T$, paired with a value of type $Q(y).$” This might be written (y:T) * Q y.
One can think of these as a generalisation of arbitrarily indexed (co)products from Set to general categories, where one might sensibly write e.g. $\prod_\Lambda X(\lambda),$ and sometimes such notation is used in type theory.
So, I'm learning Haskell at the moment, and I would like to confirm or debunk my understanding of monoid.
What I figured out from reading CIS194 course is that monoid is basically "API" for defining custom binary operation on custom set.
Then I went to inform myself some more and stumbled upon a massive amount of very confusing tutorials trying to clarify the thing, so I'm not so sure anymore.
I have decent mathematical background, but I just got confused from all the metaphors and am looking for clear yes/no answer to my understanding of monoid.
From Wikipedia:
In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.
I think your understanding is correct. From a programming perspective, Monoid is an interface with two "methods" that must be implemented.
The only piece that seems to be missing from your description is the "identity", without which you are describing a Semigroup.
Anything that has a "zero" or an "empty" and a way of combining two values can be a Monoid. One thing to note is that it may be possible for a set/type to be made a Monoid in more than one way, for example numbers via addition with identity 0, or multiplication with identity 1.
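As an illustration of the Semigroup/Monoid distinction mentioned above, non-empty lists can be combined but have no identity element. This sketch uses NonEmpty from base; joined and joinedAsList are made-up names.

```haskell
import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.List.NonEmpty as NE

-- Non-empty lists can be appended, so they form a Semigroup...
joined :: NonEmpty Int
joined = (1 :| [2]) <> (3 :| [])

-- ...but there is no empty NonEmpty value to serve as mempty, so no Monoid instance.
joinedAsList :: [Int]
joinedAsList = NE.toList joined
```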
from Wolfram:
A monoid is a set that is closed under an associative binary operation and has an identity element I in S such that for all a in S, Ia=aI=a.
from Wiki:
In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.
so your intuition is more or less right.
You should only keep in mind that it's not defined for a "custom set" in Haskell but a type. The distinction is small (because types in type theory are very similar to sets in set theory) but the types for which you can define a Monoid instance need not be types that represent mathematical sets.
In other words: a type describes the set of all values that are of that type. Monoid is an "interface" that states that any type that claims to adhere to that interface must provide an identity value, a binary operation combining two values of that type, and there are some equations these should satisfy in order for all generic Monoid operations to work as intended (such as the generic summation of a list of monoid values) and not produce illogical/inconsistent results.
Also, note that the existence of an identity element in that set (type) is required for a type to be an instance of the Monoid class.
For example, natural numbers form a Monoid under both addition (identity = 0):
0 + n = n
n + 0 = n
as well as multiplication (identity = 1):
1 * n = n
n * 1 = n
also lists form a monoid under ++ (identity = []):
[] ++ xs = xs
xs ++ [] = xs
also, functions of type a -> a form a monoid under composition (identity = id)
id . f = f
f . id = f
So it's important to keep in mind that Monoid isn't about types that represent sets but about types when viewed as sets, so to say.
As an example of a malconstructed Monoid instance, consider:
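import Data.Monoid
newtype MyInt = MyInt Int deriving Show
-- GHC 8.4 and later require a Semigroup instance, which supplies the binary operation
instance Semigroup MyInt where
  MyInt a <> MyInt b = MyInt (a * b)
instance Monoid MyInt where
  mempty = MyInt 0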
if you now try to mconcat a list of MyInt values, you'll always get MyInt 0 as the result because the identity value 0 and binary operation * don't play well together:
λ> mconcat [MyInt 1, MyInt 2]
MyInt 0
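For contrast, here is a lawful variant that pairs multiplication with the identity 1. The type name MyProduct is made up for this sketch, and the separate Semigroup instance is what recent GHC versions expect.

```haskell
newtype MyProduct = MyProduct Int deriving (Show, Eq)

-- The binary operation lives in Semigroup.
instance Semigroup MyProduct where
  MyProduct a <> MyProduct b = MyProduct (a * b)

-- 1 is the identity for multiplication, so the laws hold.
instance Monoid MyProduct where
  mempty = MyProduct 1
```

With this instance, mconcat over a list multiplies its elements instead of collapsing everything to zero.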
At a basic level you're right - it's just an API for a binary operator we denote by <>.
However, the value of the monoid concept is in its relationship to other types and classes. Culturally we've decided that <> is the natural way of joining/appending two things of the same type together.
Consider this example:
{-# LANGUAGE OverloadedStrings #-}
import Data.Monoid
greet x = "Hello, " <> x
The function greet is extremely polymorphic - x can be a String, ByteString or Text just to name a few possibilities. Moreover, in each of these cases it does basically what you expect it to - it appends x to the string "Hello, ".
Additionally, there are lots of algorithms which will work on anything that can be accumulated, and those are good candidates for generalization to a Monoid. For example consider the foldMap function from the Foldable class:
foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
Not only does foldMap generalize the idea of folding over a structure, but I can generalize how the accumulation is performed by substituting the right Monoid instance.
If I have a foldable structure t containing Ints, I can use foldMap with the Sum monoid to get the sum of the Ints, or with Product to get the product, etc.
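A short sketch of that substitution, using wrappers from Data.Monoid. The helper names sumOf, productOf and anyEven are made up for illustration.

```haskell
import Data.Monoid (Sum(..), Product(..), Any(..))

-- Accumulate with addition.
sumOf :: (Foldable t, Num a) => t a -> a
sumOf = getSum . foldMap Sum

-- Same traversal, accumulating with multiplication.
productOf :: (Foldable t, Num a) => t a -> a
productOf = getProduct . foldMap Product

-- Swapping in the Any monoid turns the traversal into an "is any element even?" test.
anyEven :: (Foldable t, Integral a) => t a -> Bool
anyEven = getAny . foldMap (Any . even)
```

All three work on any Foldable, so the same helpers apply to lists, Maybe, and other containers.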
Finally, using <> affords convenience. For instance, there is an abundance of different Set implementations, but for all of them s <> t is always the union of two sets s and t (of the same type). This enables me to write code which is agnostic of the underlying implementation of the set thereby simplifying my code. The same can be said for a lot of other data structures, e.g. sequences, trees, maps, priority queues, etc.
The reason why Set is not a functor is given here. It seems to boil down to the fact that a == b && f a /= f b is possible. So, why doesn't Haskell have as standard an alternative to Eq, something like
class Eq a => StrongEq a where
  (===) :: a -> a -> Bool
  (/==) :: a -> a -> Bool
  x /== y = not (x === y)
  x === y = not (x /== y)
for which instances are supposed to obey the laws
∀a,b,f. not (a === b) || (f a === f b)
∀a. a === a
∀a,b. (a === b) == (b === a)
and maybe some others? Then we could have:
instance StrongEq a => Functor (Set a) where
-- ...
Or am I missing something?
Edit: my problem is not “Why are there types without an Eq instance?”, like some of you seem to have answered. It's the opposite: “Why are there instances of Eq that aren't extensionally equal? Why are there too many Eq instances?”, combined with “If a == b does imply extensional equality, why is Set not an instance of Functor?”.
Also, my instance declaration is rubbish (thanks #n.m.). I should have said:
newtype StrongSet a = StrongSet (Set a)
instance Functor StrongSet where
  fmap :: (StrongEq a, StrongEq b) => (a -> b) -> StrongSet a -> StrongSet b
  fmap f (StrongSet s) = StrongSet (map f s)
instance StrongEq a => Functor (Set a) where
This makes sense neither in Haskell nor in the grand mathematical/categorical scheme of things, regardless of what StrongEq means.
In Haskell, Functor requires a type constructor of kind * -> *. The arrow reflects the fact that in category theory, a functor is a kind of mapping. [] and (the hypothetical) Set are such type constructors. [a] and Set a have kind * and cannot be functors.
In Haskell, it is hard to define Set such that it can be made into a Functor because equality cannot be sensibly defined for some types no matter what. You cannot compare two things of type Integer->Integer, for example.
Let's suppose there is a function
goedel :: Integer -> Integer -> Integer
goedel x y = -- compute the result of a function with
-- Goedel number x, applied to y
Suppose you have a value s :: Set Integer. What should fmap goedel s look like? How do you eliminate duplicates?
In your typical set theory equality is magically defined for everything, including functions, so Set (or Powerset to be precise) is a functor, no problem with that.
Since I'm not a category theorist, I'll try to write a more concrete/practical explanation (i.e., one I can understand):
The key point is the one that #leftaroundabout made in a comment:
== is supposed to witness "equivalent by all observable means" (that doesn't necessarily require a == b must hold only for identical implementations; but anything you can "officially" do with a and b should again yield equivalent results. So unAlwaysEq should never be exposed in the first place). If you can't ensure this for some type, you shouldn't give it an Eq instance.
That is, there should be no need for your StrongEq because that's what Eq is supposed to be already.
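To see what goes wrong when Eq does not mean that, here is a sketch with a deliberately unlawful instance. AlwaysEq follows the name used in the quoted comment; ListSet and mapSet are a toy stand-in for Set and Set.map, invented for this example.

```haskell
import Data.List (nub)

-- A deliberately unlawful Eq: every value compares equal to every other.
newtype AlwaysEq a = AlwaysEq { unAlwaysEq :: a }

instance Eq (AlwaysEq a) where
  _ == _ = True

-- A toy set: a list kept free of duplicates.
newtype ListSet a = ListSet [a] deriving (Show, Eq)

mapSet :: Eq b => (a -> b) -> ListSet a -> ListSet b
mapSet f (ListSet xs) = ListSet (nub (map f xs))

-- Mapping in two steps collapses the set to a single element...
twoSteps :: ListSet Int
twoSteps = mapSet unAlwaysEq (mapSet AlwaysEq (ListSet [1, 2, 3]))

-- ...while mapping the composed function (which is just id) keeps all three.
oneStep :: ListSet Int
oneStep = mapSet (unAlwaysEq . AlwaysEq) (ListSet [1, 2, 3])
```

The two results differ, so the composition law for functors fails as soon as unAlwaysEq is exposed.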
Haskell values are often intended to represent some sort of mathematical or "real-life" value. Many times, this representation is one-to-one. For example, consider the type
data PlatonicSolid = Tetrahedron | Cube |
                     Octahedron | Dodecahedron | Icosahedron
This type contains exactly one representation of each Platonic solid. We can take advantage of this by adding deriving Eq to the declaration, and it will produce the correct instance.
In many cases, however, the same abstract value may be represented by more than one Haskell value. For example, the red-black trees Node B (Node R Leaf 1 Leaf) 2 Leaf and Node B Leaf 1 (Node R Leaf 2 Leaf) can both represent the set {1,2}. If we added deriving Eq to our declaration, we would get an instance of Eq that distinguishes things we want to be considered the same (outside of the implementation of the set operations).
It's important to make sure that types are only made instances of Eq (and Ord) when appropriate! It's very tempting to make something an instance of Ord just so you can stick it in a data structure that requires ordering, but if the ordering is not truly a total ordering of the abstract values, all manner of breakage may ensue. Unless the documentation absolutely guarantees it, for example, a function called sort :: Ord a => [a] -> [a] may not only be an unstable sort, but may not even produce a list containing all the Haskell values that go into it. For a type Bad whose Eq and Ord instances look only at the number, sort [Bad 1 "Bob", Bad 1 "James"] can reasonably produce [Bad 1 "Bob", Bad 1 "James"], [Bad 1 "James", Bad 1 "Bob"], [Bad 1 "James", Bad 1 "James"], or [Bad 1 "Bob", Bad 1 "Bob"]. All of these are perfectly legitimate. A function that uses unsafePerformIO in the back room to implement a Las Vegas-style randomized algorithm or to race threads against each other to get an answer from the fastest may even give different results at different times, as long as they're == to each other.
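A sketch of what such a Bad type might look like. The definition is an assumption made for illustration: its Eq and Ord instances compare the number and ignore the name.

```haskell
import Data.List (sort)

-- Hypothetical Bad type: equality and ordering ignore the String field.
data Bad = Bad Int String deriving Show

instance Eq Bad where
  Bad m _ == Bad n _ = m == n

instance Ord Bad where
  compare (Bad m _) (Bad n _) = compare m n

name :: Bad -> String
name (Bad _ s) = s

bob, james :: Bad
bob   = Bad 1 "Bob"
james = Bad 1 "James"

-- Whatever order sort picks, both inputs are "equal" as far as Ord can tell.
sortedNames :: [String]
sortedNames = map name (sort [bob, james])
```

Here bob == james is True even though name tells them apart, which is the kind of observable difference the Eq instance is promising does not exist.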
tl;dr: Making something an instance of Eq is a way of making a very strong statement to the world; don't make that statement if you don't mean it.
Your second Functor instance also doesn't make any sense. The biggest reason why Set can't be a Functor in Haskell is fmap can't have constraints. Inventing different notions of equality as StrongEq doesn't change the fact that you can't write those constraints on fmap in your Set instance.
fmap in general shouldn't have the constraints you need. It makes perfect sense to have functors of functions, for example (without it the whole notion of using Applicative to apply functions inside a functor breaks down), and functions can't be members of Eq or your StrongEq in general.
fmap can't have extra constraints on only some instances, because of code like this:
fmapBoth :: (Functor f, Functor g) => (a -> b, c -> d) -> (f a, g c) -> (f b, g d)
fmapBoth (h, j) (x, y) = (fmap h x, fmap j y)
This code claims to work regardless of the functors f and g, and regardless of the functions h and j. It has no way of checking whether one of the functors is a special one that has extra constraints on fmap, nor any way of checking whether one of the functions it's applying would violate those constraints.
Saying that Set is a Functor in Haskell is saying that there is a (lawful) operation fmap :: (a -> b) -> Set a -> Set b, with that exact type. That is precisely what Functor means. fmap :: (Eq a, Eq b) => (a -> b) -> Set a -> Set b is not an example of such an operation.
It is possible, I understand, to use the ConstraintKinds GHC extension to write a different Functor class that permits constraints on the element types, where the constraints vary by functor (and what you actually need is an Ord constraint, not just Eq). This blog post talks about doing so to make a new Monad class which can have an instance for Set. I've never played around with code like this, so I don't know much more than that the technique exists. It wouldn't help you hand off Sets to existing code that needs Functors, but you should be able to use it instead of Functor in your own code if you wish.
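A minimal sketch of what such a class could look like, assuming the usual associated-constraint encoding. RFunctor, Suitable and rfmap are illustrative names rather than a standard API, and ListSet is a toy duplicate-free list standing in for Set (so Eq suffices here, where a real Set would need Ord).

```haskell
{-# LANGUAGE ConstraintKinds, FlexibleContexts, KindSignatures, TypeFamilies #-}

import Data.Kind (Constraint, Type)
import Data.List (nub)

class RFunctor (f :: Type -> Type) where
  -- Each instance chooses the constraint its element types must satisfy.
  type Suitable f (a :: Type) :: Constraint
  rfmap :: (Suitable f a, Suitable f b) => (a -> b) -> f a -> f b

-- Ordinary lists need no constraint at all.
instance RFunctor [] where
  type Suitable [] a = ()
  rfmap = map

-- A toy duplicate-free collection standing in for Set.
newtype ListSet a = ListSet [a] deriving (Show, Eq)

instance RFunctor ListSet where
  type Suitable ListSet a = Eq a
  rfmap f (ListSet xs) = ListSet (nub (map f xs))
```

Code written against RFunctor has to carry the Suitable constraints in its own signatures, which is why this cannot be retrofitted onto existing Functor-polymorphic functions such as fmapBoth.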
This notion of StrongEq is tough. In general, equality is a place where computer science becomes significantly more rigorous than typical mathematics in the kind of way which makes things challenging.
In particular, typical mathematics likes to talk about objects as though they exist in a set and can be uniquely identified. Computer programs usually deal with types which are not always computable (as a simple counterexample, tell me what the set corresponding to the type data U = U (U -> U) is). This means that it may be undecidable as to whether two values are identifiable.
This becomes an enormous topic in dependently typed languages since typechecking requires identifying like types and dependently typed languages may have arbitrary values in their types and thus need a way to project equality.
So, StrongEq could be defined over a restricted part of Haskell containing only the types which can be decidably compared for equality. We can consider this a category with the arrows as computable functions and then see Set as an endofunctor from types to the type of sets of values of that type. Unfortunately, these restrictions have taken us far from standard Haskell and make defining StrongEq or Functor (Set a) a little less than practical.
While I've seen all kinds of weird things in Haskell sample code - I've never seen an operator plus being overloaded. Is there something special about it?
Let's say I have a type like Pair, and I want to have something like
Pair(2,4) + Pair(1,2) = Pair(3,6)
Can one do it in Haskell?
I am just curious, as I know it's possible in Scala in a rather elegant way.
Yes
(+) is part of the Num typeclass, and everyone seems to feel you can't define (*) etc for your type, but I strongly disagree.
newtype Pair a b = Pair (a,b) deriving (Eq,Show)
I think Pair a b would be nicer, or we could even just use the type (a,b) directly, but...
This is very much like the cartesian product of two Monoids, groups, rings or whatever in maths, and there's a standard way of defining a numeric structure on it, which would be sensible to use.
instance (Num a,Num b) => Num (Pair a b) where
  Pair (a,b) + Pair (c,d) = Pair (a+c,b+d)
  Pair (a,b) * Pair (c,d) = Pair (a*c,b*d)
  Pair (a,b) - Pair (c,d) = Pair (a-c,b-d)
  abs (Pair (a,b)) = Pair (abs a, abs b)
  signum (Pair (a,b)) = Pair (signum a, signum b)
  fromInteger i = Pair (fromInteger i, fromInteger i)
Now we've overloaded (+) in an obvious way, but also gone the whole hog and overloaded (*) and all the other Num functions in the same, obvious, familiar way mathematics does it for a pair. I just don't see the problem with this. In fact I think it's good practice.
*Main> Pair (3,4.0) + Pair (7, 10.5)
Pair (10,14.5)
*Main> Pair (3,4.0) + 1 -- *
Pair (4,5.0)
* - Notice that fromInteger is applied to numeric literals like 1, so this was interpreted in that context as Pair (1,1.0) :: Pair Integer Double. This is also quite nice and handy.
Overloading in Haskell is only available using type classes. In this case, (+) belongs to the Num type class, so you would have to provide a Num instance for your type.
However, Num also contains other functions, and a well-behaved instance should implement all of them in a consistent way, which in general will not make sense unless your type represents some kind of number.
So unless that is the case, I would recommend defining a new operator instead. For example,
data Pair a b = Pair a b
  deriving Show
infixl 6 |+| -- optional; set same precedence and associativity as +
Pair a b |+| Pair c d = Pair (a+c) (b+d)
You can then use it like any other operator:
> Pair 2 4 |+| Pair 1 2
Pair 3 6
I'll try to come at this question very directly, since you are keen on getting a straight "yes or no" on overloading (+). The answer is yes, you can overload it. There are two ways to overload it directly, without any other changes, and one way to overload it "correctly" which requires creating an instance of Num for your datatype. The correct way is elaborated on in the other answers, so I won't go over it.
Edit: Note that I'm not recommending the way discussed below, just documenting it. You should implement the Num typeclass and not anything I write here.
The first (and most "wrong") way to overload (+) is to simply hide the Prelude.+ function, and define your own function named (+) that operates on your datatype.
import Prelude hiding ((+)) -- hide the autoimport of +
import qualified Prelude as P -- allow us to refer to Prelude functions with a P prefix
data Pair a = Pair (a,a)
(+) :: Num a => Pair a -> Pair a -> Pair a -- redefinition of (+)
(Pair (a,b)) + (Pair (c,d)) = Pair ((P.+) a c,(P.+) b d ) -- using qualified (+) from Prelude
You can see here, we have to go through some contortions to hide the regular definition of (+) from being imported, but we still need a way to refer to it, since it's the only way to do fast machine addition (it's a primitive operation).
The second (slightly less wrong) way to do it is to define your own typeclass that only includes a new operator you name (+). You'll still have to hide the old (+) so Haskell doesn't get confused.
import Prelude hiding ((+))
import qualified Prelude as P
data Pair a = Pair (a,a)
class Addable a where
  (+) :: a -> a -> a
instance Num a => Addable (Pair a) where
  (Pair (a,b)) + (Pair (c,d)) = Pair ((P.+) a c,(P.+) b d )
This is a bit better than the first option because it allows you to use your new (+) for lots of different data types in your code.
But neither of these is recommended, because as you can see, it is very inconvenient to access the regular (+) operator that is defined in the Num typeclass. Even though Haskell allows you to redefine (+), all of the Prelude and the libraries are expecting the original (+) definition. Lucky for you, (+) is defined in a typeclass, so you can just make Pair an instance of Num. This is probably the best option, and it is what the other answerers have recommended.
The issue you are running into is that there are possibly too many functions defined in the Num typeclass (+ is one of them). This is just a historical accident, and the use of Num is now so widespread that it would be hard to change. Instead of being split out into separate typeclasses for each function (so they can be overridden separately), these functionalities are all glommed together. Ideally the Prelude would have an Addable typeclass, a Subtractable typeclass, etc. that allow you to define an instance for one operator at a time without having to implement everything that Num has in it.
Be that as it may, the fact is that you will be fighting an uphill battle if you want to write a new (+) just for your Pair data type. Too much of the other Haskell code depends on the Num typeclass and its current definition.
You might look into the Numeric Prelude if you are looking for a blue-sky reimplementation of the Prelude that tries to avoid some of the mistakes of the current one. You'll notice they've reimplemented the Prelude just as a library, no compiler hacking was necessary, though it's a huge undertaking.
Overloading in Haskell is made possible through type classes. For a good overview, you might want to look at this section in Learn You a Haskell.
The (+) operator is part of the Num type class from the Prelude (shown here as defined in the Haskell Report; GHC 7.4 and later drop the Eq and Show superclasses):
class (Eq a, Show a) => Num a where
  (+), (*), (-) :: a -> a -> a
  negate :: a -> a
  ...
So if you'd like a definition for + to work for pairs, you would have to provide an instance.
If you have a type:
data Pair a = Pair (a, a) deriving (Show, Eq)
Then you might have a definition like:
instance Num a => Num (Pair a) where
  Pair (x, y) + Pair (u, v) = Pair (x+u, y+v)
  ...
Punching this into ghci gives us:
*Main> Pair (1, 2) + Pair (3, 4)
Pair (4,6)
However, if you're going to give an instance for +, you should also be providing an instance for all of the other functions in that type class too, which might not always make sense.
If you only want a (+)-like operator rather than all the Num operators, what you probably have is a Monoid. For example, the Monoid instance for pairs looks like this:
instance (Monoid a, Monoid b) => Monoid (a, b) where
  mempty = (mempty, mempty)
  (a1, b1) `mappend` (a2, b2) = (a1 `mappend` a2, b1 `mappend` b2)
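You can make (++) an alias of mappend (after hiding the Prelude's list-only (++)), then you can write code like this. Note that plain numbers have no Monoid instance of their own, so they need a wrapper such as Sum:
(Sum 1, Sum 2) ++ (Sum 3, Sum 4) == (Sum 4, Sum 6)
("hel", "wor") ++ ("lo", "ld") == ("hello", "world")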
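The same componentwise combination is available without defining any alias, through the standard (<>) operator and the pair instance that base already ships. The name combined is made up for this sketch.

```haskell
import Data.Monoid (Sum(..))

-- Pairs combine componentwise, each component using its own Monoid.
combined :: (Sum Int, String)
combined = (Sum 1, "hel") <> (Sum 3, "lo")
```

The first component is added via Sum and the second is concatenated as a String, so one operator covers both behaviours.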