Haskell - add typeclass? - haskell

Consider the following example:
data Dot = Dot Double Double
data Vector = Vector Double Double
First, I would like to overload the + operator for Vector addition. If I wanted to overload the equality (==) operator, I would write it like:
instance Eq Vector where ...blahblahblah
But I can't find out whether there is an Add typeclass that would make Vector behave like a type with an addition operation. I can't even find a complete list of Haskell typeclasses; I know only a few from different tutorials. Does such a list exist?
Also, can I overload the + operator for adding a Vector to a Dot (it seems rather logical, doesn't it)?

An easy way to discover information about which typeclass (if any) a function belongs to is to use GHCi:
Prelude> :i (+)
class (Eq a, Show a) => Num a where
  (+) :: a -> a -> a
  ...
    -- Defined in GHC.Num
infixl 6 +

The operator + in Prelude is defined by the typeclass Num. However, as the name suggests, this defines not only addition but also a lot of other numeric operations (in particular the other arithmetic operators, as well as the ability to use numeric literals), so it doesn't fit your use case.
There is no way to overload just + for your type, unless you want to hide Prelude's + operator (which would mean you have to create your own Addable instance for Integer, Double etc. if you still want to be able to use + on numbers).

You can write an instance Num Vector to overload + for vector addition (and the other operators that make sense).
instance Num Vector where
  (Vector x1 y1) + (Vector x2 y2) = Vector (x1 + x2) (y1 + y2)
  -- and so on
However, note that + has the type Num a => a -> a -> a, i.e. both operands and the result all have to be the same type. This means that you cannot have a Dot plus a Vector be a Dot.
While you can hide Num from the Prelude and specify your own +, this is likely to cause confusion and make it harder to use your code together with regular arithmetic.
I suggest you define your own operator for vector-point addition, for example
(Dot x y) `offsetBy` (Vector dx dy) = Dot (x + dx) (y + dy)
or some variant using symbols if you prefer something shorter.

I sometimes see people defining their own operators that kind of look like ones from the Prelude. Even ++ probably uses that symbol because they wanted something that conveyed the idea of "adding" two lists together, but it didn't make sense for lists to be an instance of Num. So you could use <+> or |+| or something.
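As a rough sketch of that suggestion, here is what two custom operators could look like. The operator names `<+>` and `.+>` and the fixity are my own picks for illustration, not anything standard:

```haskell
data Dot = Dot Double Double deriving (Eq, Show)
data Vector = Vector Double Double deriving (Eq, Show)

-- Same precedence and associativity as (+), so that
-- p .+> v .+> w applies the offsets from left to right.
infixl 6 <+>, .+>

-- Vector plus vector (hypothetical operator name).
(<+>) :: Vector -> Vector -> Vector
Vector x1 y1 <+> Vector x2 y2 = Vector (x1 + x2) (y1 + y2)

-- Point offset by a vector (hypothetical operator name).
(.+>) :: Dot -> Vector -> Dot
Dot x y .+> Vector dx dy = Dot (x + dx) (y + dy)
```

Because the two operators have different types, the compiler rejects nonsense such as adding two Dots, which a single overloaded + of type a -> a -> a could not express anyway.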

Related

Is it a bad idea to define some, but not all methods of a type class in Haskell?

Want to do this:
data Point = Point Double Double
instance Num Point where
  (Point ax ay) + (Point bx by) = Point (ax + bx) (ay + by)
  negate (Point ax ay) = Point (negate ax) (negate ay)
Syntastic doesn't like it. Wants me to define methods for *, abs, fromInteger and signum.
What's a recommended way to get me a + and - operator defined for Point without taking on the rest?
Also, I could define some different kinds of scalar and vector multiplication like Double * Point or Point * Point in a cross-product or dot-product sense. But the type system isn't my friend here, right?
And + is the only operator you can think of? Why not +/? Or +.? What's wrong with them?
Yes, leaving members of class unimplemented is Bad Form.
As the last resort, you could define your own typeclass like this
import Prelude hiding ((+))
import qualified Prelude ((+))
class Plus c where (+) :: c -> c -> c
instance Plus Double where (+) = (Prelude.+)
instance Plus Point where Point x1 y1 + Point x2 y2 = Point (x1 + x2) (y1 + y2)
The type system doesn't help here because Num isn't an appropriate type class for 2 dimensional points. From what you describe, you want to have a 2-dimensional vector space over Doubles.
I'd suggest you have a look at Linear.V2. It provides an abstraction of vector spaces, with a finely grained hierarchy of type classes. In particular, the vector space part is captured by Additive and the dot product by Metric. Of course, you could also take your definition of Point and just implement these type classes.
I'd say this has a two-sided answer:
It is definitely bad form not to define some of the methods of a class. I'd recommend not to do it.
Many people agree that Haskell's numeric class hierarchy is a wart. But the problem is that it's very difficult to design a proposal that everybody will agree to; you have to find a consensus balance between "very flexible" and "it takes a Math degree to add two Integers."
So you get things like the numeric-prelude package, but not widespread adoption.
However, if what you really care for is to be able to use the + and - as syntax, Haskell does not force you to use the default implementations; you can "opt out" of any of the default things in the Prelude library, like this:
module MyModule where
import Prelude hiding ((+), (-), negate)
import qualified Prelude as P
class MyNum a where
  (+) :: a -> a -> a
  negate :: a -> a
  (-) :: a -> a -> a
  a - b = a + negate b
data Point = ...
instance MyNum Point where
  a + b = ...
  negate a = ...
-- Convenience instances for regular numeric types, so that you can write
-- things like `5 + 7` as well.
instance MyNum Integer where
  a + b = a P.+ b
  ...
In every module where you import this MyModule you have to have the import Prelude hiding ((+), (-), negate) in order to hide the default Num implementations.
The downside to this is of course that this is highly unexpected for people reading your code.
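For reference, a filled-in version of that idea might look like the sketch below. The Point definition, the Double instance and the fixity line are my assumptions; note that hiding the Prelude operators also hides their fixity, so without the infixl declaration the new + and - would fall back to the default infixl 9.

```haskell
import Prelude hiding ((+), (-), negate)
import qualified Prelude as P

-- Restore the usual fixity; the Prelude's declaration is hidden
-- together with the operators themselves.
infixl 6 +, -

class MyNum a where
  (+)    :: a -> a -> a
  negate :: a -> a
  (-)    :: a -> a -> a
  a - b = a + negate b   -- default: subtraction via negation

-- Assumed definition of Point, taken from the question.
data Point = Point Double Double deriving (Eq, Show)

instance MyNum Point where
  Point ax ay + Point bx by = Point (ax + bx) (ay + by)
  negate (Point ax ay)      = Point (negate ax) (negate ay)

-- Convenience instances so that + still works on ordinary numbers.
instance MyNum Double where
  (+)    = (P.+)
  negate = P.negate

instance MyNum Integer where
  (+)    = (P.+)
  negate = P.negate
```

The Point instance relies on the Double instance for its components, which is why the convenience instances are not optional here.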

Why does the Num typeclass have an abs method?

I’m thinking about standard libraries (or preludes) for functional languages.
If I have an Ord instance for n, then it is trivial to implement abs:
abs n = if n > 0 then n else (-n)
In the case of vector spaces, the absolute value (length) of a vector is very important. But the type doesn’t match because the absolute value of a vector is not a vector: it is a real number.
What was the design rationale behind having abs (or signum) as part of the Num typeclass?
Vectors are not good Num candidates. There's a dedicated class for those.
But Num has many useful instances for which there is no Ord. Basically, (Num, Ord) ≈ Real in Haskell, which hints quite clearly that the obvious non-Ord types are the higher division algebras, first and foremost Complex. Here, abs is again not quite perfect because it could return a real number, but as the reals are a subset of the complex plane, returning Complex is not wrong.
Other examples are more abstract types, e.g.
instance (Num n) => Num (a->n) where
  f+g = \x -> f x + g x
  ...
  abs f = abs . f
which is not Ord simply because you can't fully evaluate a function, only its return values. (This also prevents an Eq instance, so this is not legal in Haskell98 where Eq is a superclass of Num).
To address the question in the title: it is a bit disputed whether it was a good idea to put abs in Num. The numeric prelude has it as a completely separate class, which allows you to make e.g. vectors an instance of the other num-classes, but not of Absolute.C. The downside is that this results in a much more complicated class hierarchy, which often just isn't worth the effort.
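To make the function example concrete, here is one way the elided methods could be filled in. This is only a sketch: it assumes a GHC recent enough that Num no longer has Eq and Show as superclasses, and the pointwise choice for each method (including fromInteger as a constant function) is just the obvious option, not the only one.

```haskell
-- Pointwise arithmetic on functions into a numeric type.
instance Num n => Num (a -> n) where
  f + g         = \x -> f x + g x
  f - g         = \x -> f x - g x
  f * g         = \x -> f x * g x
  negate f      = negate . f
  abs f         = abs . f
  signum f      = signum . f
  fromInteger k = const (fromInteger k)

-- With the instance in scope, functions can be added directly.
wave :: Double -> Double
wave = sin + cos
```

Here abs has a perfectly sensible meaning even though no Ord instance for functions can exist.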

How does Haskell actually define the + function?

While reading Real World Haskell I came across this note:
ghci> :info (+)
class (Eq a, Show a) => Num a where
  (+) :: a -> a -> a
  ...
    -- Defined in GHC.Num
infixl 6 +
But how can Haskell define + as a non-native function? At some level you have to say that 2 + 3 will become assembler i.e. machine code.
The + function is overloaded, and for some types, like Int and Double, the definition of + is something like
instance Num Int where
  x + y = primAddInt x y
where primAddInt is a function the compiler knows about and will generate machine code for.
The details of how this looks and works depends on the Haskell implementation you're looking at.
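In GHC specifically, you can see roughly what this looks like by using the unboxed primitives yourself. The sketch below is GHC-only (it needs the MagicHash extension), and addInt is a name I made up; it mirrors the shape of the real instance rather than quoting it.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), (+#))

-- An Int is a box (the I# constructor) around an unboxed machine integer.
-- (+#) is a primitive operation that the compiler turns into a machine add.
addInt :: Int -> Int -> Int
addInt (I# x) (I# y) = I# (x +# y)
```

So the overloaded + is ordinary Haskell all the way down until it reaches a handful of primitives like (+#) that the compiler knows how to translate.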
It is in fact possible to define numbers without ANY native primitives. There are many ways, but the simplest is:
data Peano = Z | S Peano
Then you can define instance Num for this type using pattern-matching. The second common representation of numbers is so called Church encoding using only functions (all numbers will be represented by some obscure functions, and + will 'add' two functions together to form third one).
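A possible Num instance for Peano, written purely with pattern matching, could look like this. Since naturals have no negatives, the truncated subtraction and the treatment of negative literals as Z are my own choices; toInt is a helper added only for inspection.

```haskell
data Peano = Z | S Peano deriving (Eq, Show)

instance Num Peano where
  Z   + n = n
  S m + n = S (m + n)

  Z   * _ = Z
  S m * n = n + m * n

  -- Truncated subtraction: results below zero are clamped to Z (assumption).
  m   - Z   = m
  Z   - _   = Z
  S m - S n = m - n

  abs = id

  signum Z = Z
  signum _ = S Z

  -- Negative literals are mapped to Z (assumption).
  fromInteger k
    | k <= 0    = Z
    | otherwise = S (fromInteger (k - 1))

-- Convert back to a machine Int for display.
toInt :: Peano -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n
```

No primitive arithmetic is used on the Peano side; the only built-in numbers involved are the Integer in fromInteger and the Int in toInt.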
Very interesting encodings are possible indeed. For example, you can represent arbitrary precision reals in [0,1) using sequences of bits:
data RealReal = RealReal Bool RealReal | RealEnd
In GHC of course it is defined in a machine-specific way by using either primitives or FFI.

Overloading (+)

I am trying to define a Vector3 data type in Haskell and allow the (+) operator to be used on it. I tried the following:
data Vector3 = Vector3 Double Double Double
Vector3 x y z + Vector3 x' y' z' = Vector3 (x+x') (y+y') (z+z')
But ghci complains about ambiguous occurrence of (+). I do not understand why the occurrence is ambiguous; surely the type checker can infer that x, x', y etc have type Double and hence the correct operator to use for them is Prelude.+?
I know that I could make Vector3 an instance of the Num typeclass, but that is too restrictive for me; I do not want to define multiplication of a vector by another vector.
The only way to overload a name in Haskell is to use type classes, so you have three choices:
Make Vector an instance of Num and just have multiplication return an error.
Use something like the numeric prelude, which defines more fine-grained numeric classes.
Pick some other name like .+. or something similar for vector addition.
"I know that I could make Vector3 an instance of the Num typeclass, but that is too restrictive for me; I do not want to define multiplication of a vector by another vector."
That would be the easiest solution, though. You can define multiplication as
(*) = error "vector multiplication not implemented"
Think of the vector operations that you would get for free!
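A sketch of what such an instance might look like in full is below. Treating an integer literal as a vector with that value in every component is my assumption; it makes 0 the zero vector, which is what lets functions like sum work.

```haskell
data Vector3 = Vector3 Double Double Double deriving (Eq, Show)

instance Num Vector3 where
  Vector3 x y z + Vector3 x' y' z' = Vector3 (x + x') (y + y') (z + z')
  negate (Vector3 x y z)           = Vector3 (negate x) (negate y) (negate z)

  -- Assumption: an integer literal fills every component, so 0 is the zero vector.
  fromInteger n = Vector3 d d d where d = fromInteger n

  -- Deliberately unsupported operations fail at run time.
  (*)    = error "vector multiplication not implemented"
  abs    = error "abs not implemented for Vector3"
  signum = error "signum not implemented for Vector3"
```

With this in place, subtraction comes from the class default and sum [v1, v2, v3] works without further code. The price is that v * w type-checks and only fails when evaluated.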

Haskell: Prefer pattern-matching or member access?

Suppose I have a Vector datatype defined as follows:
data Vector = Vector { x :: Double
                     , y :: Double
                     , z :: Double
                     }
Would it be more usual to define functions against it using member access:
vecAddA v w
  = Vector (x v + x w)
           (y v + y w)
           (z v + z w)
Or using pattern-matching:
vecAddB (Vector vx vy vz) (Vector wx wy wz)
  = Vector (vx + wx)
           (vy + wy)
           (vz + wz)
(Apologies if I've got any of the terminology incorrect).
I would normally use pattern matching, especially since you're using all of the constructor's arguments and there aren't a lot of them. Also, in this example it's not an issue, but consider the following:
data Foo = A {a :: Int} | B {b :: String}
fun x = a x + 1
If you use pattern matching to do work on the Foo type, you're safe; it's not possible to access a member that doesn't exist. If you use accessor functions on the other hand, some operations such as calling fun (B "hi!") here will result in a runtime error.
EDIT: while it's of course quite possible to forget to match on some constructor, pattern matching makes it pretty explicit that what happens depends on what constructor is used (you can also tell the compiler to detect and warn you about incomplete patterns) whereas the use of a function hints more that any constructor goes, IMO.
Accessors are best saved for cases when you want to get at just one or a few of the constructor's (potentially many) arguments and you know that it's safe to use them (no risk of using an accessor on the wrong constructor, as in the example.)
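To illustrate the difference on the Foo example, here is a small sketch. The names funUnsafe and funSafe, and the decision to return Nothing for B, are mine:

```haskell
data Foo = A { a :: Int } | B { b :: String }

-- Partial: the accessor `a` is only defined for the A constructor, so
-- funUnsafe (B "hi!") compiles but fails at run time.
funUnsafe :: Foo -> Int
funUnsafe x = a x + 1

-- Total: every constructor is handled explicitly
-- (hypothetical variant that returns Nothing for B).
funSafe :: Foo -> Maybe Int
funSafe (A n) = Just (n + 1)
funSafe (B _) = Nothing
```

If the B clause of funSafe were left out, GHC's -Wincomplete-patterns warning would point at the missing case, whereas nothing in funUnsafe gives the compiler a chance to complain.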
Another minor "real world" argument: In general, it isn't a good idea to have such short record entry names, as short names like x and y often end up being used for local variables.
So the "fair" comparison here would be:
vecAddA v w
  = Vector (vecX v + vecX w) (vecY v + vecY w) (vecZ v + vecZ w)
vecAddB (Vector vx vy vz) (Vector wx wy wz)
  = Vector (vx + wx) (vy + wy) (vz + wz)
I think pattern matching wins out in most cases of this type. Some notable exceptions:
You only need to access (or change!) one or two fields in a larger record
You want to remain flexible to change the record later, such as add more fields.
This is an aesthetic preference since the two are semantically equivalent. Well, I suppose in a naive compiler the first one would be slower because of the function calls, but I have a hard time believing that would not be optimized away in real life.
Still, with only three elements in the record, since you're using all three anyway and there is presumably some significance to their order, I would use the second one. A second (albeit weaker) argument is that this way you're using the order for both composition and decomposition, rather than a mixture of order and field access.
(Alert, may be wrong. I am still a Haskell newbie, but here's my understanding)
One thing that other people have not mentioned is that pattern matching will make the function "strict" in its argument. (http://www.haskell.org/haskellwiki/Lazy_vs._non-strict)
To choose which pattern to use, the program must reduce the argument to WHNF before calling the function, whereas using the record-syntax accessor function would evaluate the argument inside the function.
I can't really give any concrete examples (still being a newbie) but this can have performance implications where huge piles of "thunks" can build up in recursive, non-strict functions. (That is to mean, for simple functions like extracting values, there should be no performance difference).
(Concrete examples very much welcome)
In short
f (Just x) = x
is actually (using BangPatterns)
f !jx = fromJust jx
Edit: The above is not a good example of strictness, because both are actually strict from definition (f bottom = bottom), just to illustrate what I meant from the performance side.
As kizzx2 pointed out, there is a subtle difference in strictness between vecAddA and vecAddB
vecAddA ⊥ ⊥ = Vector ⊥ ⊥ ⊥
vecAddB ⊥ ⊥ = ⊥
To get the same semantics when using pattern matching, one would have to use irrefutable patterns.
vecAddB' ~(Vector vx vy vz) ~(Vector wx wy wz)
  = Vector (vx + wx)
           (vy + wy)
           (vz + wz)
However, in this case, the fields of Vector should probably be strict to begin with for efficiency:
data Vector = Vector { x :: !Double
                     , y :: !Double
                     , z :: !Double
                     }
With strict fields, vecAddA and vecAddB are semantically equivalent.
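The difference with lazy fields can be observed directly. The following sketch forces each result to weak head normal form and reports whether that throws; throwsWhenForced is a helper I made up for the purpose.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Lazy fields, as in the original question.
data Vector = Vector { x :: Double, y :: Double, z :: Double }

vecAddA :: Vector -> Vector -> Vector
vecAddA v w = Vector (x v + x w) (y v + y w) (z v + z w)

vecAddB :: Vector -> Vector -> Vector
vecAddB (Vector vx vy vz) (Vector wx wy wz) = Vector (vx + wx) (vy + wy) (vz + wz)

-- True if evaluating the value to weak head normal form throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  return (either (const True) (const False) r)
```

Under these definitions, throwsWhenForced (vecAddA undefined undefined) should give False, because the result is already a Vector constructor with unevaluated fields, while throwsWhenForced (vecAddB undefined undefined) should give True, because the pattern match has to inspect the arguments first.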
Hackage package vect solves both these problems by allowing matching like f (Vec3 x y z) and indexing like:
get1 :: Vec3 -> Float
get1 v = _1 v
Look up HasCoordinates class.
http://hackage.haskell.org/packages/archive/vect/0.4.7/doc/html/Data-Vect-Float-Base.html
