Is it a bad idea to define some, but not all methods of a type class in Haskell? - haskell

Want to do this:
data Point = Point Double Double
instance Num Point where
  (Point ax ay) + (Point bx by) = Point (ax + bx) (ay + by)
  negate (Point ax ay) = Point (negate ax) (negate ay)
Syntastic doesn't like it. Wants me to define methods for *, abs, fromInteger and signum.
What's a recommended way to get me a + and - operator defined for Point without taking on the rest?
Also, I could define some different kinds of scalar and vector multiplication like Double * Point or Point * Point in a cross-product or dot-product sense. But the type system isn't my friend here, right?

And + is the only operator you can think of? Why not +/? Or +.? What's wrong with them?
Yes, leaving members of class unimplemented is Bad Form.
As the last resort, you could define your own typeclass like this
import Prelude hiding ((+))
import qualified Prelude ((+))
infixl 6 +  -- same fixity as the Prelude operator being replaced
class Plus c where (+) :: c -> c -> c
instance Plus Double where (+) = (Prelude.+)
instance Plus Point where Point x1 y1 + Point x2 y2 = Point (x1 + x2) (y1 + y2)

The type system doesn't help here because Num isn't an appropriate type class for 2 dimensional points. From what you describe, you want to have a 2-dimensional vector space over Doubles.
I'd suggest having a look at Linear.V2 from the linear package. It provides an abstraction of vector spaces, with a fine-grained hierarchy of type classes. In particular, the vector space part is captured by Additive and the dot product by Metric. Of course, you could also take your definition of Point and just implement these type classes.
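As a rough, self-contained sketch of that idea for the Point type from the question: the operator names ^+^, ^-^, *^ and dot below are borrowed from the linear package, but here they are defined locally as plain functions (an assumption, so the snippet runs without extra dependencies) rather than as instances of the library's classes.

```haskell
data Point = Point Double Double deriving (Eq, Show)

-- Fixities chosen to mirror ordinary + and * (an assumption of this sketch).
infixl 6 ^+^, ^-^
infixl 7 *^

-- Vector addition, component by component.
(^+^) :: Point -> Point -> Point
Point ax ay ^+^ Point bx by = Point (ax + bx) (ay + by)

-- Vector subtraction, component by component.
(^-^) :: Point -> Point -> Point
Point ax ay ^-^ Point bx by = Point (ax - bx) (ay - by)

-- Scalar multiplication: note the two operands have different types,
-- which Num's (*) :: a -> a -> a could not express.
(*^) :: Double -> Point -> Point
k *^ Point x y = Point (k * x) (k * y)

-- Dot product: two points in, one scalar out.
dot :: Point -> Point -> Double
dot (Point ax ay) (Point bx by) = ax * bx + ay * by
```

Because each operation gets its own type, Double * Point and the dot product can coexist without fighting the type system.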

I'd say this has a two-sided answer:
It is definitely bad form not to define some of the methods of a class. I'd recommend not to do it.
Many people agree that Haskell's numeric class hierarchy is a wart. But the problem is that it's very difficult to design a proposal that everybody will agree to; you have to find a consensus balance between "very flexible" and "it takes a Math degree to add two Integers."
So you get things like the numeric-prelude package, but not widespread adoption.
However, if what you really care for is to be able to use the + and - as syntax, Haskell does not force you to use the default implementations; you can "opt out" of any of the default things in the Prelude library, like this:
module MyModule where
import Prelude hiding ((+), (-), negate)
import qualified Prelude as P
-- Same fixity as the Prelude operators being replaced.
infixl 6 +, -
class MyNum a where
  (+) :: a -> a -> a
  negate :: a -> a
  (-) :: a -> a -> a
  a - b = a + negate b
data Point = ...
instance MyNum Point where
  a + b = ...
  negate a = ...
-- Convenience instances for regular numeric types, so that you can write
-- things like `5 + 7` as well.
instance MyNum Integer where
  a + b = a P.+ b
  ...
In every module where you import this MyModule, you also need import Prelude hiding ((+), (-), negate), so that the Prelude's Num versions of these names do not clash with yours.
The downside to this is of course that this is highly unexpected for people reading your code.
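A filled-in version of the idea above might look like the following sketch. The Point definition and the Double instance are assumptions added for illustration; the original leaves those parts elided.

```haskell
import Prelude hiding ((+), (-), negate)
import qualified Prelude as P

-- Keep the usual precedence for the replacement operators.
infixl 6 +, -

class MyNum a where
  (+)    :: a -> a -> a
  negate :: a -> a
  (-)    :: a -> a -> a
  a - b = a + negate b   -- default: subtraction via negation

-- Hypothetical Point type, taken from the shape used in the question.
data Point = Point Double Double deriving (Eq, Show)

instance MyNum Point where
  Point ax ay + Point bx by = Point (ax P.+ bx) (ay P.+ by)
  negate (Point ax ay)      = Point (P.negate ax) (P.negate ay)

-- Convenience instance so ordinary Double arithmetic keeps working.
instance MyNum Double where
  (+)    = (P.+)
  negate = P.negate
```

With this in scope, Point 4 6 - Point 1 2 uses the default definition of (-) and needs no extra code in the Point instance.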

Related

Haskell: when to use type class and when to use concrete type?

When declaring functions, we can use either a type class or a concrete type (am I right?). So I can use "Num" as the type indicator, or "Int". I'm not sure whether "Int" has any definition outside of "Num". Can I define my own concrete type that "inherits" from "Num"?
I ask this from a Java/C# inheritance perspective, as I'm just beginning with Haskell. Could you give some hints?
From the OO perspective, a type class is something like an interface or a trait. In Haskell you generally use concrete types when you require a certain structure of the datatype (because you will unpack it), and a type class when you just want a certain behavior. For example, you can write
f :: Int -> Int -> Int
f x y = x + y
but you do not use the internal structure of Ints here; you just need something that supports addition, which is Num (note that it has (+) in its method list):
f :: Num a => a -> a -> a
f x y = x + y
And yes, of course, you can declare that your own type supports the Num interface. The documentation for Num lists the methods that make up its "minimal complete definition". This is what other functions that use Num will rely on. In a pinch, you may explicitly set some methods to undefined, but you'll get a runtime error if someone tries to call them:
data MyData = MyData Int Int
instance Num MyData where
  (MyData x1 y1) + (MyData x2 y2) = MyData (x1 + x2) (y1 + y2)
  (*) = undefined
  ...
See how the instance uses its knowledge of MyData's structure so that users of the type don't have to? If they want to add two values of this type together, they do not need to know how the data is arranged inside; they just need to know that it is an instance of Num.
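For comparison, here is a sketch of a complete instance with no undefined methods. Treating every operation component-wise is an assumption made for illustration; the original only shows (+). The helper sumAll is a hypothetical name, included to show a Num-polymorphic function working on both Int and MyData.

```haskell
data MyData = MyData Int Int deriving (Eq, Show)

-- Every method is defined component-wise, so nothing fails at runtime.
instance Num MyData where
  MyData x1 y1 + MyData x2 y2 = MyData (x1 + x2) (y1 + y2)
  MyData x1 y1 * MyData x2 y2 = MyData (x1 * x2) (y1 * y2)
  negate (MyData x y)         = MyData (negate x) (negate y)
  abs (MyData x y)            = MyData (abs x) (abs y)
  signum (MyData x y)         = MyData (signum x) (signum y)
  -- Lets integer literals such as 0 stand for a MyData value.
  fromInteger n               = MyData (fromInteger n) (fromInteger n)

-- Works for any Num instance; it never inspects the structure of its elements.
sumAll :: Num a => [a] -> a
sumAll = foldr (+) 0
```

Whether component-wise multiplication is meaningful depends on what MyData represents, which is exactly the design question the first thread raises for points and vectors.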

Why does the Num typeclass have an abs method?

I’m thinking about standard libraries (or preludes) for functional languages.
If I have Ord instance for n, then it is trivial to implement abs:
abs n = if n > 0 then n else (-n)
In the case of vector spaces, the absolute value (length) of a vector is very important. But the type doesn’t match because the absolute value of a vector is not a vector: it is a real number.
What was the design rationale behind having abs (or signum) as part of the Num typeclass?
Vectors are not good Num candidates. There's a dedicated class for those.
But Num has many useful instances for which there is no Ord. Basically, (Num, Ord) ≈ Real in Haskell, which hints quite clearly that the obvious non-Ord types are the higher division algebras, most notably Complex. Here, abs is again not quite perfect, because it could return a real number; but since the reals are a subset of the complex plane, returning a Complex is not wrong.
Other examples are more abstract types, e.g.
instance (Num n) => Num (a->n) where
  f+g = \x -> f x + g x
  ...
  abs f = abs . f
which is not Ord simply because you can't fully evaluate a function, only its return values. (This also prevents an Eq instance, so this is not legal in Haskell98 where Eq is a superclass of Num).
To address the question in the title: it is a bit disputed whether it was a good idea to put abs in Num. The numeric prelude has it as a completely separate class, which allows you to make, for example, vectors an instance of the other numeric classes but not of Absolute.C. The downside is that this results in a much more complicated class hierarchy, which often just isn't worth the effort.
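Both kinds of instance mentioned in this answer can be tried out in a short sketch. The names absExample and distanceFromFive are hypothetical, and the function instance fills in the elided methods in the obvious pointwise way, which is an assumption rather than something the answer spells out.

```haskell
import Data.Complex (Complex ((:+)))

-- Complex has a Num instance but no Ord instance. Its abs stays inside
-- Complex: the magnitude ends up in the real part.
absExample :: Complex Double
absExample = abs (3 :+ 4)

-- Pointwise Num instance for functions, completing the fragment above.
-- There is no sensible Ord (or Eq) instance to go with it.
instance Num n => Num (a -> n) where
  f + g         = \x -> f x + g x
  f * g         = \x -> f x * g x
  negate f      = negate . f
  abs f         = abs . f
  signum f      = signum . f
  fromInteger n = const (fromInteger n)

-- Here abs is applied to a function, so it composes with that function.
distanceFromFive :: Int -> Int
distanceFromFive = abs (subtract 5)
```

In both cases abs returns a value of the same type it was given, which is what the Num signature abs :: a -> a demands and why it cannot express a vector's length.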

Haskell: Prefer pattern-matching or member access?

Suppose I have a Vector datatype defined as follows:
data Vector = Vector { x :: Double
                     , y :: Double
                     , z :: Double
                     }
Would it be more usual to define functions against it using member access:
vecAddA v w
  = Vector (x v + x w)
           (y v + y w)
           (z v + z w)
Or using pattern-matching:
vecAddB (Vector vx vy vz) (Vector wx wy wz)
  = Vector (vx + wx)
           (vy + wy)
           (vz + wz)
(Apologies if I've got any of the terminology incorrect).
I would normally use pattern matching, especially since you're using all of the constructor's arguments and there aren't a lot of them. Also, in this example it's not an issue, but consider the following:
data Foo = A {a :: Int} | B {b :: String}
fun x = a x + 1
If you use pattern matching to do work on the Foo type, you're safe; it's not possible to access a member that doesn't exist. If you use accessor functions on the other hand, some operations such as calling fun (B "hi!") here will result in a runtime error.
EDIT: while it's of course quite possible to forget to match on some constructor, pattern matching makes it pretty explicit that what happens depends on what constructor is used (you can also tell the compiler to detect and warn you about incomplete patterns) whereas the use of a function hints more that any constructor goes, IMO.
Accessors are best saved for cases when you want to get at just one or a few of the constructor's (potentially many) arguments and you know that it's safe to use them (no risk of using an accessor on the wrong constructor, as in the example.)
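A minimal sketch of the pattern-matching alternative to fun, using the Foo type from above. The name funSafe and the choice to return Maybe are assumptions made for illustration.

```haskell
data Foo = A { a :: Int } | B { b :: String }

-- Each constructor is handled explicitly, so there is no way to reach
-- for a field that the value does not have.
funSafe :: Foo -> Maybe Int
funSafe (A n) = Just (n + 1)
funSafe (B _) = Nothing
```

Unlike fun, applying funSafe to B "hi!" gives Nothing rather than a runtime error, and the compiler can warn if a constructor is left out.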
Another minor "real world" argument: In general, it isn't a good idea to have such short record entry names, as short names like x and y often end up being used for local variables.
So the "fair" comparison here would be:
vecAddA v w
  = Vector (vecX v + vecX w) (vecY v + vecY w) (vecZ v + vecZ w)
vecAddB (Vector vx vy vz) (Vector wx wy wz)
  = Vector (vx + wx) (vy + wy) (vz + wz)
I think pattern matching wins out in most cases of this type. Some notable exceptions:
You only need to access (or change!) one or two fields in a larger record
You want to remain flexible to change the record later, such as add more fields.
This is an aesthetic preference since the two are semantically equivalent. Well, I suppose in a naive compiler the first one would be slower because of the function calls, but I have a hard time believing that would not be optimized away in real life.
Still, with only three elements in the record, since you're using all three anyway and there is presumably some significance to their order, I would use the second one. A second (albeit weaker) argument is that this way you're using the order for both composition and decomposition, rather than a mixture of order and field access.
(Alert, may be wrong. I am still a Haskell newbie, but here's my understanding)
One thing that other people have not mentioned is that pattern matching will make the function "strict" in its argument. (http://www.haskell.org/haskellwiki/Lazy_vs._non-strict)
To choose which pattern to use, the program must reduce the argument to WHNF before calling the function, whereas using the record-syntax accessor function would evaluate the argument inside the function.
I can't really give any concrete examples (still being a newbie) but this can have performance implications where huge piles of "thunks" can build up in recursive, non-strict functions. (That is to mean, for simple functions like extracting values, there should be no performance difference).
(Concrete examples very much welcome)
In short
f (Just x) = x
is actually (using BangPatterns)
f !jx = fromJust jx
Edit: The above is not a good example of strictness, because both are actually strict from definition (f bottom = bottom), just to illustrate what I meant from the performance side.
As kizzx2 pointed out, there is a subtle difference in strictness between vecAddA and vecAddB
vecAddA ⊥ ⊥ = Vector ⊥ ⊥ ⊥
vecAddB ⊥ ⊥ = ⊥
To get the same semantics when using pattern matching, one would have to use irrefutable patterns.
vecAddB' ~(Vector vx vy vz) ~(Vector wx wy wz)
  = Vector (vx + wx)
           (vy + wy)
           (vz + wz)
However, in this case, the fields of Vector should probably be strict to begin with for efficiency:
data Vector = Vector { x :: !Double
                     , y :: !Double
                     , z :: !Double
                     }
With strict fields, vecAddA and vecAddB are semantically equivalent.
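The difference can be observed directly with the lazy-field version of Vector from the question. This is only a sketch: the helper throwsWhenForced is a hypothetical name, and it uses undefined to stand in for ⊥.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Lazy fields, as in the original question.
data Vector = Vector { x :: Double, y :: Double, z :: Double }

-- Accessor version: builds the result without inspecting its arguments.
vecAddA :: Vector -> Vector -> Vector
vecAddA v w = Vector (x v + x w) (y v + y w) (z v + z w)

-- Pattern-matching version: must evaluate both arguments to match on them.
vecAddB :: Vector -> Vector -> Vector
vecAddB (Vector vx vy vz) (Vector wx wy wz) = Vector (vx + wx) (vy + wy) (vz + wz)

-- Irrefutable patterns defer the match until a component is demanded.
vecAddB' :: Vector -> Vector -> Vector
vecAddB' ~(Vector vx vy vz) ~(Vector wx wy wz) = Vector (vx + wx) (vy + wy) (vz + wz)

-- True if evaluating the value to weak head normal form throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced v = do
  r <- try (evaluate (v `seq` ())) :: IO (Either SomeException ())
  pure (either (const True) (const False) r)
```

Applied to two undefined arguments, only vecAddB should throw when its result is forced to weak head normal form; the other two return a Vector constructor whose fields are still unevaluated.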
Hackage package vect solves both these problems by allowing matching like f (Vec3 x y z) and indexing like:
get1 :: Vec3 -> Float
get1 v = _1 v
Look up HasCoordinates class.
http://hackage.haskell.org/packages/archive/vect/0.4.7/doc/html/Data-Vect-Float-Base.html

Haskell - add typeclass?

Consider the following example:
data Dot = Dot Double Double
data Vector = Vector Double Double
First, I would like to overload the + operator for Vector addition. If I wanted to overload the equality (==) operator, I would write it like:
instance Eq Vector where ...blahblahblah
But I can't find out whether there is an Add typeclass to make Vector behave like a type with an addition operation. I can't even find a complete list of Haskell typeclasses; I know only a few from different tutorials. Does such a list exist?
Also, can I overload the + operator for adding a Vector to a Dot (it seems rather logical, doesn't it)?
An easy way to discover information about which typeclass (if any) a function belongs to is to use GHCi:
Prelude> :i (+)
class (Eq a, Show a) => Num a where
  (+) :: a -> a -> a
  ...
        -- Defined in GHC.Num
infixl 6 +
The operator + in Prelude is defined by the typeclass Num. However, as the name suggests, this defines not only addition but also a lot of other numeric operations (in particular the other arithmetic operators, as well as the ability to use numeric literals), so this doesn't fit your use case.
There is no way to overload just + for your type, unless you want to hide Prelude's + operator (which would mean you have to create your own Addable instance for Integer, Double etc. if you still want to be able to use + on numbers).
You can write an instance Num Vector to overload + for vector addition (and the other operators that make sense).
instance Num Vector where
  (Vector x1 y1) + (Vector x2 y2) = Vector (x1 + x2) (y1 + y2)
  -- and so on
However, note that + has the type Num a => a -> a -> a, i.e. both operands and the result all have to be the same type. This means that you cannot have a Dot plus a Vector be a Dot.
While you can hide Num from the Prelude and specify your own +, this is likely to cause confusion and make it harder to use your code together with regular arithmetic.
I suggest you define your own operator for vector-point addition, for example
(Dot x y) `offsetBy` (Vector dx dy) = Dot (x + dx) (y + dy)
or some variant using symbols if you prefer something shorter.
I sometimes see people defining their own operators that kind of look like ones from the Prelude. Even ++ probably uses that symbol because they wanted something that conveyed the idea of "adding" two lists together, but it didn't make sense for lists to be an instance of Num. So you could use <+> or |+| or something.
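A small sketch of that approach, using the Dot and Vector types from the question. The operator names <+> and |+| and their fixity are arbitrary choices for illustration, not an established convention.

```haskell
data Dot = Dot Double Double deriving (Eq, Show)
data Vector = Vector Double Double deriving (Eq, Show)

-- Give the new operators the same precedence as ordinary +.
infixl 6 <+>, |+|

-- Vector plus vector gives a vector.
(<+>) :: Vector -> Vector -> Vector
Vector x1 y1 <+> Vector x2 y2 = Vector (x1 + x2) (y1 + y2)

-- Dot plus vector gives a dot: the operand types differ, which the
-- Prelude's (+) :: Num a => a -> a -> a cannot express.
(|+|) :: Dot -> Vector -> Dot
Dot x y |+| Vector dx dy = Dot (x + dx) (y + dy)
```

An expression such as Dot 0 0 |+| (Vector 1 2 <+> Vector 3 4) then type-checks, while accidentally adding two Dots is rejected at compile time.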

How to match rigid types in a type class instance?

I thought I would try modeling some numerical integration on vector quantities of different dimensionality, and figured that type classes were the way to go. I needed something to define the difference between two values and to scale it by a multiplier (to get the derivative), as well as being able to take the distance function.
So far I have:
class Integratable a where
  difference :: a -> a -> a
  scale :: Num b => a -> b -> a
  distance :: Num b => a -> a -> b
data Num a => Vector a = Vector1D a | Vector2D a a
instance Num a => Integratable (Vector a) where
  difference (Vector1D x1) (Vector1D x2) = Vector1D (x1 - x2)
  scale (Vector1D x) m = Vector1D (x * m)
  distance (Vector1D x1) (Vector1D x2) = x1 - x2
  difference (Vector2D x1 y1) (Vector2D x2 y2) = Vector2D (x1 - x2) (y1 - y2)
  scale (Vector2D x y) m = Vector2D (x * m) (y * m)
  distance (Vector2D x1 y1) (Vector2D x2 y2) = sqrt((x1-x2)*(x1-x2)
                                                   + (y1-y2)*(y1-y2))
Unfortunately there are a couple of problems here that I haven't figured out how to resolve. Firstly, the scale function gives errors. GHC can't tell that m and x are compatible since the rigid type restriction Num is given in the instance in one case, and in the Vector type in the other case... Is there a way to specify that x and m are the same type?
(I realize in fact that even if x and m are both Num, they may not be the same Num. How can I specify this? If I can't figure it out with Num, using Double would be fine, but I'd rather keep it general.)
There's a similar problem with distance. Attempting to specify that the return type is Num fails, since it can't tell in the instance definition that a is going to contain values that are compatible with b.
EDIT: It seems to me now that the article on functional dependencies from the HaskellWiki provides the key information in the best form that I can find, so I'd suggest reading that instead of my answer here. I'm not removing the rest of the content, though, as it makes clear (I hope) why FDs are useful here.
Apart from the grouping of definitions issue which Dave pointed out...
(I realize in fact that even if x and m are both Num, they may not be the same Num. How can I specify this? If I can't figure it out with Num, using Double would be fine, but I'd rather keep it general.)
This is the main problem, actually. You can't multiply an Integer by a Float, say. In effect, you need the x and the m in scale to be of the same type.
Also, a similar issue arises with distance, with the additional complication that sqrt needs a Floating argument. So I guess you'd need to mention that somewhere too. (Most likely on the instance, I guess).
EDIT: OK, since sqrt only works on Floating values, you could roll a typeclass for those to upcast Floats to Doubles when needed.
Another idea involves having a typeclass Scalable:
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
data Vector a = Vector1D a | Vector2D a a deriving (Show)
class Scalable a b | a -> b where
  scale :: a -> b -> a
instance (Num a) => Scalable (Vector a) a where
  scale (Vector1D x) m = (Vector1D (x * m))
  scale (Vector2D x y) m = (Vector2D (x * m) (y * m))
This uses a so-called functional dependency in the definition of Scalable. In fact, trying to remember the syntax for that, I found this link... So I guess you should disregard my inferior attempt at being helpful and read the quality info there. ;-)
I think you should be able to use this to solve your original problem.
To fix the second error, I think you need to reorder your definitions in the instance declaration. First have the two equations for difference, then the equations for scale, then both for distance.
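Putting both suggestions together, the whole Integratable class could be sketched as follows, with the equations for each method grouped and a functional dependency tying the scalar type to the vector type. The Floating constraint, the use of abs for the 1D distance, and the error cases for mismatched dimensions are assumptions of this sketch, not something the original code specifies.

```haskell
{-# LANGUAGE FlexibleInstances, FunctionalDependencies, MultiParamTypeClasses #-}

data Vector a = Vector1D a | Vector2D a a deriving (Eq, Show)

-- The dependency v -> s says the vector type determines its scalar type,
-- so x and m in scale are guaranteed to have the same type.
class Integratable v s | v -> s where
  difference :: v -> v -> v
  scale      :: v -> s -> v
  distance   :: v -> v -> s

-- Floating is required because distance uses sqrt.
instance Floating a => Integratable (Vector a) a where
  difference (Vector1D x1) (Vector1D x2)       = Vector1D (x1 - x2)
  difference (Vector2D x1 y1) (Vector2D x2 y2) = Vector2D (x1 - x2) (y1 - y2)
  difference _ _ = error "difference: mismatched dimensions"

  scale (Vector1D x) m   = Vector1D (x * m)
  scale (Vector2D x y) m = Vector2D (x * m) (y * m)

  distance (Vector1D x1) (Vector1D x2) = abs (x1 - x2)
  distance (Vector2D x1 y1) (Vector2D x2 y2) =
    sqrt ((x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2))
  distance _ _ = error "distance: mismatched dimensions"
```

Encoding the dimension in a single type with two constructors is what forces the runtime error cases here; separate types per dimension would let the compiler rule them out instead.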
