Why does the Num typeclass have an abs method? - haskell

I’m thinking about standard libraries (or preludes) for functional languages.
If I have Ord instance for n, then it is trivial to implement abs:
abs n = if n > 0 then n else (-n)
In the case of vector spaces, the absolute value (length) of a vector is very important. But the type doesn’t match because the absolute value of a vector is not a vector: it is a real number.
What was the design rationale behind having abs (or signum) as part of the Num typeclass?

Vectors are not good Num candidates. There's a dedicated class for those.
But Num has many useful instances for which there is no Ord. Basically, (Num, Ord) ≈ Real in Haskell, which hints quite clearly that the obvious non-Ord types are the higher division algebras, foremostly Complex. Here, abs is again not quite perfect because it could return a real number, but as these are a subset of the complex plane returning Complex is not wrong.
Other examples are more abstract types, e.g.
instance (Num n) => Num (a->n) where
f+g = \x -> f x + g x
...
abs f = abs . f
which is not Ord simply because you can't fully evaluate a function, only its return values. (This also prevents an Eq instance, so this is not legal in Haskell98 where Eq is a superclass of Num).
To adress the question in the title: it is a bit disputed whether it was a good idea to put abs in Num. The numeric prelude has it as a complete seperate class, which allows you to make e.g. also vectors an instance of the other num-classes, but not of Absolute.C. The downside is that this results in a much more complicated class hierarchy, which often just isn't worth the effort.

Related

(ML) Modules vs (Haskell) Type Classes

According to Harper (https://existentialtype.wordpress.com/2011/04/16/modules-matter-most/), it seems that Type Classes simply do not offer the same level of abstraction that Modules offer and I'm having a hard time exactly figuring out why. And there are no examples in that link, so it's hard for me to see the key differences. There are also other papers on how to translate between Modules and Type Classes (http://www.cse.unsw.edu.au/~chak/papers/modules-classes.pdf), but this doesn't really have anything to do with the implementation in the programmer's perspective (it just says that there isn't something one can do that the other can't emulate).
Specifically, in the first link:
The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference.
I don't understand either. A type can implement a type class in more than 1 way in ML? How would you have the integers ordered by divisibility by example without creating a new type? In Haskell, you would have to do something like use data and have the instance Ord to offer an alternative ordering.
And the second one, aren't the two are distinct in Haskell?
Specifying "when such a specification should be used during type inference" can be done by something like this:
blah :: BlahType b => ...
where BlahType is the class being used during the type inference and NOT the implementing class. Whereas, "how a type implements a type class" is done using instance.
Can some one explain what the link is really trying to say? I'm just not quite understanding why Modules would be less restrictive than Type Classes.
To understand what the article is saying, take a moment to consider the Monoid typeclass in Haskell. A monoid is any type, T, which has a function mappend :: T -> T -> T and identity element mempty :: T for which the following holds.
a `mappend` (b `mappend` c) == (a `mappend` b) `mappend` c
a `mappend` mempty == mempty `mappend` a == a
There are many Haskell types which fit this definition. One example that springs immediately to mind are the integers, for which we can define the following.
instance Monoid Integer where
mappend = (+)
mempty = 0
You can confirm that all of the requirements hold.
a + (b + c) == (a + b) + c
a + 0 == 0 + a == a
Indeed, the those conditions hold for all numbers over addition, so we can define the following as well.
instance Num a => Monoid a where
mappend = (+)
mempty = 0
So now, in GHCi, we can do the following.
> mappend 3 5
8
> mempty
0
Particularly observant readers (or those with a background in mathemetics) will probably have noticed by now that we can also define a Monoid instance for numbers over multiplication.
instance Num a => Monoid a where
mappend = (*)
mempty = 1
a * (b * c) == (a * b) * c
a * 1 == 1 * a == a
But now the compiler encounters a problem. Which definiton of mappend should it use for numbers? Does mappend 3 5 equal 8 or 15? There is no way for it to decide. This is why Haskell does not allow multiple instances of a single typeclass. However, the issue still stands. Which Monoid instance of Num should we use? Both are perfectly valid and make sense for certain circumstances. The solution is to use neither. If you look Monoid in Hackage, you will see that there is no Monoid instance of Num, or Integer, Int, Float, or Double for that matter. Instead, there are Monoid instances of Sum and Product. Sum and Product are defined as follows.
newtype Sum a = Sum { getSum :: a }
newtype Product a = Product { getProduct :: a }
instance Num a => Monoid (Sum a) where
mappend (Sum a) (Sum b) = Sum $ a + b
mempty = Sum 0
instance Num a => Monoid (Product a) where
mappend (Product a) (Product b) = Product $ a * b
mempty = Product 1
Now, if you want to use a number as a Monoid you have to wrap it in either a Sum or Product type. Which type you use determines which Monoid instance is used. This is the essence of what the article was trying to describe. There is no system built into Haskell's typeclass system which allows you to choose between multiple intances. Instead you have to jump through hoops by wrapping and unwrapping them in skeleton types. Now whether or not you consider this a problem is a large part of what determines whether you prefer Haskell or ML.
ML gets around this by allowing multiple "instances" of the same class and type to be defined in different modules. Then, which module you import determines which "instance" you use. (Strictly speaking, ML doesn't have classes and instances, but it does have signatures and structures, which can act almost the same. For amore in depth comparison, read this paper).

Is my understanding of monoid valid?

So, I'm learning Haskell at the moment, and I would like to confirm or debunk my understanding of monoid.
What I figured out from reading CIS194 course is that monoid is basically "API" for defining custom binary operation on custom set.
Than I went to inform my self some more and I stumbled upon massive ammount of very confusing tutorials trying to clarify the thing, so I'm not so sure anymore.
I have decent mathematical background, but I just got confused from all the metaphors and am looking for clear yes/no answer to my understanding of monoid.
From Wikipedia:
In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.
I think your understanding is correct. From a programming perspective, Monoid is an interface with two "methods" that must be implemented.
The only piece that seems to be missing from your description is the "identity", without which you are describing a Semigroup.
Anything that has a "zero" or an "empty" and a way of combining two values can be a Monoid. One thing to note is that it may be possible for a set/type to be made a Monoid in more than one way, for example numbers via addition with identity 0, or multiplication with identity 1.
from Wolfram:
A monoid is a set that is closed under an associative binary operation and has an identity element I in S such that for all a in S, Ia=aI=a.
from Wiki:
In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.
so your intuition is more or less right.
You should only keep in mind that it's not defined for a "custom set" in Haskell but a type. The distinction is small (because types in type theory are very similar to sets in set theory) but the types for which you can define a Monoid instance need not be types that represent mathematical sets.
In other words: a type describes the set of all values that are of that type. Monoid is an "interface" that states that any type that claims to adhere to that interface must provide an identity value, a binary operation combining two values of that type, and there are some equations these should satisfy in order for all generic Monoid operations to work as intended (such as the generic summation of a list of monoid values) and not produce illogical/inconsistent results.
Also, note that the existence of an identity element in that set (type) is required for a type to be an instance of the Monoid class.
For example, natural numbers form a Monoid under both addition (identity = 0):
0 + n = n
n + 0 = n
as well as multiplication (identity = 1):
1 * n = n
n * 1 = n
also lists form a monoid under ++ (identity = []):
[] ++ xs = xs
xs ++ [] = xs
also, functions of type a -> a form a monoid under composition (identity = id)
id . f = f
f . id = f
so it's important to keep in mind that Monoid isn't about types that represents sets but about types when viewed as sets, so to say.
as an example of a malconstructed Monoid instance, consider:
import Data.Monoid
newtype MyInt = MyInt Int deriving Show
instance Monoid MyInt where
mempty = MyInt 0
mappend (MyInt a) (MyInt b) = MyInt (a * b)
if you now try to mconcat a list of MyInt values, you'll always get MyInt 0 as the result because the identity value 0 and binary operation * don't play well together:
λ> mconcat [MyInt 1, MyInt 2]
MyInt 0
At a basic level you're right - it's just an API for a binary operator we denote by <>.
However, the value of the monoid concept is in its relationship to other types and classes. Culturally we've decided that <> is the natural way of joining/appending two things of the same type together.
Consider this example:
{-# LANGUAGE OverloadedStrings #-}
import Data.Monoid
greet x = "Hello, " <> x
The function greet is extremely polymorphic - x can be a String, ByteString or Text just to name a few possibilities. Moreover, in each of these cases it does basically what you expect it to - it appends x to the string `"Hello, ".
Additionally, there are lots of algorithms which will work on anything that can be accumulated, and those are good candidates for generalization to a Monoid. For example consider the foldMap function from the Foldable class:
foldMap :: Monoid m => (a -> m) -> t a -> m
Not only does foldMap generalize the idea of folding over a structure, but I can generalize how the accumulation is performed by substituting the right Monoid instance.
If I have a foldable structure t containing Ints, I can use foldMap with the Sum monoid to get the sum of the Ints, or with Product to get the product, etc.
Finally, using <> affords convenience. For instance, there is an abundance of different Set implementations, but for all of them s <> t is always the union of two sets s and t (of the same type). This enables me to write code which is agnostic of the underlying implementation of the set thereby simplifying my code. The same can be said for a lot of other data structures, e.g. sequences, trees, maps, priority queues, etc.

Haskell: do standard libraries assume Eq and Ord are compatible?

This is a followup question to Inconsistent Eq and Ord instances?.
The question there is essentially: when declaring Eq and Ord instances for a type, must one ensure that compare x y returns EQ if and only if x == y returns True? Is it dangerous to create instances that break this assumption? It seems like a natural law one might assume, but it doesn’t appear to be explicitly stated in the Prelude, unlike e.g. the monad or functor laws.
The basic response was: it is a bit dangerous to do this, since libraries may assume that this law holds.
My question, now, is: do any of the standard libraries (in particular, Set or Map) make this assumption? Is it dangerous to have a type with incompatible Eq and Ord, so long as I am only relying on the standard libraries supplied with GHC? (If big-list questions were still acceptable, I would be asking: which commonly used libraries assume this law?)
Edit. My use-case is similar to that of the original question. I have a type with a custom instance of Eq, that I use quite a bit. The only reason I want Ord is so that I can use it as the domain of a Map; I don’t care about the specific order, and will never use it explicitly in code. So if I can use the derived instance of Ord, then my life will be easier and my code clearer.
The definition of Ord itself in the standard prelude requires there already be an Eq instance:
class (Eq a) => Ord a where
...
So it would be just as wrong to violate
x == y = compare x y == EQ
x /= y = compare x y /= EQ
As it would be to violate (from the default definitions for these operators in Ord).
x <= y = compare x y /= GT
x < y = compare x y == LT
x >= y = compare x y /= LT
x > y = compare x y == GT
Edit: Use in libraries
I would be quite surprised if standard libraries didn't make use of Ord's == and /= operators. The specific purpose operators (==, /=, <=, <, >=, >) are frequently more convenient than compare, so I'd expect to see them used in code for maps or filters.
You can see == being used in guards on keys in Data.Map in fromAscListWithKey. This specific function only calls out for the Eq class, but if the key is also an Ord instance, Ord's compare will be used for other functions of the resulting Map, which is an assumption that Eq's == is the same as Ord's compare and testing for EQ.
As a library programmer, I wouldn't be surprised if any of the special purpose operators outperformed compare for the specific purpose. After all, that's why they are part of the Eq and Ord classes instead of being defined as polymorphic for all Eq or Ord instances. I might make a point of using them even when compare is more convenient. If I did, I'd probably define something like:
compareOp :: (Ord a) => Ordering -> Bool -> a -> a -> Bool
compareOp EQ True = (==)
compareOp EQ False = (/=)
compareOp LT True = (<)
compareOp LT False = (>=)
compareOp GT True = (>)
compareOp GT False = (<=)
To extend Cirdec's answer, typeclass instances should only be made if the operation being defined is somehow canonical. If there is a reasonable Eq which doesn't extend to a reasonable Ord, then it's best practice to pick either the other Eq or to not define an Ord. It's easy enough to create a non-polymorphic function for the "other" equality.
A great example of this tension is the potential Monoid instance
instance Monoid Int where
mzero = 0
mappend = (+)
which contests with the other "obvious" Monoid instance
instance Monoid Int where
mzero = 1
mappend = (*)
In this case the chosen path was to instantiate neither because it's not clear that one is "canonical" over the other. This typically conforms best to a user's expectation and which prevent bugs.
I've read through this and your original question, so I will address your general problem....
You want this-
Map BigThing OtherType
and this-
(==)::BigThing->BigThing->Bool
One of these cases has to be exact, the other case should ignore some of its data, for performance reasons. (it was (==) that needed to be exact in the first question, but it looks like you might be addressing the reverse in this question.... Same answer either way).
For instance, you want the map to only store the result based on some label, like a
`name::BigThing->String`
but (==) should do a deep compare. One way to do this would be to define incompatible compare and (==) functions. However....
in this case, this is unnecessary. Why not just instead use the map
Map String OtherThing
and do a lookup like this-
lookup (name obj) theMap
It is pretty rare to index directly on very large document data....

What laws are the standard Haskell type classes expected to uphold?

It's well-known that Monad instances ought to follow the Monad laws. It's perhaps less well-known that Functor instances ought to follow the Functor laws. Nevertheless, I would feel fairly confident writing a GHC rewrite rule that optimizes fmap id == id.
What other standard classes have implicit laws? Does (==) have to be a true equivalence relation? Does Ord have to form a partial order? A total order? Can we at least assume it's transitive? Anti-symmetric?
These last few don't appear to be specified in the Haskell 2010 report nor would I feel confident writing rewrite rules taking advantage of them. Are there any common libraries which do, however? How pathological an instance can one write confidently?
Finally, assuming that there is a boundary for how pathological such an instance can be is there a standard, comprehensive resource for the laws that each type class instance must uphold?
As an example, how much trouble would I be in to define
newtype Doh = Doh Bool
instance Eq Doh where a == (Doh b) = b
is it merely hard to understand or will the compiler optimize incorrectly anywhere?
The Haskell report mentions laws for:
Functor (e.g. fmap id == id)
Monad (e.g. m >>= return == m)
Integral (e.g. (x ‘quot‘ y)*y + (x ‘rem‘ y) == x)
Num (abs x * signum x == x)
Show (showsPrec d x r ++ s == showsPrec d x (r ++ s))
Ix (e.g. inRange (l,u) i == elem i (range (l,u)))
That is all I could find. Specifically, the section about Eq (6.3.1) mentions no laws, neither does the next one about Ord.
My own view of what the laws "ought to be" is not upheld by all standard instances, but I think
Eq should be an equivalence relation.
Ord should be a total order
Num should be a ring, with fromInteger a ring homomorphism, and abs/signum behaving in the obvious ways.
Much code will assume these "laws" to hold even though they don't. This is not a Haskell specific problem, early C allowed compiler to reorder arithmetic according to algebraic laws, and most compilers have an option to do reenable such optimization even though they are not permitted by the current standard and may change your programs results.
It used to be that breaking the Ix laws would let you do just about anything. These days I think they've fixed that. More info here: Does anyone know (or remember) how breaking class laws could cause problems in GHC?

What is this abstract data type called?

I'm writing Haskell, but this could be applied to any OO or functional language with a concept of ADT. I'll give the template in Haskell, ignoring the fact that the arithmetic operators are already taken:
class Thing a where
(+) :: a -> a -> a
(-) :: a -> a -> a
x - y = x + negate y
(*) :: (RealFrac b) => a -> b -> a
negate :: a -> a
negate x = x * (-1)
Basically these are things that can be added and subtracted and also multiplied by real fractional values. One example might be a simple list of numbers: addition and subtraction are pairwise (in Haskell, "(+) = zipWith (+)"), and multiplication by a real multiplies every item in the list by the same amount. I've come across enough other examples to want to define it as a class, but I don't know exactly what to call it.
In Haskell its usually a monoid provided there is some kind of zero value.
Is this some known kind of object in the zoo of algebraic types? I've looked through rings, semirings, nearsemirings, groups etc without finding it.
This is a vector space: http://en.wikipedia.org/wiki/Vector_space. You have addition and scalar multiplication.

Resources