In Haskell how can I override the (==) and (/=) operators for a type class? - haskell

Say I have something like this
class Circle c where
x :: c -> Float
y :: c -> Float
radius :: c -> Float
data Location = Location { locationX :: Float
, locationY :: Float
} deriving (Show, Eq)
data Blob = Location { blobX :: Float
, blobY :: Float
, blobRadius :: Float,
, blobRating :: Int
} deriving (Show, Eq)
instance Circle Location where
x = locationX
y = locationY
radius = pure 0
instance Circle Blob where
x = blobX
y = blobY
radius = blobRadius
Say for example I want Circle types to be equal if their x and y points are equal. How can I compare instances of the type class with the (==) and (/=) operators. I know I can do something like this, but is it possible to overload the operators?
equal :: Circle a => Circle b => a -> b -> Bool
equal a b = (x a == x b && y a == y b)
I want to be able to compare with
(Location 5.0 5.0) == (Blob 5.0 5.0 ... ) should give me True

Zeroth, some standard imports:
import Data.Function (on)
import Control.Arrow ((&&&))
First, this is not a good idea. a==b should only be true if a and b are (for all purposes relevant to the user) interchangeable – that's clearly not the case for two circles which merely happen to share the same center point!
Second, it's probably not a good idea to make Circle a typeclass in the first place. A typeclass only makes sense when you want to abstract over something that can't directly be expressed with just a parameter. But if you just want to attach different “payloads” to points in space, a more sensible approach might be to define something like
data Located a = Located {x,y :: ℝ, payload :: a}
If, as seems to be the case, you actually want to allow different instances of Circle to coexist and be comparable at runtime, then a typeclass is entirely the wrong choice. That would be an OO class. Haskell doesn't have any built-in notion of those, but you could just use
data Blob = Blob
{ x,y :: ℝ
, radius :: ℝ
, rating :: Maybe Int }
and no other types.
https://lukepalmer.wordpress.com/2010/01/24/haskell-antipattern-existential-typeclass/
Third, the instance that you asked for can, theoretically speaking, be defined as
instance (Circle a) => Eq a where
(==) = (==)`on`(x &&& y)
But this would be a truely horrible idea. It would be a catch-all instance: whenever you compare anything, the compiler would check “is it of the form a?” (literally anything is of that form) “oh great, then said instance tells me how to compare this.” Only later would it look at the Circle requirement.
The correct solution is to not define any such Eq instance at all. Your types already have Eq instances individually, that should generally be the right thing to use – no need to express it through the Circle class at all, just give any function which needs to do such comparisons the constraint (Circle a, Eq a) => ....
Of course, these instances would then not just compare the location but the entire data, which, as I said, is a good thing. If you actually want to compare only part of the structure, well, make that explicit! Use not == itself, but extract the relevant parts and compare those. A useful helper for this could be
location :: Circle a => a -> Location
location c = Location (x c) (y c)
...then you can, for any Circle type, simply write (==)`on`location instead of (==), to disregard any other information except the location. Or write out (==)`on`(x &&& y) directly, which can easily be tweaked to other situations.

Two circles that share a common center aren't necessarily equal, but they are concentric; that's what you should write a function to check.
concentric :: (Circle a, Circle b) => a -> b -> Bool
concentric c1 c2 = x c1 == x c2 && y c1 == y c2

Related

Is it bad form to make new types/datas for clarity? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
I would like to know if it is bad form to do something like this:
data Alignment = LeftAl | CenterAl | RightAl
type Delimiter = Char
type Width = Int
setW :: Width -> Alignment -> Delimiter -> String -> String
Rather than something like this:
setW :: Int -> Char -> Char -> String -> String
I do know that remaking those types effectively does nothing but take up a few lines in exchange for clearer code. However, if I use the type Delimiter for multiple functions, this would be much clearer to someone importing this module, or reading the code later.
I am relatively new to Haskell so I do not know what is good practice for this type of stuff. If this is not a good idea, or there is something that would improve clarity that is preferred, what would that be?
You're using type aliases, they only slightly help with code readability. However, it's better to use newtype instead of type for better type-safety. Like this:
data Alignment = LeftAl | CenterAl | RightAl
newtype Delimiter = Delimiter { unDelimiter :: Char }
newtype Width = Width { unWidth :: Int }
setW :: Width -> Alignment -> Delimiter -> String -> String
You will deal with extra wrapping and unwrapping of newtype. But the code will be more robust against further refactorings. This style guide suggests to use type only for specializing polymorphic types.
I wouldn't consider that bad form, but clearly, I don't speak for the Haskell community at large. The language feature exists, as far as I can tell, for that particular purpose: to make the code easier to read.
One can find examples of the use of type aliases in various 'core' libraries. For example, the Read class defines this method:
readList :: ReadS [a]
The ReadS type is just a type alias
type ReadS a = String -> [(a, String)]
Another example is the Forest type in Data.Tree:
type Forest a = [Tree a]
As Shersh points out, you can also wrap new types in newtype declarations. That's often useful if you need to somehow constrain the original type in some way (e.g. with smart constructors) or if you want to add functionality to a type without creating orphan instances (a typical example is to define QuickCheck Arbitrary instances to types that don't otherwise come with such an instance).
Using newtype—which creates a new type with the same representation as the underlying type but not substitutable with it— is considered good form. It's a cheap way to avoid primitive obsession, and it's especially useful for Haskell because in Haskell the names of function arguments are not visible in the signature.
Newtypes can also be a place on which to hang useful typeclass instances.
Given that newtypes are ubiquitous in Haskell, over time the language has gained some tools and idioms to manipulate them:
Coercible A "magical" typeclass that simplifies conversions between newtypes and their underlying types, when the newtype constructor is in scope. Often useful to avoid boilerplate in function implementations.
ghci> coerce (Sum (5::Int)) :: Int
ghci> coerce [Sum (5::Int)] :: [Int]
ghci> coerce ((+) :: Int -> Int -> Int) :: Identity Int -> Identity Int -> Identity Int
ala. An idiom (implemented in various packages) that simplifies the selection of a newtype that we might want to use with functions like foldMap.
ala Sum foldMap [1,2,3,4 :: Int] :: Int
GeneralizedNewtypeDeriving. An extension for auto-deriving instances for your newtype based on instances available in the underlying type.
DerivingVia A more general extension, for auto-deriving instances for your newtype based on instances available in some other newtype with the same underlying type.
One important thing to note is that Alignment versus Char is not just a matter of clarity, but one of correctness. Your Alignment type expresses the fact that there are only three valid alignments, as opposed to however many inhabitants Char has. By using it, you avoid trouble with invalid values and operations, and also enable GHC to informatively tell you about incomplete pattern matches if warnings are turned on.
As for the synonyms, opinions vary. Personally, I feel type synonyms for small types like Int can increase cognitive load, by making you track different names for what is rigorously the same thing. That said, leftaroundabout makes a great point in that this kind of synonym can be useful in the early stages of prototyping a solution, when you don't necessarily want to worry about the details of the concrete representation you are going to adopt for your domain objects.
(It is worth mentioning that the remarks here about type largely don't apply to newtype. The use cases are different, though: while type merely introduces a different name for the same thing, newtype introduces a different thing by fiat. That can be a surprisingly powerful move -- see danidiaz's answer for further discussion.)
Definitely is good, and here is another example, supose you have this data type with some op:
data Form = Square Int | Rectangle Int Int | EqTriangle Int
perimeter :: Form -> Int
perimeter (Square s) = s * 4
perimeter (Rectangle b h) = (b * h) * 2
perimeter (EqTriangle s) = s * 3
area :: Form -> Int
area (Square s) = s ^ 2
area (Rectangle b h) = (b * h)
area (EqTriangle s) = (s ^ 2) `div` 2
Now imagine you add the circle:
data Form = Square Int | Rectangle Int Int | EqTriangle Int | Cicle Int
add its operations:
perimeter (Cicle r ) = pi * 2 * r
area (Cicle r) = pi * r ^ 2
it is not very good right? Now I want to use Float... I have to change every Int for Float
data Form = Square Double | Rectangle Double Double | EqTriangle Double | Cicle Double
area :: Form -> Double
perimeter :: Form -> Double
but, what if, for clarity and even for reuse, I use type?
data Form = Square Side | Rectangle Side Side | EqTriangle Side | Cicle Radius
type Distance = Int
type Side = Distance
type Radius = Distance
type Area = Distance
perimeter :: Form -> Distance
perimeter (Square s) = s * 4
perimeter (Rectangle b h) = (b * h) * 2
perimeter (EqTriangle s) = s * 3
perimeter (Cicle r ) = pi * 2 * r
area :: Form -> Area
area (Square s) = s * s
area (Rectangle b h) = (b * h)
area (EqTriangle s) = (s * 2) / 2
area (Cicle r) = pi * r * r
That allows me to change the type only changing one line in the code, supose I want the Distance to be in Int, I will only change that
perimeter :: Form -> Distance
...
totalDistance :: [Form] -> Distance
totalDistance = foldr (\x rs -> perimeter x + rs) 0
I want the Distance to be in Float, so I just change:
type Distance = Float
If I want to change it to Int, I have to make some adjustments in the functions, but thats other issue.

Iterating over custom data types in Haskell

I have a custom data type that looks like this:
data Circle = Circle
{ radius :: Float
, xPosition :: Float
, yPosition :: Float
}
I want to be able to write a scale function that can take a given circle and change its size like this:
aCircle = Circle 1.5 1 1
scaleFn aCircle 10
The desired output for this example with scale of 10 would be:
Circle 15 10 10
How can I create a function where I can iterate over each field and multiple the values by a constant? In my actual use case I need a way to map over all the fields as there are many of them.
Scaling by a factor is generally a vector space operation. You could do the following:
{-# LANGUAGE TypeFamilies, DeriveGeneric #-}
import Data.VectorSpace
import GHC.Generics (Generic)
data Circle = Circle
{ radius :: Float
, xPosition :: Float
, yPosition :: Float
} deriving (Generic, Show)
instance AdditiveGroup Circle
instance VectorSpace Circle where
type Scalar Circle = Float
main = print $ Circle 1.5 1 1 ^* 10
(result: Circle {radius = 15.0, xPosition = 10.0, yPosition = 10.0}).
(requires vector-space >= 0.11, which has just added support for generic-derived instances.)
However I should remark that Circle as such is not really a good VectorSpace instance: adding two circles doesn't make any sense, and scaling by a negative factor gives a bogus radius. Only define such an instance if your real use case follows the actual vector space axioms.
What you really want for a type like Circle is something like diagrams' Transformable class. But I don't think there's any automatic way to derive an instance for that. In fact, since diagrams has – unfortunately IMO – switched from vector-space to linear, something like this has become considerably tougher to do even in principle.
You can use "scrap your boilerplate":
import Data.Generics
data Circle = Circle
{ radius :: Float
, xPosition :: Float
, yPosition :: Float
}
deriving (Show, Data)
circleModify :: (Float -> Float) -> Circle -> Circle
circleModify f = gmapT (mkT f)
Intuitively, above, mkT f transforms f into a function which is applicable to any type: if the argument of mkT f is a Float, then f is applied, otherwise the argument is returned as it is.
The newly constructed general function is called a "transformation": the T in mkT stands for that.
Then, gmapT applies the transformation mkT f to all the fields of the circle. Note that is a field contained, say, (Float, Bool) that float would be unaffected. Use everywhere instead of gmapT to recursively go deeper.
Note that I'm not a big fan of this approach. If for any reason you change the type of a field, that change will not trigger a type error but gmapT (mkT ...) will now simply skip over that field.
Generic programming can be convenient, but sometimes a bit too much, in that type errors can be silently transformed into unexpected results at runtime. Use with care.

Haskell Eq between different types

I need to use this data structure data D = C Int Float and I need to compare it with an Int, for example a::D == 2.
How can I create an instance to define this kind of Eq ?
Thank you!
I would implement a projection:
getInt :: D -> Int
getInt (C i _) = i
and then compare with it:
getInt myD == 5
you can even include this into a record:
data D = C { getInt :: Int, getFloat :: Float }
if you like
You can't; == has signature a -> a -> Bool, so it can't be used like this.
Using the convertible package you can define
(~==) :: (Convertible a b, Eq b) => a -> b -> Bool
x ~== y = case safeConvert x of
Right x' -> x' == y
Left _ -> False
(==~) :: (Convertible b a, Eq a) => a -> b -> Bool
(==~) = flip (~==)
instance Convertible Int D where ...
-- or D Int depending on what you have in mind
Without depending on convertible, you could just define a conversion function either from Int to D (and write a == fromInt 2) or vice versa.
A less recommended route (for this specific case I think it's simply worse than the first solution) would be to define your own type class, e.g.
class Eq' a b where
(=~=) :: a -> b -> Bool
instance Eq a => Eq' a a where
x =~= y = x == y
instance Eq' D Int where ...
etc.
This can be done, but I doubt you will want to do this. As Alexey mentioned the type of (==) is Eq a=>a->a->Bool, so the only way to make this work would be to make 2 have type D. This may seem absurd at first, but in fact numbers can be made to have any type you want, as long as that type is an instance of Num
instance Num D where
fromInteger x = C x 1.0
There are still many things to work out, though....
First, you need to fully implement all the functions in Num, including (+), (*), abs, signum, fromInteger, and (negate | (-)).
Ugh!
Second, you have that extra Float to fill in in fromInteger. I chose the value 1.0 above, but that was arbitrary.
Third, you need to actually make D an instance of Eq also, to fill in the actual (==).
instance Eq D where
(C x _) == (C y _) = x == y
Note that this is also pretty arbitrary, as I needed to ignore the Float values to get (==) to do what you want it to.
Bottom line is, this would do what you want it to do, but at the cost of abusing the Num type, and the Eq type pretty badly.... The Num type should be reserved for things that you actually would think of as a number, and the Eq type should be reserved for a comparison of two full objects, each part included.

Why context is not considered when selecting typeclass instance in Haskell?

I understand that when having
instance (Foo a) => Bar a
instance (Xyy a) => Bar a
GHC doesn't consider the contexts, and the instances are reported as duplicate.
What is counterintuitive, that (I guess) after selecting an instance, it still needs to check if the context matches, and if not, discard the instance. So why not reverse the order, and discard instances with non-matching contexts, and proceed with the remaining set.
Would this be intractable in some way? I see how it could cause more constraint resolution work upfront, but just as there is UndecidableInstances / IncoherentInstances, couldn't there be a ConsiderInstanceContexts when "I know what I am doing"?
This breaks the open-world assumption. Assume:
class B1 a
class B2 a
class T a
If we allow constraints to disambiguate instances, we may write
instance B1 a => T a
instance B2 a => T a
And may write
instance B1 Int
Now, if I have
f :: T a => a
Then f :: Int works. But, the open world assumption says that, once something works, adding more instances cannot break it. Our new system doesn't obey:
instance B2 Int
will make f :: Int ambiguous. Which implementation of T should be used?
Another way to state this is that you've broken coherence. For typeclasses to be coherent means that there is only one way to satisfy a given constraint. In normal Haskell, a constraint c has only one implementation. Even with overlapping instances, coherence generally holds true. The idea is that instance T a and instance {-# OVERLAPPING #-} T Int do not break coherence, because GHC can't be tricked into using the former instance in a place where the latter would do. (You can trick it with orphans, but you shouldn't.) Coherence, at least to me, seems somewhat desirable. Typeclass usage is "hidden", in some sense, and it makes sense to enforce that it be unambiguous. You can also break coherence with IncoherentInstances and/or unsafeCoerce, but, y'know.
In a category theoretic way, the category Constraint is thin: there is at most one instance/arrow from one Constraint to another. We first construct two arrows a : () => B1 Int and b : () => B2 Int, and then we break thinness by adding new arrows x_Int : B1 Int => T Int, y_Int : B2 Int => T Int such that x_Int . a and y_Int . b are both arrows () => T Int that are not identical. Diamond problem, anyone?
This does not answer you question as to why this is the case. Note, however, that you can always define a newtype wrapper to disambiguate between the two instances:
newtype FooWrapper a = FooWrapper a
newtype XyyWrapper a = XyyWrapper a
instance (Foo a) => Bar (FooWrapper a)
instance (Xyy a) => Bar (XyyWrapper a)
This has the added advantage that by passing around either a FooWrapper or a XyyWrapper you explicitly control which of the two instances you'd like to use if your a happens to satisfy both.
Classes are a bit weird. The original idea (which still pretty much works) is a sort of syntactic sugar around what would otherwise be data statements. For example you can imagine:
data Num a = Num {plus :: a -> a -> a, ... , fromInt :: Integer -> a}
numInteger :: Num Integer
numInteger = Num (+) ... id
then you can write functions which have e.g. type:
test :: Num x -> x -> x -> x -> x
test lib a b c = a + b * (abs (c + b))
where (+) = plus lib
(*) = times lib
abs = absoluteValue lib
So the idea is "we're going to automatically derive all of this library code." The question is, how do we find the library that we want? It's easy if we have a library of type Num Int, but how do we extend it to "constrained instances" based on functions of type:
fooLib :: Foo x -> Bar x
xyyLib :: Xyy x -> Bar x
The present solution in Haskell is to do a type-pattern-match on the output-types of those functions and propagate the inputs to the resulting declaration. But when there's two outputs of the same type, we would need a combinator which merges these into:
eitherLib :: Either (Foo x) (Xyy x) -> Bar x
and basically the problem is that there is no good constraint-combinator of this kind right now. That's your objection.
Well, that's true, but there are ways to achieve something morally similar in practice. Suppose we define some functions with types:
data F
data X
foobar'lib :: Foo x -> Bar' x F
xyybar'lib :: Xyy x -> Bar' x X
bar'barlib :: Bar' x y -> Bar x
Clearly the y is a sort of "phantom type" threaded through all of this, but it remains powerful because given that we want a Bar x we will propagate the need for a Bar' x y and given the need for the Bar' x y we will generate either a Bar' x X or a Bar' x y. So with phantom types and multi-parameter type classes, we get the result we want.
More info: https://www.haskell.org/haskellwiki/GHC/AdvancedOverlap
Adding backtracking would make instance resolution require exponential time, in the worst case.
Essentially, instances become logical statements of the form
P(x) => R(f(x)) /\ Q(x) => R(f(x))
which is equivalent to
(P(x) \/ Q(x)) => R(f(x))
Computationally, the cost of this check is (in the worst case)
c_R(n) = c_P(n-1) + c_Q(n-1)
assuming P and Q have similar costs
c_R(n) = 2 * c_PQ(n-1)
which leads to exponential growth.
To avoid this issue, it is important to have fast ways to choose a branch, i.e. to have clauses of the form
((fastP(x) /\ P(x)) \/ (fastQ(x) /\ Q(x))) => R(f(x))
where fastP and fastQ are computable in constant time, and are incompatible so that at most one branch needs to be visited.
Haskell decided that this "fast check" is head compatibility (hence disregarding contexts). It could use other fast checks, of course -- it's a design decision.

How to have an operator which adds/subtracts both absolute and relative values, in Haskell

(Apologies for the weird title, but I could not think of a better one.)
For a personal Haskell project I want to have the concepts of 'absolute values' (like a frequency) and relative values (like the ratio between two frequencies). In my context, it makes no sense to add two absolute values: one can add relative values to produce new relative values, and add a relative value to an absolute one to produce a new absolute value (and likewise for subtraction).
I've defined type classes for these: see below. However, note that the operators ##+ and #+ have a similar structure (and likewise for ##- and #-). Therefore I would prefer to merge these operators, so that I have a single addition operator, which adds a relative value (and likewise a single subtraction operator, which results in a relative value). UPDATE: To clarify, my goal is to unify my ##+ and #+ into a single operator. My goal is not to unify this with the existing (Num) + operator.
However, I don't see how to do this with type classes.
Question: Can this be done, and if so, how? Or should I not be trying?
The following is what I currently have:
{-# LANGUAGE MultiParamTypeClasses #-}
class Abs a where
nullPoint :: a
class Rel r where
zero :: r
(##+) :: r -> r -> r
neg :: r -> r
(##-) :: Rel r => r -> r -> r
r ##- s = r ##+ neg s
class (Abs a, Rel r) => AbsRel a r where
(#+) :: a -> r -> a
(#-) :: a -> a -> r
I think you're looking for a concept called a Torsor. A torsor consists of set of values, set of differences, and operator which adds a difference to a value. Additionally, the set of differences must form an additive group, so differences also can be added together.
Interestingly, torsors are everywhere. Common examples include
Points and Vectors
Dates and date-differences
Files and diffs
etc.
One possible Haskell definition is:
class Torsor a where
type TorsorOf a :: *
(.-) :: a -> a -> TorsorOf a
(.+) :: a -> TorsorOf a -> a
Here are few example instances:
instance Torsor UTCTime where
type TorsorOf UTCTime = NominalDiffTime
a .- b = diffUTCTime a b
a .+ b = addUTCTime b a
instance Torsor Double where
type TorsorOf Double = Double
a .- b = a - b
a .+ b = a + b
instance Torsor Int where
type TorsorOf Int = Int
a .- b = a - b
a .+ b = a + b
In the last case, notice that the two sets of the torsors don't need to be a different set, which makes adding your relative values together simple.
For more information, see a much nicer description in Roman Cheplyakas blog
I don't think you should be trying to unify these operators. Subtracting two vectors and subtracting two points are fundamentally different operations. The fact that it's difficult to represent them as the same thing in the type system is not the type system being awkward - it's because these two concepts really are different things!
The mathematical framework behind what you're working with is the affine space.
These are already available in Haskell in the vector-space package (do cabal install vector-space at the command prompt). Rather than using multi parameter type classes, they use type families to associate a vector (relative) type with each point (absolute) type.
Here's a minimal example showing how to define your own absolute and relative data types, and their interaction:
{-# LANGUAGE TypeFamilies #-}
import Data.VectorSpace
import Data.AffineSpace
data Point = Point { px :: Float, py :: Float }
data Vec = Vec { vx :: Float, vy :: Float }
instance AdditiveGroup Vec where
zeroV = Vec 0 0
negateV (Vec x y) = Vec (-x) (-y)
Vec x y ^+^ Vec x' y' = Vec (x+x') (y+y')
instance AffineSpace Point where
type Diff Point = Vec
Point x y .-. Point x' y' = Vec (x-x') (y-y')
Point x y .+^ Vec x' y' = Point (x+x') (y+y')
You have two answers telling you what you should do, here's another answer telling you how to do what you asked for (which might not be a good idea). :)
class Add a b c | a b -> c where
(#+) :: a -> b -> c
instance Add AbsTime RelTime AbsTime where
(#+) = ...
instance Add RelTime RelTime RelTime where
(#+) = ...
The overloading for (#+) makes it very flexible. Too flexible, IMO. The only restraint is that the result type is determined by the argument types (without this FD the operator becomes almost unusable because it constrains nothing).

Resources