Expand Haskell datatypes - haskell

Is it possible to expand data types with new values?
E.g.:
The following compiles:
data Axes2D = X | Y
data Axes3D = Axes2D | Z
But, the following:
data Axes2D = X | Y deriving (Show, Eq)
data Axes3D = Axes2D | Z deriving (Show, Eq)
type Point2D = (Int, Int)
type Point3D = (Int, Int, Int)
move_along_axis_2D :: Point2D -> Axes2D -> Int -> Point2D
move_along_axis_2D (x, y) axis move | axis == X = (x + move, y)
                                    | otherwise = (x, y + move)
move_along_axis_3D :: Point3D -> Axes3D -> Int -> Point3D
move_along_axis_3D (x, y, z) axis move | axis == X = (x + move, y, z)
                                       | axis == y = (x, y + move, z)
                                       | otherwise = (x, y, z + move)
gives the following compile error (with move_along_axis_3D commented out, there are no errors):
Prelude> :l expandTypes_test.hs
[1 of 1] Compiling Main ( expandTypes_test.hs, interpreted )
expandTypes_test.hs:12:50:
Couldn't match expected type `Axes3D' with actual type `Axes2D'
In the second argument of `(==)', namely `X'
In the expression: axis == X
In a stmt of a pattern guard for
an equation for `move_along_axis_3D':
axis == X
Failed, modules loaded: none.
So is it possible to make X and Y of type Axes2D as well as of type Axes3D?
If it is possible: what am I doing wrong? Else: why is it not possible?

Along with what Daniel Fischer said, to expand on why this is not possible: the problems with the kind of subtyping you want run deeper than just naming ambiguity; they make type inference a lot more difficult in general. I think Scala's type inference is a lot more restricted and local than Haskell's for this reason.
However, you can model this kind of thing with the type-class system:
class (Eq t) => HasAxes2D t where
  axisX :: t
  axisY :: t
class (HasAxes2D t) => HasAxes3D t where
  axisZ :: t
data Axes2D = X | Y deriving (Eq, Show)
data Axes3D = TwoD Axes2D | Z deriving (Eq, Show)
instance HasAxes2D Axes2D where
  axisX = X
  axisY = Y
instance HasAxes2D Axes3D where
  axisX = TwoD X
  axisY = TwoD Y
instance HasAxes3D Axes3D where
  axisZ = Z
You can then use guards to "pattern-match" on these values:
displayAxis :: (HasAxes2D t) => t -> String
displayAxis axis
  | axis == axisX = "X"
  | axis == axisY = "Y"
  | otherwise = "Unknown"
This has many of the same drawbacks as subtyping would have: uses of axisX, axisY and axisZ will have a tendency to become ambiguous, requiring type annotations that defeat the point of the exercise. It's also a fair bit uglier to write type signatures with these type-class constraints, compared to using concrete types.
There's another downside: with the concrete types, when you write a function taking an Axes2D, once you handle X and Y you know that you've covered all possible values. With the type-class solution, there's nothing stopping you from passing Z to a function expecting an instance of HasAxes2D. What you really want is for the relation to go the other way around, so that you could pass X and Y to functions expecting a 3D axis, but couldn't pass Z to functions expecting a 2D axis. I don't think there's any way to model that correctly with Haskell's type-class system.
This technique is occasionally useful — for instance, binding an OOP library like a GUI toolkit to Haskell — but generally, it's more natural to use concrete types and explicitly favour what in OOP terms would be called composition over inheritance, i.e. explicitly wrapping "subtypes" in a constructor. It's not generally much of a bother to handle the constructor wrapping/unwrapping, and it's more flexible besides.
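As a rough illustration of the "explicit wrapping" approach, here is a minimal sketch of the asker's two functions rewritten with pattern matching. The camel-case function names are hypothetical; the `TwoD` constructor is the one from the declarations above.

```haskell
-- Sketch only: the 3D axis type wraps the 2D one in a constructor.
data Axes2D = X | Y deriving (Show, Eq)
data Axes3D = TwoD Axes2D | Z deriving (Show, Eq)

type Point2D = (Int, Int)
type Point3D = (Int, Int, Int)

-- Pattern matching covers every constructor, so no Eq guards are needed.
moveAlongAxis2D :: Point2D -> Axes2D -> Int -> Point2D
moveAlongAxis2D (x, y) X d = (x + d, y)
moveAlongAxis2D (x, y) Y d = (x, y + d)

-- The 3D version unwraps TwoD and reuses the 2D function.
moveAlongAxis3D :: Point3D -> Axes3D -> Int -> Point3D
moveAlongAxis3D (x, y, z) (TwoD axis) d =
  let (x', y') = moveAlongAxis2D (x, y) axis d
  in (x', y', z)
moveAlongAxis3D (x, y, z) Z d = (x, y, z + d)
```

The only cost at the call site is writing `TwoD X` instead of `X` when a 3D axis is expected.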

It is not possible. Note that in
data Axes2D = X | Y
data Axes3D = Axes2D | Z
the Axes2D in the Axes3D type is a value constructor taking no arguments, so Axes3D has two constructors, Axes2D and Z.
Different types cannot have value constructors with the same name (in the same scope) because that would make type inference impossible. What would
foo X = True
foo _ = False
have as a type? (It's a bit different with parametric types, all Maybe a have value constructors with the same name, and that works. But that's because Maybe takes a type parameter, and the names are shared only among types constructed with the same (unary) type constructor. It doesn't work for nullary type constructors.)
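To make the point about constructors concrete, here is a small sketch (the names `flat` and `xAxis` are hypothetical) showing that the original declarations compile only because types and constructors live in separate namespaces:

```haskell
data Axes2D = X | Y deriving (Show, Eq)
data Axes3D = Axes2D | Z deriving (Show, Eq)

-- Here Axes2D is a nullary constructor of Axes3D; it has nothing to do
-- with the type Axes2D declared on the first line.
flat :: Axes3D
flat = Axes2D

-- X is still only a value of type Axes2D, never of type Axes3D.
xAxis :: Axes2D
xAxis = X
```

Asking GHCi for `:t Axes2D` should report `Axes3D`, which is why `axis == X` fails to type-check when `axis` is an `Axes3D`.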

You can do it with Generalized Algebraic Data Types. We can create a generic (GADT) type whose data constructors constrain the type index. Then we can define specialized types (type aliases) that specify the full type and thus limit which constructors are allowed.
{-# LANGUAGE GADTs #-}
data Zero
data Succ a
data Axis a where
  X :: Axis (Succ a)
  Y :: Axis (Succ (Succ a))
  Z :: Axis (Succ (Succ (Succ a)))
type Axis2D = Axis (Succ (Succ Zero))
type Axis3D = Axis (Succ (Succ (Succ Zero)))
Now, you are guaranteed to only have X and Y passed into a function that is defined to take an argument of Axis2D. The constructor Z fails to match the type of Axis2D.
Unfortunately, an ordinary deriving clause does not work on a GADT like this one, so you will need to provide your own instances (or try a StandaloneDeriving declaration), such as:
instance Show (Axis a) where
  show X = "X"
  show Y = "Y"
  show Z = "Z"
instance Eq (Axis a) where
  X == X = True
  Y == Y = True
  Z == Z = True
  _ == _ = False
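A minimal usage sketch of the GADT encoding, repeating the declarations so that it stands alone. The functions `move2D` and `move3D` are hypothetical names, not part of the answer above.

```haskell
{-# LANGUAGE GADTs #-}

data Zero
data Succ a

data Axis a where
  X :: Axis (Succ a)
  Y :: Axis (Succ (Succ a))
  Z :: Axis (Succ (Succ (Succ a)))

type Axis2D = Axis (Succ (Succ Zero))
type Axis3D = Axis (Succ (Succ (Succ Zero)))

-- Only X and Y can have type Axis2D, so two equations cover every case.
move2D :: (Int, Int) -> Axis2D -> Int -> (Int, Int)
move2D (x, y) X d = (x + d, y)
move2D (x, y) Y d = (x, y + d)

-- X, Y and Z all fit Axis3D, so the same constructors are reused here.
move3D :: (Int, Int, Int) -> Axis3D -> Int -> (Int, Int, Int)
move3D (x, y, z) X d = (x + d, y, z)
move3D (x, y, z) Y d = (x, y + d, z)
move3D (x, y, z) Z d = (x, y, z + d)
```

A call such as `move2D (0, 0) Z 1` should be rejected by the type checker, because `Z` cannot be given the type `Axis2D`.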

Related

Is there a way to bind the suppressed type variable of an existential data type during pattern matching?

Using GADTs, I have defined a depth-indexed tree data type (2–3 tree). The depth is there to statically ensure that the trees are balanced.
-- Natural numbers
data Nat = Z | S Nat
-- Depth-indexed 2-3 tree
data DT :: Nat -> Type -> Type where
  -- Pattern of node names: N{#subtrees}_{#containedValues}
  N0_0 :: DT Z a
  N2_1 :: DT n a -> a -> DT n a
       -> DT (S n) a
  N3_2 :: DT n a -> a -> DT n a -> a -> DT n a
       -> DT (S n) a
deriving instance Eq a => Eq (DT n a)
Now, some operations (e.g. insertion) might or might not change the depth of the tree. So I want to hide it from the type signature. I do this using existential data types.
-- 2-3 tree
data T :: Type -> Type where
  T :: {unT :: DT n a} -> T a
insert :: a -> T a -> T a
insert x (T dt) = case dt of
  N0_0 -> T $ N2_1 N0_0 x N0_0
  {- ... -}
So far so good. My problem is:
I don't see how I can now define Eq on T.
instance Eq a => Eq (T a) where
  (T x) == (T y) = _what
Obviously, I would like to do something like this:
(T {n = nx} x) == (T {n = ny} y)
  | nx == ny = x == y
  | otherwise = False
I don't know how, or whether, I can bind the type variables in the pattern match. Nor am I sure how to compare them once I have them.
(I suspect Data.Type.Equality is for this, but I haven't seen any example of it in use.)
So, is there a way to implement the Eq (T a) instance, or is there some other approach that is recommended in this case?
You should write a depth-independent equality operator, which is able to compare two trees even if they have different depths n and m.
dtEq :: Eq a => DT n a -> DT m a -> Bool
dtEq N0_0 N0_0 = True
dtEq (N2_1 l1 x1 r1) (N2_1 l2 x2 r2) =
  dtEq l1 l2 && x1 == x2 && dtEq r1 r2
dtEq (N3_2 a1 x1 b1 y1 c1) (N3_2 a2 x2 b2 y2 c2) =
  dtEq a1 a2 && x1 == x2 && dtEq b1 b2 && y1 == y2 && dtEq c1 c2
dtEq _ _ = False
Then, for your existential type:
instance Eq a => Eq (T a) where
  (T x) == (T y) = dtEq x y
Even if in the last line the depths are unknown (because of the existential), it won't matter for dtEq since it can accept any depth.
Minor side note: dtEq exploits polymorphic recursion, in that recursive calls can use a different depth from the one in the original call. Haskell allows polymorphic recursion, as long as an explicit type signature is provided. (We need one anyway, since we are using GADTs.)
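For readers who want to try this, the pieces above can be assembled into one file roughly as follows. This is a sketch under assumptions: the extension list and the sample values `small` and `big` are mine, and the promoted constructors are written with ticks (`'Z`, `'S`).

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Kind (Type)

data Nat = Z | S Nat

-- Depth-indexed 2-3 tree, as in the question.
data DT :: Nat -> Type -> Type where
  N0_0 :: DT 'Z a
  N2_1 :: DT n a -> a -> DT n a -> DT ('S n) a
  N3_2 :: DT n a -> a -> DT n a -> a -> DT n a -> DT ('S n) a

-- Existential wrapper hiding the depth.
data T :: Type -> Type where
  T :: DT n a -> T a

-- Depth-independent equality: the two depths n and m need not agree.
dtEq :: Eq a => DT n a -> DT m a -> Bool
dtEq N0_0 N0_0 = True
dtEq (N2_1 l1 x1 r1) (N2_1 l2 x2 r2) =
  dtEq l1 l2 && x1 == x2 && dtEq r1 r2
dtEq (N3_2 a1 x1 b1 y1 c1) (N3_2 a2 x2 b2 y2 c2) =
  dtEq a1 a2 && x1 == x2 && dtEq b1 b2 && y1 == y2 && dtEq c1 c2
dtEq _ _ = False

instance Eq a => Eq (T a) where
  T x == T y = dtEq x y

-- Hypothetical sample trees of depth 1 and depth 2.
small, big :: T Int
small = T (N2_1 N0_0 1 N0_0)
big   = T (N2_1 (N2_1 N0_0 1 N0_0) 2 (N2_1 N0_0 3 N0_0))
```

Comparing `small` with `big` type-checks even though the hidden depths differ, and the fall-through equation makes the result `False`.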
You could use Data.Coerce.coerce to compare the contents of the trees: as long as you label the depth parameter as phantom, it should be willing to give you coerce :: DT n a -> DT m a.
But this doesn't really solve the problem, of course: you want to know if their types are the same. Well, maybe there is some solution with Typeable, but it doesn't sound like much fun. Absent Typeable, it seems impossible to me, because you want two contradictory things.
First, you want that trees of different depths should be separate types, not intermixable at all. This means everyone who handles them has to know what type they are.
Second, you want that you can give such a tree to someone without telling them how deep it is, have them munge it around arbitrarily, and then give it back to you. How can they do that, if you require type knowledge to operate on them?
Existentials do not "suppress" type information: they throw it away. Like all type information, it is gone at runtime; and you've made it invisible at compile time too.
I'm also not sure your problem is just with Eq: how will you even implement functions like insert? It's easy for N0_0, because that is known to have type DT Z a, but for the other cases I don't see how you will construct a DT (S n) a to wrap in your T when you can't know what n was.

OCaml functors (parametrized modules) emulation in Haskell

Is there any recommended way to use typeclasses to emulate OCaml-like parametrized modules?
For instance, I need a module that implements a complex
generic computation that may be parameterized with different
miscellaneous types, functions, etc. To be more specific, let it be
a kMeans implementation that could be parameterized with different
types of values, vector types (list, unboxed vector, vector, tuple, etc),
and distance calculation strategy.
For convenience, to avoid a crazy amount of intermediate types, I want to
make this computation polymorphic in a DataSet class that contains all
required interfaces. I also tried to use TypeFamilies to avoid a lot
of typeclass parameters (which cause problems as well):
{-# Language MultiParamTypeClasses
, TypeFamilies
, FlexibleContexts
, FlexibleInstances
, EmptyDataDecls
, FunctionalDependencies
#-}
module Main where
import qualified Data.List as L
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import Distances
-- contains instances for Euclid distance
-- import Distances.Euclid as E
-- contains instances for Kullback-Leibler "distance"
-- import Distances.Kullback as K
class ( Num (Elem c)
      , Ord (TLabel c)
      , WithDistance (TVect c) (Elem c)
      , WithDistance (TBoxType c) (Elem c)
      )
      => DataSet c where
  type Elem c :: *
  type TLabel c :: *
  type TVect c :: * -> *
  data TDistType c :: *
  data TObservation c :: *
  data TBoxType c :: * -> *
  observations :: c -> [TObservation c]
  measurements :: TObservation c -> [Elem c]
  label :: TObservation c -> TLabel c
  distance :: TBoxType c (Elem c) -> TBoxType c (Elem c) -> Elem c
  distance = distance_
instance DataSet () where
  type Elem () = Float
  type TLabel () = Int
  data TObservation () = TObservationUnit [Float]
  data TDistType ()
  type TVect () = V.Vector
  data TBoxType () v = VectorBox (V.Vector v)
  observations () = replicate 10 (TObservationUnit [0,0,0,0])
  measurements (TObservationUnit xs) = xs
  label (TObservationUnit _) = 111
kMeans :: ( Floating (Elem c)
          , DataSet c
          ) => c
            -> [TObservation c]
kMeans s = undefined -- here the implementation
  where
    labels = map label (observations s)
    www = L.map (V.fromList.measurements) (observations s)
    zzz = L.zipWith distance_ www www
    wtf1 = L.foldl wtf2 0 (observations s)
    wtf2 acc xs = acc + L.sum (measurements xs)
    qq = V.fromList [1,2,3 :: Float]
    l = distance (VectorBox qq) (VectorBox qq)
instance Floating a => WithDistance (TBoxType ()) a where
  distance_ xs ys = undefined
instance Floating a => WithDistance V.Vector a where
  -- Euclidean distance: square root of the sum of squared differences
  distance_ xs ys = sqrt $ V.sum (V.zipWith (\x y -> (x-y)**2) xs ys)
This code somehow compiles and works, but it's pretty ugly and hacky.
kMeans should be parameterized by value type (number, floating-point number, anything),
box type (vector, list, unboxed vector, maybe a tuple) and distance calculation strategy.
There are also types for Observation (the type of sample provided by the user;
there should be a lot of them) and for the measurements contained in each observation.
So the problems are:
1) If a function does not contain the parametric types in its signature,
the types will not be deduced.
2) I still have no idea how to declare the typeclass WithDistance so that it has different instances
for different distance types (Euclid, Kullback, anything else via phantom types).
Right now WithDistance is only polymorphic in box type and value type, so if we need
different strategies, we can only put them in different modules and import the required
module. But this is a hack and an untyped approach, right?
All of this can be done pretty easily in OCaml with its modules. What is the proper approach
to implementing such things in Haskell?
Typeclasses with TypeFamilies look somewhat similar to parametric modules, but they
work differently. I really need something like that.
It is really the case that Haskell lacks useful features found in *ML module systems.
There is ongoing effort to extend Haskell's module system: http://plv.mpi-sws.org/backpack/
But I think you can get a bit further without those ML modules.
Your design follows the God class anti-pattern, and that is why it is anti-modular.
A type class is useful only when each type needs no more than a single instance of that class. E.g. the DataSet () instance fixes type TVect () = V.Vector, and you can't easily create a similar instance with TVect = U.Vector instead.
You need to start by implementing the kMeans function, then generalize it by replacing concrete types with type variables and constraining those type variables with type classes when needed.
Here is a little example. At first you have some non-general implementation:
kMeans :: Int -> [(Double,Double)] -> [[(Double,Double)]]
kMeans k points = ...
Then you generalize it by distance calculation strategy:
kMeans
  :: Int
  -> ((Double,Double) -> (Double,Double) -> Double)
  -> [(Double,Double)]
  -> [[(Double,Double)]]
kMeans k distance points = ...
Now you can generalize it by type of points, but this requires introducing a class that will capture some properties of points that are used by distance computation e.g. getting list of coordinates:
kMeans
  :: Point p
  => Int -> (p -> p -> Coord p) -> [p]
  -> [[p]]
kMeans k distance points = ...
class Num (Coord p) => Point p where
  type Coord p
  coords :: p -> [Coord p]
euclidianDistance
  :: (Point p, Floating (Coord p))
  => p -> p -> Coord p
euclidianDistance a b
  = sqrt $ sum $ map (**2) $ zipWith (-) (coords a) (coords b)
Now you may wish to make it a bit faster by replacing lists with vectors:
kMeans
  :: (Point p, DataSet vec p)
  => Int -> (p -> p -> Coord p) -> vec p
  -> [vec p]
kMeans k distance points = ...
class DataSet vec p where
  map :: ...
  foldl' :: ...
instance Unbox p => DataSet U.Vector p where
  map = U.map
  foldl' = U.foldl'
And so on.
The suggested approach is to generalize various parts of the algorithm and constrain those parts with small, loosely coupled type classes (when required).
It is bad style to collect everything in a single monolithic type class.
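To show how the "small class plus function parameter" idea might look in a complete file, here is a minimal sketch. It assumes a hypothetical point type `P2` and a hypothetical helper `nearest` (the assignment step of k-means); only the `Point` class follows the answer above.

```haskell
{-# LANGUAGE TypeFamilies, FlexibleContexts #-}
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Small class capturing only what distance functions need.
class Num (Coord p) => Point p where
  type Coord p
  coords :: p -> [Coord p]

-- Hypothetical 2D point type used for illustration.
data P2 = P2 Double Double deriving (Eq, Show)

instance Point P2 where
  type Coord P2 = Double
  coords (P2 a b) = [a, b]

-- Two interchangeable distance strategies, passed around as plain functions.
euclidean :: (Point p, Floating (Coord p)) => p -> p -> Coord p
euclidean a b = sqrt (sum [d * d | d <- zipWith (-) (coords a) (coords b)])

manhattan :: Point p => p -> p -> Coord p
manhattan a b = sum (map abs (zipWith (-) (coords a) (coords b)))

-- Picks the centroid closest to x under the given distance.
-- Assumes a non-empty centroid list (minimumBy fails on an empty one).
nearest :: Ord d => (p -> p -> d) -> [p] -> p -> p
nearest distance centroids x = minimumBy (comparing (distance x)) centroids
```

Switching strategy is then a matter of passing `manhattan` instead of `euclidean`, which plays the role that applying a functor to a different module would play in OCaml.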

Haskell - create instance of class (how to do it right?)

I read the chapter about this topic in "Learn You a Haskell" and tried to find hints on various websites, but I am still unable to solve the following task.
I'm a Haskell newbie (6 weeks of "experience") and it's the first time I have had to work with instances.
So here is the task: my code has to pass the HUnit tests at the end. I tried to implement the instances, but it seems I've missed something. Hope you can help me! Thanks.
module SemiGroup where
{-
A type class 'SemiGroup' is given. It has exactly one method: a binary operation
called '(<>)'. Also a data type 'Tree' a newtype 'Sum' and a newtype 'Max' are
given. Make them instances of the 'SemiGroup' class.
The 'Tree' instance should build a 'Branch' of the given left and right side.
The 'Sum' instance should take the sum of its given left and right side. You need
a 'Num' constraint for that.
The 'Max' instance should take the maximum of its given left and right side. You
also need a constraint for that but you have to figure out yourself which one.
This module is not going to compile until you add the missing instances.
-}
import Test.HUnit (runTestTT,Test(TestLabel,TestList),(~?=))
-- | A semigroup has a binary operation.
class SemiGroup a where
  (<>) :: a -> a -> a
-- Leaf = Blatt, Branch = Ast
-- | A binary tree data type.
data Tree a = Leaf a
            | Branch (Tree a) (Tree a)
            deriving (Eq,Show)
-- | A newtype for taking the sum.
newtype Sum a = Sum {unSum :: a}
-- | A newtype for taking the maximum.
newtype Max a = Max {unMax :: a}
instance SemiGroup Tree where
  (<>) x y = ((x) (y))
instance SemiGroup (Num Sum) where
  (<>) x y = x+y
instance SemiGroup (Eq Max) where
  (<>) x y = if x>y then x else y
-- | Tests the implementation of the 'SemiGroup' instances.
main :: IO ()
main = do
testresults <- runTestTT tests
print testresults
-- | List of tests for the 'SemiGroup' instances.
tests :: Test
tests = TestLabel "SemiGroupTests" (TestList [
  Leaf "Hello" <> Leaf "Friend" ~?= Branch (Leaf "Hello") (Leaf "Friend"),
  unSum (Sum 4 <> Sum 8) ~?= 12,
  unMax (Max 8 <> Max 4) ~?= 8])
I tried something like:
class SemiGroup a where
  (<>) :: a -> a -> a
-- Leaf = Blatt, Branch = Ast
-- | A binary tree data type.
data Tree a = Leaf a
            | Branch (Tree a) (Tree a)
            deriving (Eq,Show)
-- | A newtype for taking the sum.
newtype Sum a = Sum {unSum :: a}
-- | A newtype for taking the maximum.
newtype Max a = Max {unMax :: a}
instance SemiGroup Tree where
  x <> y = Branch x y
instance Num a => SemiGroup (Sum a) where
  x <> y = x+y
instance Eq a => SemiGroup (Max a) where
  x <> y = if x>y then x else y
But there are still some failures left! At least the wrap/unwrap thing that "chi" mentioned. But I have no idea. Maybe another hint? :/
I fail to see how to turn Tree a into a semigroup (unless it has to be considered up-to something).
For the Sum a newtype, you need to require that a is of class Num. Then, you need to wrap/unwrap the Sum constructor around values so that: 1) you take two Sum a, 2) you convert them into two a, which is a proper type over which + is defined, 3) you sum them, 4) you turn the result back into a Sum a.
You can try to code the above yourself starting from
instance Num a => SemiGroup (Sum a) where
  x <> y = ... -- Here both x and y have type (Sum a)
The Max a instance will require a similar wrap/unwrap code.
A further hint: to unwrap a Sum a into an a you can use the function
unSum :: Sum a -> a
to wrap an a into a Sum a you can use instead
Sum :: a -> Sum a
Note that both functions Sum, unSum are already implicitly defined by your newtype declaration, so you do not have to define them (you already did).
Alternatively, you can use pattern matching to unwrap your values. Instead of defining
x <> y = ... -- x,y have type Sum a (they are wrapped)
you can write
Sum x <> Sum y = ... -- x,y have type a (they are unwrapped)
Pay attention to the types. Either manually, or with some help from GHCi, figure out the type of the functions you are writing -- you'll find they don't match the types that the typeclass instance needs. You'll use wrapping and unwrapping to adjust the types until they work.
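Putting the hints together, one possible set of instances looks like the sketch below. It is not necessarily what the exercise's author had in mind. Two assumptions are worth flagging: the `Max` instance uses an `Ord` constraint (comparison with `>` or `max` needs more than `Eq`), and the Prelude's own `(<>)` is hidden, since recent versions of base export one and it would otherwise clash with the class method.

```haskell
import Prelude hiding ((<>))

class SemiGroup a where
  (<>) :: a -> a -> a

data Tree a = Leaf a
            | Branch (Tree a) (Tree a)
            deriving (Eq, Show)

newtype Sum a = Sum {unSum :: a}
newtype Max a = Max {unMax :: a}

-- The instance head must be a type, so Tree needs its parameter.
instance SemiGroup (Tree a) where
  l <> r = Branch l r

-- Pattern matching unwraps both arguments; Sum wraps the result again.
instance Num a => SemiGroup (Sum a) where
  Sum x <> Sum y = Sum (x + y)

-- Taking a maximum needs an ordering, hence Ord rather than Eq.
instance Ord a => SemiGroup (Max a) where
  Max x <> Max y = Max (max x y)
```

Each right-hand side has the same wrapped type as the arguments, which is exactly what the class signature `a -> a -> a` demands.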

Haskell Eq definition realizing a result

I was reading the definition of the Eq typeclass in the Data library, and I'm confused. At what point is it decided that two values are equal or not equal? From what I see, it looks like they would just call each other ad infinitum.
It's defined as so:
class Eq a where
  (==), (/=) :: a -> a -> Bool
  x /= y = not (x == y)
  x == y = not (x /= y)
Would somebody mind explaining where it realizes the Bool value? Are they even calling each other, or is something else going on?
That’s the default implementation of those methods, and yes, it is circular. If you use them as-is, you’ll loop:
data Foo = Foo
instance Eq Foo
> Foo == Foo
^CInterrupted
The circular definitions exist so you can implement (==) and get (/=) for free, or vice versa:
data Foo = Foo
instance Eq Foo where
  x == y = True
> Foo /= Foo
False
See also the Ord class, which explains what the minimal complete definition is in that particular case.
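The "vice versa" direction works the same way. In this small sketch (the types `Color` and `Parity` are hypothetical), one instance defines only `(==)` and the other only `(/=)`; the missing method comes from the class default in each case.

```haskell
data Color = Red | Green

-- Only (==) is written here; (/=) falls back to not (x == y).
instance Eq Color where
  Red   == Red   = True
  Green == Green = True
  _     == _     = False

newtype Parity = Parity Int

-- Only (/=) is written here; (==) falls back to not (x /= y).
-- Two values count as equal when they are both even or both odd.
instance Eq Parity where
  Parity a /= Parity b = even a /= even b
```

Because each instance overrides one of the two defaults, the cycle is broken and both operators terminate.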

What is the difference between class and instance declarations?

I am currently reading this, but if I am honest I am struggling to see what
class Eq a where
  (==) :: a -> a -> Bool
achieves, which
instance Eq Integer where
  x == y = x `integerEq` y
doesn't achieve. I understand that the second snippet defines what the result of comparing two Integer values for equality should be. What is the purpose of the first, then?
The class declaration says "I'm going to define a bunch of functions now which will work for several different types". The instance declaration says "this is how these functions work for this type".
In your specific example, class Eq says that "Eq means any type that has a function named ==", whereas the instance Eq Integer says "this is how == works for an Integer".
The first defines what operations must be provided for a type to be comparable for equality. You can then use that to write functions that operate on any type that is comparable for equality, not just integers.
allSame :: Eq a => [a] -> Bool
allSame [] = True
allSame (x:xs) = all (== x) xs
This function works for integers because an instance for Eq Integer exists. It also works for strings ([Char]) because an instance for Eq Char exists, and an instance for lists of types that have instances of Eq also exists (instance Eq a => Eq [a]).
There is one class and many instances for different types. That's why the class specifies the required signature (the interface; classes can also specify default implementations, but that's beside the point), and the instance specifies the body (the implementation). You then use the class name as a constraint that means "any type a that implements the Eq operations, i.e. has an instance of Eq".
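The same split can be seen with a class of your own. In this sketch, the class `Describable` and the function `describeAll` are hypothetical names chosen for illustration: the class states the interface once, each instance supplies a body for one type, and the constrained function works for every type that has an instance.

```haskell
-- The class: declares which operation must exist, with its signature.
class Describable a where
  describe :: a -> String

-- The instances: say how that operation works for each particular type.
instance Describable Bool where
  describe True  = "yes"
  describe False = "no"

instance Describable Int where
  describe n = "the number " ++ show n

-- Written once against the class, usable with any instance.
describeAll :: Describable a => [a] -> [String]
describeAll = map describe
```

Without the class declaration there would be no way to write `describeAll` once; without the instances there would be nothing for it to call.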
Read Learn you a Haskell or Real World Haskell, they're better than the haskell.org tutorial.
Let's say you want to implement a generic algorithm or data structure, "generic" meaning polymorphic: it should work for any data type. For example, let's say you want to write a function that determines whether three input values are equal.
Taking a specific (monomorphic) case, you can do this for integers:
eq3 :: Int -> Int -> Int -> Bool
eq3 x y z = x == y && y == z
We'd expect the above definition to work for other types as well, of course, but if we simply tell the compiler that the function should apply to any type:
eq3 :: a -> a -> a -> Bool
eq3 x y z = x == y && y == z
... the compiler complains that the == function doesn't apply to our generic a:
<interactive>:12:49:
No instance for (Eq a)
arising from a use of `=='
In the first argument of `(&&)', namely `x == y'
In the expression: x == y && y == z
In an equation for `eq3': eq3 x y z = x == y && y == z
We have to tell the compiler that our type a is an instance of the Eq type class, which you already noticed is where the == function is declared. See the difference here:
eq3 :: Eq a => a -> a -> a -> Bool
eq3 x y z = x == y && y == z
Now we have a function that can operate uniformly on any type a belonging to the Eq type class.

Resources