Design pattern for subclass-like structure - haskell

My goal is to represent a set of types with a similar behaviour in a elegant and performant manner. To achieve this, I have created a solution that utilises a single type, followed by a set of functions that perform pattern matching.
My first question is: is there a way how to represent the same ideas using a single type-class and instead of having a constructor per each variation to have a type that implements said type-class?
Which of the two approaches below is:
- a better recognised design pattern in Haskell?
- more memory efficient?
- more performant?
- more elegant and why?
- easier to use for consumers of the code?
Approach 1: Single type and pattern matching
Suppose there is a following structure:
data Aggregate a
= Average <some necessary state keeping>
| Variance <some necessary state keeping>
| Quantile a <some necessary state keeping>
It's constructors are not public as that would expose the internal state keeping. Instead, a set of constructor functions exist:
newAverage :: Floating a
=> Aggregate a
newAverage = Average ...
newVariance :: Floating a
=> Aggregate a
newVariance = Variance ...
newQuantile :: Floating a
=> a -- ! important, a parameter to the function
-> Aggregate a
newQuantile p = Quantile p ...
Once the object is created, we can perform two functions: put values into it, and once we are satisfied, we can get the current value:
get :: Floating a
=> Aggregate a
-> Maybe a
get (Average <state>) = getAverage <state>
get (Variance <state>) = getVariance <state>
get (Quantile _ <state>) = getQuantile <state>
put :: Floating a
=> a
-> Aggregate a
-> Aggregate a
put newVal (Average <state>) = putAverage newVal <state>
put newVal (Variance <state>) = putVariance newVal <state>
put newVal (Quantile p <state>) = putQuantile newVal p <state>
Approach 2: Type-classes and instances
class Aggregate a where
new :: a
get :: Floating f => a f -> Maybe f
put :: Floating f =>
data Average a = Average Word64 a
data Variance a ...
instance Aggregate Average where
instance Aggregate Variance where
instance Aggregate Quantile where
The obvious problem here is the fact that new is not parametric and thus Quantile can't be initialised with the p parameter. Adding a parameter to new is possible, but it would result in all other non-parametric constructors to ignore the value, which is not a good design.

You are missing the "codata" encoding, which sounds like it might be the best fit for your problem.
data Aggregate a = Aggregate
{ get :: Maybe a
, put :: a -> Aggregate a
}
-- Use the closure to keep track of local state.
newAverage :: (Floating a) => Aggregate a
newAverage = Aggregate { get = Nothing, put = go 0 0 }
where
go n total x = Aggregate { get = Just ((total + x) / (n+1))
, put = go (n+1) (total+x)
}
-- Parameters are not a problem.
newQuantile :: (Floating a) => a -> Aggregate a
newQuantile p = Aggregate { get = ... ; put = \x -> ... }
...
For some reason this approach always slips under the radar of people with OO backgrounds, which is strange because it is a pretty close match to that paradigm.

It's hard to give a general recommendation. I tend to prefer approach 1. Note that you could use
data Aggregate a
= Average AverageState
| Variance VarianceState
| Quantile a QuantileState
and export every constructor above, keeping only the ...State types private to the module.
This might be feasible in some contexts, but not in others, so it has to be evaluated on a case by case basis.
About approach 2, this could be more convenient if you have many constructors / types around. To fix the new problem, one could use type families (or fundeps) as in
class Floating f => Aggregate a f where
type AggregateNew a f
new :: AggregateNew a f -> a f
get :: a f -> Maybe f
put :: ...
instance Floating f => Aggregate Average f where
type AggregateNew (Average a) f = ()
new () = ...
instance Floating f => Aggregate Quantile f where
type AggregateNew (Quantile a) f = a
new x = ...
The naming above is horrible, but I used it to make the point. new takes an argument of type AggregateNew k f which can be () if new needs no information, or some more informative type when it is needed, like a for creating a Quantile.

There is a third approach for defining “aggregators” that neither requires an inextensible sum type nor multiple datatypes + a typeclass.
Approach 3: Single-constructor datatype that puts state behind an existential
Consider this type:
{-# LANGUAGE ExistentialQuantification #-}
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)
It represents an aggregator that ingests values of type a and eventually "returns" a value of type b while carrying an internal state x.
The constructor has type (x -> a -> x) -> x -> (x -> b) -> Fold a b. It takes a step function, an initial state and a final "summary" function. Notice that the state is decoupled from the return value b. They can be the same, but it's not required.
Also, the state is existentially quantified. We know the type of the state when we create a Fold, and we can work with it when we pattern-match on the Fold—in order to feed it data through the step function—but it's not reflected in the Fold's type. We can put Fold values with different inner states in the same container without problems, as long as they ingest and return the same types.
This pattern is sometimes called a “beautiful fold”. There is a library called foldl that is based on it, and provides may premade folds and utility functions.
The Fold type in foldl has many useful instances. In particular, the Applicative instance lets us create composite folds that still traverse the input data a single time, instead of requiring multiple passes.

Related

Practical applications of Rank 2 polymorphism?

I'm covering polymorphism and I'm trying to see the practical uses of such a feature.
My basic understanding of Rank 2 is:
type MyType = ∀ a. a -> a
subFunction :: a -> a
subFunction el = el
mainFunction :: MyType -> Int
mainFunction func = func 3
I understand that this is allowing the user to use a polymorphic function (subFunction) inside mainFunction and strictly specify it's output (Int). This seems very similar to GADT's:
data Example a where
ExampleInt :: Int -> Example Int
ExampleBool :: Bool -> Example Bool
1) Given the above, is my understanding of Rank 2 polymorphism correct?
2) What are the general situations where Rank 2 polymorphism can be used, as opposed to GADT's, for example?
If you pass a polymorphic function as and argument to a Rank2-polymorphic function, you're essentially passing not just one function but a whole family of functions – for all possible types that fulfill the constraints.
Typically, those forall quantifiers come with a class constraint. For example, I might wish to do number arithmetic with two different types simultaneously (for comparing precision or whatever).
data FloatCompare = FloatCompare {
singlePrecision :: Float
, doublePrecision :: Double
}
Now I might want to modify those numbers through some maths operation. Something like
modifyFloat :: (Num -> Num) -> FloatCompare -> FloatCompare
But Num is not a type, only a type class. I could of course pass a function that would modify any particular number type, but I couldn't use that to modify both a Float and a Double value, at least not without some ugly (and possibly lossy) converting back and forth.
Solution: Rank-2 polymorphism!
modifyFloat :: (∀ n . Num n => n -> n) -> FloatCompare -> FloatCompare
mofidyFloat f (FloatCompare single double)
= FloatCompare (f single) (f double)
The best single example of how this is useful in practice are probably lenses. A lens is a “smart accessor function” to a field in some larger data structure. It allows you to access fields, update them, gather results... while at the same time composing in a very simple way. How it works: Rank2-polymorphism; every lens is polymorphic, with the different instantiations corresponding to the “getter” / “setter” aspects, respectively.
The go-to example of an application of rank-2 types is runST as Benjamin Hodgson mentioned in the comments. This is a rather good example and there are a variety of examples using the same trick. For example, branding to maintain abstract data type invariants across multiple types, avoiding confusion of differentials in ad, a region-based version of ST.
But I'd actually like to talk about how Haskell programmers are implicitly using rank-2 types all the time. Every type class whose methods have universally quantified types desugars to a dictionary with a field with a rank-2 type. In practice, this is virtually always a higher-kinded type class* like Functor or Monad. I'll use a simplified version of Alternative as an example. The class declaration is:
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
The dictionary representing this class would be:
data AlternativeDict f = AlternativeDict {
empty :: forall a. f a,
(<|>) :: forall a. f a -> f a -> f a }
Sometimes such an encoding is nice as it allows one to use different "instances" for the same type, perhaps only locally. For example, Maybe has two obvious instances of Alternative depending on whether Just a <|> Just b is Just a or Just b. Languages without type classes, such as Scala, do indeed use this encoding.
To connect to leftaroundabout's reference to lenses, you can view the hierarchy there as a hierarchy of type classes and the lens combinators as simply tools for explicitly building the relevant type class dictionaries. Of course, the reason it isn't actually a hierarchy of type classes is that we usually will have multiple "instances" for the same type. E.g. _head and _head . _tail are both "instances" of Traversal' s a.
* A higher-kinded type class doesn't necessarily lead to this, and it can happen for a type class of kind *. For example:
-- Higher-kinded but doesn't require universal quantification.
class Sum c where
sum :: c Int -> Int
-- Not higher-kinded but does require universal quantification.
class Length l where
length :: [a] -> l
If you are using modules in Haskell, you are already using Rank-2 types. Theoretically speaking, modules are records with rank-2 type properties.
For example, the Foo module below in Haskell ...
module Foo(id) where
id :: forall a. a -> a
id x = x
import qualified Foo
main = do
putStrLn (Foo.id "hello")
return ()
... can actually be thought as a record as follows:
type FooType = FooType {
id :: forall a. a -> a
}
Foo :: FooType
Foo = Foo {
id = \x -> x
}
P/S (unrelated this question): from a language design perspective, if you are going to support module system, then you might as well support higher-rank types (i.e. allow arbitrary quantification of type variables on any level) to reduce duplication of efforts (i.e. type checking a module should be almost the same as type checking a record with higher rank types).

Are there useful applications for the Divisible Type Class?

I've lately been working on an API in Elm where one of the main types is contravariant. So, I've googled around to see what one can do with contravariant types and found that the Contravariant package in Haskell defines the Divisible type class.
It is defined as follows:
class Contravariant f => Divisible f where
divide :: (a -> (b, c)) -> f b -> f c -> f a
conquer :: f a
It turns out that my particular type does suit the definition of the Divisible type class. While Elm does not support type classes, I do look at Haskell from time to time for some inspiration.
My question: Are there any practical uses for this type class? Are there known APIs out there in Haskell (or other languages) that benefit from this divide-conquer pattern? Are there any gotchas I should be aware of?
Thank you very much for your help.
One example:
Applicative is useful for parsing, because you can turn Applicative parsers of parts into a parser of wholes, needing only a pure function for combining the parts into a whole.
Divisible is useful for serializing (should we call this coparsing now?), because you can turn Divisible serializers of parts into a serializer of wholes, needing only a pure function for splitting the whole into parts.
I haven't actually seen a project that worked this way, but I'm (slowly) working on an Avro implementation for Haskell that does.
When I first came across Divisible I wanted it for divide, and had no idea what possible use conquer could be other than cheating (an f a out of nowhere, for any a?). But to make the Divisible laws check out for my serializers conquer became a "serializer" that encodes anything to zero bytes, which makes a lot of sense.
Here's a possible use case.
In streaming libraries, one can have fold-like constructs like the ones from the foldl package, that are fed a sequence of inputs and return a summary value when the sequence is exhausted.
These folds are contravariant on their inputs, and can be made Divisible. This means that if you have a stream of elements where each element can be somehow decomposed into b and c parts, and you also happen to have a fold that consumes bs and another fold that consumes cs, then you can build a fold that consumes the original stream.
The actual folds from foldl don't implement Divisible, but they could, using a newtype wrapper. In my process-streaming package I have a fold-like type that does implement Divisible.
divide requires the return values of the constituent folds to be of the same type, and that type must be an instance of Monoid. If the folds return different, unrelated monoids, a workaround is to put each return value in a separate field of a tuple, leaving the other field as mempty. This works because a tuple of monoids is itself a Monoid.
I'll examine the example of the core data types in Fritz Henglein's generalized radix sort techniques as implemented by Edward Kmett in the discrimination package.
While there's a great deal going on there, it largely focuses around a type like this
data Group a = Group (forall b . [(a, b)] -> [[b]])
If you have a value of type Group a you essentially must have an equivalence relationship on a because if I give you an association between as and some type b completely unknown to you then you can give me "groupings" of b.
groupId :: Group a -> [a] -> [[a]]
groupId (Group grouper) = grouper . map (\a -> (a, a))
You can see this as a core type for writing a utility library of groupings. For instance, we might want to know that if we can Group a and Group b then we can Group (a, b) (more on this in a second). Henglein's core idea is that if you can start with some basic Groups on integers—we can write very fast Group Int32 implementations via radix sort—and then use combinators to extend them over all types then you will have generalized radix sort to algebraic data types.
So how might we build our combinator library?
Well, f :: Group a -> Group b -> Group (a, b) is pretty important in that it lets us make groups of product-like types. Normally, we'd get this from Applicative and liftA2 but Group, you'll notice, is Contravaiant, not a Functor.
So instead we use Divisible
divided :: Group a -> Group b -> Group (a, b)
Notice that this arises in a strange way from
divide :: (a -> (b, c)) -> Group b -> Group c -> Group a
as it has the typical "reversed arrow" character of contravariant things. We can now understand things like divide and conquer in terms of their interpretation on Group.
Divide says that if I want to build a strategy for equating as using strategies for equating bs and cs, I can do the following for any type x
Take your partial relation [(a, x)] and map over it with a function f :: a -> (b, c), and a little tuple manipulation, to get a new relation [(b, (c, x))].
Use my Group b to discriminate [(b, (c, x))] into [[(c, x)]]
Use my Group c to discriminate each [(c, x)] into [[x]] giving me [[[x]]]
Flatten the inner layers to get [[x]] like we need
instance Divisible Group where
conquer = Group $ return . fmap snd
divide k (Group l) (Group r) = Group $ \xs ->
-- a bit more cleverly done here...
l [ (b, (c, d)) | (a,d) <- xs, let (b, c) = k a] >>= r
We also get interpretations of the more tricky Decidable refinement of Divisible
class Divisible f => Decidable f where
lose :: (a -> Void) -> f a
choose :: (a -> Either b c) -> f b -> f c -> f a
instance Decidable Group where
lose :: (a -> Void) -> Group a
choose :: (a -> Either b c) -> Group b -> Group c -> Group a
These read as saying that for any type a of which we can guarantee there are no values (we cannot produce values of Void by any means, a function a -> Void is a means of producing Void given a, thus we must not be able to produce values of a by any means either!) then we immediately get a grouping of zero values
lose _ = Group (\_ -> [])
We also can go a similar game as to divide above except instead of sequencing our use of the input discriminators, we alternate.
Using these techniques we build up a library of "Groupable" things, namely Grouping
class Grouping a where
grouping :: Group a
and note that nearly all the definitions arise from the basic definition atop groupingNat which uses fast monadic vector manipuations to achieve an efficient radix sort.

Why context is not considered when selecting typeclass instance in Haskell?

I understand that when having
instance (Foo a) => Bar a
instance (Xyy a) => Bar a
GHC doesn't consider the contexts, and the instances are reported as duplicate.
What is counterintuitive, that (I guess) after selecting an instance, it still needs to check if the context matches, and if not, discard the instance. So why not reverse the order, and discard instances with non-matching contexts, and proceed with the remaining set.
Would this be intractable in some way? I see how it could cause more constraint resolution work upfront, but just as there is UndecidableInstances / IncoherentInstances, couldn't there be a ConsiderInstanceContexts when "I know what I am doing"?
This breaks the open-world assumption. Assume:
class B1 a
class B2 a
class T a
If we allow constraints to disambiguate instances, we may write
instance B1 a => T a
instance B2 a => T a
And may write
instance B1 Int
Now, if I have
f :: T a => a
Then f :: Int works. But, the open world assumption says that, once something works, adding more instances cannot break it. Our new system doesn't obey:
instance B2 Int
will make f :: Int ambiguous. Which implementation of T should be used?
Another way to state this is that you've broken coherence. For typeclasses to be coherent means that there is only one way to satisfy a given constraint. In normal Haskell, a constraint c has only one implementation. Even with overlapping instances, coherence generally holds true. The idea is that instance T a and instance {-# OVERLAPPING #-} T Int do not break coherence, because GHC can't be tricked into using the former instance in a place where the latter would do. (You can trick it with orphans, but you shouldn't.) Coherence, at least to me, seems somewhat desirable. Typeclass usage is "hidden", in some sense, and it makes sense to enforce that it be unambiguous. You can also break coherence with IncoherentInstances and/or unsafeCoerce, but, y'know.
In a category theoretic way, the category Constraint is thin: there is at most one instance/arrow from one Constraint to another. We first construct two arrows a : () => B1 Int and b : () => B2 Int, and then we break thinness by adding new arrows x_Int : B1 Int => T Int, y_Int : B2 Int => T Int such that x_Int . a and y_Int . b are both arrows () => T Int that are not identical. Diamond problem, anyone?
This does not answer you question as to why this is the case. Note, however, that you can always define a newtype wrapper to disambiguate between the two instances:
newtype FooWrapper a = FooWrapper a
newtype XyyWrapper a = XyyWrapper a
instance (Foo a) => Bar (FooWrapper a)
instance (Xyy a) => Bar (XyyWrapper a)
This has the added advantage that by passing around either a FooWrapper or a XyyWrapper you explicitly control which of the two instances you'd like to use if your a happens to satisfy both.
Classes are a bit weird. The original idea (which still pretty much works) is a sort of syntactic sugar around what would otherwise be data statements. For example you can imagine:
data Num a = Num {plus :: a -> a -> a, ... , fromInt :: Integer -> a}
numInteger :: Num Integer
numInteger = Num (+) ... id
then you can write functions which have e.g. type:
test :: Num x -> x -> x -> x -> x
test lib a b c = a + b * (abs (c + b))
where (+) = plus lib
(*) = times lib
abs = absoluteValue lib
So the idea is "we're going to automatically derive all of this library code." The question is, how do we find the library that we want? It's easy if we have a library of type Num Int, but how do we extend it to "constrained instances" based on functions of type:
fooLib :: Foo x -> Bar x
xyyLib :: Xyy x -> Bar x
The present solution in Haskell is to do a type-pattern-match on the output-types of those functions and propagate the inputs to the resulting declaration. But when there's two outputs of the same type, we would need a combinator which merges these into:
eitherLib :: Either (Foo x) (Xyy x) -> Bar x
and basically the problem is that there is no good constraint-combinator of this kind right now. That's your objection.
Well, that's true, but there are ways to achieve something morally similar in practice. Suppose we define some functions with types:
data F
data X
foobar'lib :: Foo x -> Bar' x F
xyybar'lib :: Xyy x -> Bar' x X
bar'barlib :: Bar' x y -> Bar x
Clearly the y is a sort of "phantom type" threaded through all of this, but it remains powerful because given that we want a Bar x we will propagate the need for a Bar' x y and given the need for the Bar' x y we will generate either a Bar' x X or a Bar' x y. So with phantom types and multi-parameter type classes, we get the result we want.
More info: https://www.haskell.org/haskellwiki/GHC/AdvancedOverlap
Adding backtracking would make instance resolution require exponential time, in the worst case.
Essentially, instances become logical statements of the form
P(x) => R(f(x)) /\ Q(x) => R(f(x))
which is equivalent to
(P(x) \/ Q(x)) => R(f(x))
Computationally, the cost of this check is (in the worst case)
c_R(n) = c_P(n-1) + c_Q(n-1)
assuming P and Q have similar costs
c_R(n) = 2 * c_PQ(n-1)
which leads to exponential growth.
To avoid this issue, it is important to have fast ways to choose a branch, i.e. to have clauses of the form
((fastP(x) /\ P(x)) \/ (fastQ(x) /\ Q(x))) => R(f(x))
where fastP and fastQ are computable in constant time, and are incompatible so that at most one branch needs to be visited.
Haskell decided that this "fast check" is head compatibility (hence disregarding contexts). It could use other fast checks, of course -- it's a design decision.

Using a lens twice

I'm struggling with using the lens library for a particular problem. I'm trying to pass
an updated data structure
a lens focussed on part of that updated structure
to another function, g. I pass both the lens and the data structure because g needs some shared information from the data structure as well as a piece of information. (If it helps, the data structure contains information on a joint probability distribution, but g only works on either marginal and needs to know which marginal I'm looking at. The only difference between the two marginals is their mean with the rest of their definition being shared in the data structure).
My first attempt looked like this
f :: Functor f => Params -> ((Double -> f Double) -> Params -> f Params) -> a
f p l = g (l %~ upd $ p) l
where upd = ...
g p x = go p p^.x
but that fails during compilation because f gets inferred as being Identity for the update and Const Double for the getter.
What's the best way to accomplish what I want to do? I can imagine being able to do one of the following:
make a copy of the lens so that the type inference can be different in each case
rather than passing the updated structure and the lens, I pass the original structure and a lens which returns a modified value (if I only want to update the part of the structure that the lens looks at).
making a better design choice for my functions/data structure
something completely different
Thanks for any help!
András Kovács answer shows how to achieve this with RankNTypes. If you wish to avoid RankNTypes, then you can use ALens and cloneLens:
f :: a -> ALens' a Int -> (Int, a)
f a l = (newvalue, a & cloneLens l .~ newvalue)
where oldvalue = a^.cloneLens l
newvalue = if oldvalue == 0 then 0 else oldvalue - 1
Control.Lens.Loupe provides operators and functions that work on ALens instead of Lens.
Note that in many cases, you should also be able to use <<%~, which is like %~ but also returns the old value, or <%~, which returns the new value:
f :: a -> LensLike' ((,) Int) a Int -> (Int, a)
f a l = a & l <%~ g
where g oldvalue = if oldvalue == 0 then 0 else oldvalue - 1
This has the advantage that it can also work with Isos or sometimes also with Traversals (when the target type is a Monoid).
You want your type signature to look like this:
f :: Params -> Lens Params Params Double Double -> ...
-- alternatively, instead of the long Lens form you can write
-- Lens' Params Double
This is not equivalent to what you wrote out in the signature, because the functor parameter is quantified inside Lens:
type Lens s t a b = forall f. Functor f => (a -> f b) -> (s -> f t)
The correct signature translates to :
f :: Params -> (forall f. Functor f => (Double -> f Double) -> Params -> f Params) -> ...
This prevents the compiler from unifying the different f parameters of different lens usages, i. e. you can use the lens polymorphically. Note that you need the RankNTypes or Rank2Types GHC extension in order to be able to write out the signature.
Benno gave the best general purpose answer.
There is two other options, however, which I offer here for completeness.
1.)
There are several Loupe combinators in Lens.
http://hackage.haskell.org/package/lens-4.1.2/docs/Control-Lens-Loupe.html
They all have names that involve #.
^# and #%= both take ALens which is a lens instantiated at a particular concrete choice of functor.
This can be useful if you need to pass around lists of lenses, or if you really really need multiple passes.
2.)
Another option, and my preferred tactic, is to figure out how to do both operations a the same time.
Here you are modifying, but want the value you just set. Well, yes can give you that by using <%~ instead of %~.
Now you only instantiate the lens at one choice of functor and your code gets faster.

Haskell newbie on types

I'm completely new to Haskell (and more generally to functional programming), so forgive me if this is really basic stuff. To get more than a taste, I try to implement in Haskell some algorithmic stuff I'm working on. I have a simple module Interval that implements intervals on the line. It contains the type
data Interval t = Interval t t
the helper function
makeInterval :: (Ord t) => t -> t -> Interval t
makeInterval l r | l <= r = Interval l r
| otherwise = error "bad interval"
and some utility functions about intervals.
Here, my interest lies in multidimensional intervals (d-intervals), those objects that are composed of d intervals. I want to separately consider d-intervals that are the union of d disjoint intervals on the line (multiple interval) from those that are the union of d interval on d separate lines (track interval). With distinct algorithmic treatments in mind, I think it would be nice to have two distinct types (even if both are lists of intervals here) such as
import qualified Interval as I
-- Multilple interval
newtype MInterval t = MInterval [I.Interval t]
-- Track interval
newtype TInterval t = TInterval [I.Interval t]
to allow for distinct sanity checks, e.g.
makeMInterval :: (Ord t) => [I.Interval t] -> MInterval t
makeMInterval is = if foldr (&&) True [I.precedes i i' | (i, i') <- zip is (tail is)]
then (MInterval is)
else error "bad multiple interval"
makeTInterval :: (Ord t) => [I.Interval t] -> TInterval t
makeTInterval = TInterval
I now get to the point, at last! But some functions are naturally concerned with both multiple intervals and track intervals. For example, a function order would return the number of intervals in a multiple interval or a track interval. What can I do? Adding
-- Dimensional interval
data DInterval t = MIntervalStuff (MInterval t) | TIntervalStuff (TInterval t)
does not help much, since, if I understand well (correct me if I'm wrong), I would have to write
order :: DInterval t -> Int
order (MIntervalStuff (MInterval is)) = length is
order (TIntervalStuff (TInterval is)) = length is
and call order as order (MIntervalStuff is) or order (TIntervalStuff is) when is is a MInterval or a TInterval. Not that great, it looks odd. Neither I want to duplicate the function (I have many functions that are concerned with both multiple and track intevals, and some other d-interval definitions such as equal length multiple and track intervals).
I'm left with the feeling that I'm completely wrong and have missed some important point about types in Haskell (and/or can't forget enough here about OO programming). So, quite a newbie question, what would be the best way in Haskell to deal with such a situation? Do I have to forget about introducing MInterval and TInterval and go with one type only?
Thanks a lot for your help,
Garulfo
Edit: this is the same idea as sclv's answer; his links provide more information on this technique.
What about this approach?
data MInterval = MInterval --multiple interval
data TInterval = TInterval --track interval
data DInterval s t = DInterval [I.Interval t]
makeMInterval :: (Ord t) => [I.Interval t] -> Maybe (DInterval MInterval t)
makeMInterval is = if foldr (&&) True [I.precedes i i' | (i, i') <- zip is (tail is)]
then Just (DInterval is)
else Nothing
order :: DInterval s t -> Int
order (DInterval is) = length is
equalOrder :: DInterval s1 t -> DInterval s2 t -> Bool
equalOrder i1 i2 = order i1 == order i2
addToMInterval :: DInterval MInterval t -> Interval t -> Maybe (DInterval MInterval t)
addToMInterval = ..
Here the type DInterval represents multi-dimension intervals, but it takes an extra type parameter as a phantom type. This extra type information allows the typechecker to differentiate between different kinds of intervals even though they have exactly the same representation.
You get the type safety of your original design, but all your structures share the same implementation. Crucially, when the type of the interval doesn't matter, you can leave it unspecified.
I also changed the implementation of your makeMInterval function; returning a Maybe for a function like this is much more idiomatic than calling error.
More explanation on Maybe:
Let's examine your function makeInterval. This function is supposed to take a list of Interval's, and if they meet the criteria return a multiple interval, otherwise a track interval. This explanation leads to the type:
makeInterval :: (Ord t) =>
[I.Interval t] ->
Either (DInterval TInterval t) (DInterval MInterval t)
Now that we have the type, how can we implement it? We'd like to re-use our makeMInterval function.
makeInterval is = maybe
(Left $ DInterval TInterval is)
Right
(makeMInterval is)
The function maybe takes three arguments: a default b to use if the Maybe is Nothing, a function a -> b if the Maybe is Just a, and a Maybe a. It returns either the default or the result of applying the function to the Maybe value.
Our default is the track interval, so we create a Left track interval for the first argument. If the maybe is Just (DInterval MInterval t), the multiple interval already exists so all that's necessary is to stick it into the Right side of the either. Finally, makeMInterval is used to create the multiple interval.
What you want are types with the same structure and common operations, but which may be distinguished, and for which certain operations only make sense on one or another type. The idiom for this in Haskell is Phantom Types.
http://www.haskell.org/haskellwiki/Phantom_type
http://www.haskell.org/haskellwiki/Wrapper_types
http://blog.malde.org/index.php/2009/05/14/using-a-phantom-type-to-label-different-kinds-of-sequences/
http://neilmitchell.blogspot.com/2007/04/phantom-types-for-real-problems.html
What you want is a typeclass. Typeclasses is how overloading is done in Haskell. A type class is a common structure that is shared by many types. In your case you want to create a class of all types that have a order function:
class Dimensional a where
order :: a -> Int
This is saying there's a class of types a called Dimensional, and for each type belonging to this class there's a function a -> Int called order. Now you need to declare all your types as an belonging to the class:
instance Dimensional MInterval where
order (MInterval is) = length is
and so on. You can also declare functions that act on things that are from a given class, this way:
isOneDimensional :: (Dimensional a) => a -> Bool
isOneDimensional interval = (order interval == 1)
The type declaration can be read "for all types a that belong to typeclass Dimensional, this functions takes a and returns a Bool".
I like to think about classes as algebraic structures in mathematics (it's not quite the same thing as you can't enforce some logical constraints, but...): you state what are the requirements for a type to belong to some class (in the analogy, the operations must be defined over a set for it to belong to a certain algebraic construction), and then, for each specific type you state what is the specific implementation of the required functions (in the analogy, what are the specific functions for each specific set).
Take, for example, a group. Groups must have an identity, an inverse for each element and a multiplication:
class Group a where:
identity :: a
inverse :: a -> a
(<*>) :: a -> a -> a
and Integers form a group under addition:
instance Group Integer where:
identity = 0
inverse x = (- x)
x <*> y = x + y
Note that this will not solve the repetition, as you have to instantiate each type.
consider the use of typeclasses to show similarities between different types. See the haskell tutorial[1] about classes and overloading for help.
[1]: http://www.haskell.org/tutorial/classes.html haskell tutorial

Resources