Haskell newbie on types

I'm completely new to Haskell (and more generally to functional programming), so forgive me if this is really basic stuff. To get more than a taste, I'm trying to implement in Haskell some of the algorithmic work I'm doing. I have a simple module Interval that implements intervals on the line. It contains the type
data Interval t = Interval t t
the helper function
makeInterval :: (Ord t) => t -> t -> Interval t
makeInterval l r | l <= r    = Interval l r
                 | otherwise = error "bad interval"
and some utility functions about intervals.
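One of those utilities, precedes, is used below in makeMInterval; for this post, assume a definition along these lines (it is not shown in the module): one interval precedes another if it ends strictly before the other begins.
precedes :: (Ord t) => Interval t -> Interval t -> Bool
precedes (Interval _ r) (Interval l' _) = r < l'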
Here, my interest lies in multidimensional intervals (d-intervals), those objects that are composed of d intervals. I want to separately consider d-intervals that are the union of d disjoint intervals on the line (multiple interval) from those that are the union of d intervals on d separate lines (track interval). With distinct algorithmic treatments in mind, I think it would be nice to have two distinct types (even if both are lists of intervals here) such as
import qualified Interval as I
-- Multiple interval
newtype MInterval t = MInterval [I.Interval t]
-- Track interval
newtype TInterval t = TInterval [I.Interval t]
to allow for distinct sanity checks, e.g.
makeMInterval :: (Ord t) => [I.Interval t] -> MInterval t
makeMInterval is = if foldr (&&) True [I.precedes i i' | (i, i') <- zip is (tail is)]
                     then MInterval is
                     else error "bad multiple interval"
makeTInterval :: (Ord t) => [I.Interval t] -> TInterval t
makeTInterval = TInterval
I now get to the point, at last! But some functions are naturally concerned with both multiple intervals and track intervals. For example, a function order would return the number of intervals in a multiple interval or a track interval. What can I do? Adding
-- Dimensional interval
data DInterval t = MIntervalStuff (MInterval t) | TIntervalStuff (TInterval t)
does not help much, since, if I understand well (correct me if I'm wrong), I would have to write
order :: DInterval t -> Int
order (MIntervalStuff (MInterval is)) = length is
order (TIntervalStuff (TInterval is)) = length is
and call order as order (MIntervalStuff is) or order (TIntervalStuff is) when is is an MInterval or a TInterval. Not that great; it looks odd. Nor do I want to duplicate the function (I have many functions concerned with both multiple and track intervals, and some other d-interval definitions, such as equal-length multiple and track intervals).
I'm left with the feeling that I'm completely wrong and have missed some important point about types in Haskell (and/or can't forget enough here about OO programming). So, quite a newbie question, what would be the best way in Haskell to deal with such a situation? Do I have to forget about introducing MInterval and TInterval and go with one type only?
Thanks a lot for your help,
Garulfo

Edit: this is the same idea as sclv's answer; his links provide more information on this technique.
What about this approach?
data MInterval = MInterval --multiple interval
data TInterval = TInterval --track interval
data DInterval s t = DInterval [I.Interval t]
makeMInterval :: (Ord t) => [I.Interval t] -> Maybe (DInterval MInterval t)
makeMInterval is = if foldr (&&) True [I.precedes i i' | (i, i') <- zip is (tail is)]
                     then Just (DInterval is)
                     else Nothing
order :: DInterval s t -> Int
order (DInterval is) = length is
equalOrder :: DInterval s1 t -> DInterval s2 t -> Bool
equalOrder i1 i2 = order i1 == order i2
addToMInterval :: DInterval MInterval t -> I.Interval t -> Maybe (DInterval MInterval t)
addToMInterval = ..
Here the type DInterval represents multi-dimension intervals, but it takes an extra type parameter as a phantom type. This extra type information allows the typechecker to differentiate between different kinds of intervals even though they have exactly the same representation.
You get the type safety of your original design, but all your structures share the same implementation. Crucially, when the type of the interval doesn't matter, you can leave it unspecified.
I also changed the implementation of your makeMInterval function; returning a Maybe for a function like this is much more idiomatic than calling error.
More explanation on Maybe:
Let's examine your function makeInterval. This function is supposed to take a list of Intervals and, if they meet the criteria, return a multiple interval, otherwise a track interval. This explanation leads to the type:
makeInterval :: (Ord t)
             => [I.Interval t]
             -> Either (DInterval TInterval t) (DInterval MInterval t)
Now that we have the type, how can we implement it? We'd like to re-use our makeMInterval function.
makeInterval is = maybe
                    (Left (DInterval is))
                    Right
                    (makeMInterval is)
The function maybe takes three arguments: a default b to use if the Maybe is Nothing, a function a -> b if the Maybe is Just a, and a Maybe a. It returns either the default or the result of applying the function to the Maybe value.
Our default is the track interval, so we create a Left track interval for the first argument. If the maybe is Just (DInterval MInterval t), the multiple interval already exists so all that's necessary is to stick it into the Right side of the either. Finally, makeMInterval is used to create the multiple interval.
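A small illustration of the payoff, assuming the definitions above (checksOut is just a throwaway name): equalOrder accepts any combination of tags, while addToMInterval only accepts the MInterval-tagged type, and the compiler enforces the difference.
checksOut :: DInterval MInterval Int -> DInterval TInterval Int -> I.Interval Int
          -> (Bool, Maybe (DInterval MInterval Int))
checksOut mi ti i = (equalOrder mi ti, addToMInterval mi i)
-- Passing ti to addToMInterval instead of mi would be rejected at compile time.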

What you want are types with the same structure and common operations, but which may be distinguished, and for which certain operations only make sense on one or another type. The idiom for this in Haskell is Phantom Types.
http://www.haskell.org/haskellwiki/Phantom_type
http://www.haskell.org/haskellwiki/Wrapper_types
http://blog.malde.org/index.php/2009/05/14/using-a-phantom-type-to-label-different-kinds-of-sequences/
http://neilmitchell.blogspot.com/2007/04/phantom-types-for-real-problems.html

What you want is a typeclass. Typeclasses are how overloading is done in Haskell. A typeclass describes a common structure shared by many types. In your case you want to create a class of all types that have an order function:
class Dimensional a where
  order :: a -> Int
This is saying there's a class of types a called Dimensional, and for each type belonging to this class there's a function a -> Int called order. Now you need to declare all your types as belonging to the class:
instance Dimensional (MInterval t) where
  order (MInterval is) = length is
and so on. You can also declare functions that act on things that are from a given class, this way:
isOneDimensional :: (Dimensional a) => a -> Bool
isOneDimensional interval = (order interval == 1)
The type declaration can be read "for all types a that belong to the typeclass Dimensional, this function takes an a and returns a Bool".
I like to think about classes as algebraic structures in mathematics (it's not quite the same thing, since you can't enforce some logical constraints, but still): you state the requirements for a type to belong to some class (in the analogy, the operations that must be defined over a set for it to belong to a certain algebraic structure), and then, for each specific type, you state the specific implementation of the required functions (in the analogy, what the operations are for each specific set).
Take, for example, a group. Groups must have an identity, an inverse for each element and a multiplication:
class Group a where
  identity :: a
  inverse :: a -> a
  (<*>) :: a -> a -> a
and Integers form a group under addition:
instance Group Integer where
  identity = 0
  inverse x = (- x)
  x <*> y = x + y
Note that this will not remove the repetition, as you still have to write an instance for each type.
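Concretely, for the question's two wrappers the second instance is just as mechanical as the first, which is exactly the repetition being pointed out:
instance Dimensional (TInterval t) where
  order (TInterval is) = length is
-- Both wrappers now go through the same overloaded name, e.g.
--   order (makeMInterval someIntervals) and order (makeTInterval someIntervals)
-- where someIntervals stands for any suitable list of intervals.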

Consider the use of typeclasses to show similarities between different types. See the Haskell tutorial [1] about classes and overloading for help.
[1]: http://www.haskell.org/tutorial/classes.html

Related

Writing modules in Haskell the right way

(I'm totally rewriting this question to give it a better focus; you can see the history of changes if you want to see the original.)
Let's say I have two modules:
One module defines the function inverseAndSqrt. What this function actually does is not important; what is important is that it returns none, one, or both of two things in a way that the client can distinguish which one is which;
module Module1 (inverseAndSqrt) where
type TwoOpts a = (Maybe a, Maybe a)
inverseAndSqrt :: Int -> TwoOpts Float
inverseAndSqrt x = (if x /= 0 then Just (1.0 / fromIntegral x) else Nothing,
                    if x >= 0 then Just (sqrt $ fromIntegral x) else Nothing)
another module defines other functions depending on inverseAndSqrt and on its type
module Module2 where
import Module1
fun :: (Maybe Float, Maybe Float) -> Float
fun (Just x, Just y) = x + y
fun (Just x, Nothing) = x
fun (Nothing, Just y) = y
exportedFun :: Int -> Float
exportedFun = fun . inverseAndSqrt
What I want to understand from the perspective of design principles is: how should I interface Module1 with other modules (e.g. Module2) in a way that makes it well encapsulated, reusable, etc.?
The problems I see are
I could one day decide that I don't want to use a pair to return the two results anymore; I could decide to use a 2-element list, or another type which is isomorphic (I think that's the right adjective, isn't it?) to a pair; if I do this, all client code will break
Exporting the TwoOpts type synonym doesn't solve anything, as Module1 could still change its implementation thus breaking client code.
Module1 is also forcing the type of the two optionals to be the same, but I'm not sure this is really relevant to this question...
How should I design Module1 (and thus edit Module2 as well) such that the two are not tightly coupled?
One thing I can think of is that maybe I should define a typeclass expressing what "a box with two optional things in it" is, and then Module1 and Module2 would use that as a common interface. But should that be in both modules? In either of them? Or in none of them, in a third module? Or maybe such a class/concept is not needed?
I'm not a computer scientist so I'm sure that this question highlights some misunderstanding of mine due to lack of experience and theoretical background. Any help filling the gaps is welcome.
Possible modifications I'd like to support
Related to what chepner suggested in a comment to his answer, at some point I might want to extend the support from 2-tuple things to both 2- and 3-tuple things, having different accessor names for them, such as get1of2/get2of2 (let's say these are the names we use when we first design Module1) vs get1of3/get2of3/get3of3.
At some point I might also want to complement this 2-tuple-like type with something else, for instance an optional containing Just the sum¹ of the two main contents only if they are both Justs, or a Nothing if at least one of the two main contents is a Nothing. I guess in this case the internal representation of this class would be something like ((Maybe a, Maybe a), Maybe b). (¹ The sum is really a stupid example, so I've used b here instead of a to be more general than the sum would require.)
To me, Haskell design is all type-centric. The design rule for functions is just "use the most general and accurate types that do the job", and the whole problem of design in Haskell is about coming up with the best types for the job.
We would like there to be no "junk" in the types, so that they have exactly one representation for each value you want to denote. E.g. String is a bad representation for numbers, because "0", "0.0", "-0" all mean the same thing, and also because "The Prisoner" is not a number -- it is a valid representation that does not have a valid denotation. If, say for performance reasons, the same denotation can be represented multiple ways, the type's API should make that difference invisible to the user.
So in your case, (Maybe a, Maybe a) is perfect -- it means exactly what you need it to mean. Using something more complicated is unnecessary, and will just complicate matters for the user. At some point whatever you expose will have to be convertible to a Maybe a for the first thing and a Maybe a for the second thing, and there is no extra information than that, so the tuple is perfect. Whether you use a type synonym or not is a matter of style -- I prefer not use synonyms at all and only give types names when I have a more formal abstraction in mind.
Connotation is important. For example, if I had a function for finding the roots of a quadratic polynomial, I probably wouldn't use TwoOpts, even though there are at most two of them. The fact that my return values are all "the same kind of thing" in an intuitive sense makes me prefer a list (or if I'm feeling particularly picky, a Set or Bag), even if the list has at most two elements. I just have it match my best understanding of the domain at the time, so I won't change it unless my understanding of the domain has changed in a significant way, in which case the opportunity to review all its uses is exactly what I want. If you are writing your functions to be as polymorphic as possible, then often you won't need to change anything but the specific moments the meaning is used, the exact moment domain knowledge is required (such as understanding the relationship between TwoOpts and Set). You don't need to "redo the plumbing" if it's made of a sufficiently flexible, polymorphic material.
Suppose you didn't have a clean isomorphism to a standard type like (Maybe a, Maybe a), and you wanted to formalize TwoOpts. The way to do that is to build an API out of its constructors, combinators, and eliminators. For example:
data TwoOpts a -- abstract, not exposed
-- constructors
none :: TwoOpts a
justLeft :: a -> TwoOpts a
justRight :: a -> TwoOpts a
both :: a -> a -> TwoOpts a
-- combinators
-- Semigroup and Monoid at least
swap :: TwoOpts a -> TwoOpts a
-- eliminators
getLeft :: TwoOpts a -> Maybe a
getRight :: TwoOpts a -> Maybe a
In this case the eliminators give exactly your representation (Maybe a, Maybe a) as their final coalgebra.
-- same as the tuple in a newtype, just more conventional
data TwoOpts a = TwoOpts (Maybe a) (Maybe a)
Or if you wanted to focus on the constructors side you could use an initial algebra
data TwoOpts a
= None
| JustLeft a
| JustRight a
| Both a a
You are at liberty to change this representation as long as it still implements the combinator API above. If you have reason to use different representations of the same API, make the API into a typeclass (typeclass design is a whole other story).
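Such a typeclass might look like this minimal sketch (the names TwoOptsLike and TwoOptsT are made up for illustration):
class TwoOptsLike f where
  none      :: f a
  justLeft  :: a -> f a
  justRight :: a -> f a
  both      :: a -> a -> f a
  getLeft   :: f a -> Maybe a
  getRight  :: f a -> Maybe a

-- The tuple-backed representation is one instance of the API:
newtype TwoOptsT a = TwoOptsT (Maybe a, Maybe a)

instance TwoOptsLike TwoOptsT where
  none        = TwoOptsT (Nothing, Nothing)
  justLeft a  = TwoOptsT (Just a, Nothing)
  justRight b = TwoOptsT (Nothing, Just b)
  both a b    = TwoOptsT (Just a, Just b)
  getLeft  (TwoOptsT (l, _)) = l
  getRight (TwoOptsT (_, r)) = r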
In Einstein's famous words, "make it as simple as possible, but no simpler".
Don't define a simple type alias; this exposes the details of how you implement TwoOpts.
Instead, define a new type, but export accessor functions for its two components rather than the data constructor. Then you are free to change the implementation of the type all you like without changing the interface, because the user can't pattern-match on a value of type TwoOpts a.
module Module1 (TwoOpts, inverseAndSqrt, getFirstOpt, getSecondOpt) where
data TwoOpts a = TwoOpts (Maybe a) (Maybe a)
getFirstOpt, getSecondOpt :: TwoOpts a -> Maybe a
getFirstOpt (TwoOpts a _) = a
getSecondOpt (TwoOpts _ b) = b
inverseAndSqrt :: Int -> TwoOpts Float
inverseAndSqrt x = TwoOpts (safeInverse x) (safeSqrt x)
  where safeInverse 0 = Nothing
        safeInverse x = Just (1.0 / fromIntegral x)
        safeSqrt x | x >= 0    = Just $ sqrt $ fromIntegral x
                   | otherwise = Nothing
and
module Module2 where
import Module1
fun :: TwoOpts Float -> Float
fun a = case (getFirstOpt a, getSecondOpt a) of
  (Just x, Just y) -> x + y
  (Just x, Nothing) -> x
  (Nothing, Just y) -> y
exportedFun :: Int -> Float
exportedFun = fun . inverseAndSqrt
Later, when you realize that you've reimplemented the type product, you can change your definitions without affecting any user code.
newtype TwoOpts a = TwoOpts { getOpts :: (Maybe a, Maybe a) }
getFirstOpt, getSecondOpt :: TwoOpts a -> Maybe a
getFirstOpt = fst . getOpts
getSecondOpt = snd . getOpts

Design pattern for subclass-like structure

My goal is to represent a set of types with similar behaviour in an elegant and performant manner. To achieve this, I have created a solution that utilises a single type, followed by a set of functions that perform pattern matching.
My first question is: is there a way how to represent the same ideas using a single type-class and instead of having a constructor per each variation to have a type that implements said type-class?
Which of the two approaches below is:
- a better recognised design pattern in Haskell?
- more memory efficient?
- more performant?
- more elegant and why?
- easier to use for consumers of the code?
Approach 1: Single type and pattern matching
Suppose there is a following structure:
data Aggregate a
= Average <some necessary state keeping>
| Variance <some necessary state keeping>
| Quantile a <some necessary state keeping>
Its constructors are not public, as that would expose the internal state keeping. Instead, a set of constructor functions exists:
newAverage :: Floating a
           => Aggregate a
newAverage = Average ...

newVariance :: Floating a
            => Aggregate a
newVariance = Variance ...

newQuantile :: Floating a
            => a            -- ! important, a parameter to the function
            -> Aggregate a
newQuantile p = Quantile p ...
Once the object is created, we can perform two functions: put values into it, and once we are satisfied, we can get the current value:
get :: Floating a
    => Aggregate a
    -> Maybe a
get (Average <state>)    = getAverage <state>
get (Variance <state>)   = getVariance <state>
get (Quantile _ <state>) = getQuantile <state>

put :: Floating a
    => a
    -> Aggregate a
    -> Aggregate a
put newVal (Average <state>)    = putAverage newVal <state>
put newVal (Variance <state>)   = putVariance newVal <state>
put newVal (Quantile p <state>) = putQuantile newVal p <state>
Approach 2: Type-classes and instances
class Aggregate a where
  new :: Floating f => a f
  get :: Floating f => a f -> Maybe f
  put :: Floating f => f -> a f -> a f
data Average a = Average Word64 a
data Variance a ...
instance Aggregate Average where
instance Aggregate Variance where
instance Aggregate Quantile where
The obvious problem here is the fact that new is not parametric and thus Quantile can't be initialised with the p parameter. Adding a parameter to new is possible, but it would result in all other non-parametric constructors ignoring the value, which is not a good design.
You are missing the "codata" encoding, which sounds like it might be the best fit for your problem.
data Aggregate a = Aggregate
  { get :: Maybe a
  , put :: a -> Aggregate a
  }

-- Use the closure to keep track of local state.
newAverage :: (Floating a) => Aggregate a
newAverage = Aggregate { get = Nothing, put = go 0 0 }
  where
    go n total x = Aggregate { get = Just ((total + x) / (n + 1))
                             , put = go (n + 1) (total + x)
                             }

-- Parameters are not a problem.
newQuantile :: (Floating a) => a -> Aggregate a
newQuantile p = Aggregate { get = ..., put = \x -> ... }
...
For some reason this approach always slips under the radar of people with OO backgrounds, which is strange because it is a pretty close match to that paradigm.
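A quick sanity check of newAverage, assuming the definitions above (each put returns a new aggregate closing over the updated count and running total):
demoAverage :: Maybe Double
demoAverage = get (put (put newAverage 1) 3)   -- Just 2.0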
It's hard to give a general recommendation. I tend to prefer approach 1. Note that you could use
data Aggregate a
  = Average AverageState
  | Variance VarianceState
  | Quantile a QuantileState
and export every constructor above, keeping only the ...State types private to the module.
This might be feasible in some contexts, but not in others, so it has to be evaluated on a case by case basis.
About approach 2, this could be more convenient if you have many constructors / types around. To fix the new problem, one could use type families (or fundeps) as in
class Floating f => Aggregate a f where
  type AggregateNew a f
  new :: AggregateNew a f -> a f
  get :: a f -> Maybe f
  put :: ...

instance Floating f => Aggregate Average f where
  type AggregateNew Average f = ()
  new () = ...

instance Floating f => Aggregate Quantile f where
  type AggregateNew Quantile f = f
  new x = ...
The naming above is horrible, but I used it to make the point. new takes an argument of type AggregateNew a f, which can be () if new needs no information, or some more informative type when it is needed, like f for creating a Quantile.
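Assuming suitable instance bodies (they are elided above) and a Quantile datatype like the one sketched in the question, construction would then look roughly like:
someAverage :: Average Double
someAverage = new ()      -- Average needs no construction argument

someQuantile :: Quantile Double
someQuantile = new 0.95   -- Quantile receives its parameter through AggregateNew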
There is a third approach for defining “aggregators” that neither requires an inextensible sum type nor multiple datatypes + a typeclass.
Approach 3: Single-constructor datatype that puts state behind an existential
Consider this type:
{-# LANGUAGE ExistentialQuantification #-}
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)
It represents an aggregator that ingests values of type a and eventually "returns" a value of type b while carrying an internal state x.
The constructor has type (x -> a -> x) -> x -> (x -> b) -> Fold a b. It takes a step function, an initial state and a final "summary" function. Notice that the state is decoupled from the return value b. They can be the same, but it's not required.
Also, the state is existentially quantified. We know the type of the state when we create a Fold, and we can work with it when we pattern-match on the Fold—in order to feed it data through the step function—but it's not reflected in the Fold's type. We can put Fold values with different inner states in the same container without problems, as long as they ingest and return the same types.
This pattern is sometimes called a “beautiful fold”. There is a library called foldl that is based on it, and provides many premade folds and utility functions.
The Fold type in foldl has many useful instances. In particular, the Applicative instance lets us create composite folds that still traverse the input data a single time, instead of requiring multiple passes.
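For a concrete feel of that single-pass composition, here is the standard mean example with the foldl package: the sum and the length are accumulated side by side in one traversal, then combined at the end.
import qualified Control.Foldl as L

mean :: [Double] -> Double
mean = L.fold ((/) <$> L.sum <*> L.genericLength)
-- mean [1, 2, 3, 4] == 2.5 (as with any mean, the empty list divides by zero)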

Type variables in function signature

If I do the following
functionS (x,y) = y
:t functionS
functionS :: (a, b) -> b
Now with this function:
functionC x y = if (x > y) then True else False
:t functionC
I would expect to get:
functionC :: (Ord a, Ord b) => a -> b -> Bool
But I get:
functionC :: Ord a => a -> a -> Bool
GHCi seems to be OK with the two previous results, but why does it give me the second? Why aren't both type variables a and b there?
I think you might be misreading type signatures, through no fault of your own: the examples you are using to inform your thinking are kind of confusing. In particular, in your tuple example
functionS :: (a,b) -> b
functionS (x,y) = y
The notation (_,_) means two different things. In the first line, (a,b) refers to a type, the type of pairs whose first element has type a and second has type b. In the second line, (x,y) refers to a specific pair, where x has type a and y has type b. While this "pun" provides a useful mnemonic, it can be confusing while you are first getting the hang of it. I would rather the type of pairs were written with a regular type constructor:
functionS :: Pair a b -> b
functionS (x,y) = y
So, moving on to your question. In the signature you are given
functionC :: Ord a => a -> a -> Bool
a is a type. Ord a says that elements of the type a are orderable with respect to each other. The function takes two arguments of the same type. Some types that are orderable are Integer (numerically), String (lexicographically), and a bunch of others. That means that you can tell which of two Integers is the smaller, or which of two Strings is the smaller. However, we don't necessarily know how to tell whether an Integer is smaller than a String (and this is good! Have you seen what kinds of shenanigans JavaScript has to do to support untyped equality? Haskell doesn't have to solve this problem at all!). So that's what this signature is saying: there is only one single orderable type, a, and the function takes two elements of this same type.
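You can see where the single constraint comes from by asking GHCi for the type of the comparison used in the body:
Prelude> :t (>)
(>) :: Ord a => a -> a -> Bool
Both operands of (>) must have the same type, so x and y in functionC are forced to share one type variable.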
You might still be wondering why functionS's signature has two different type variables. It's because there is no constraint confining them to be the same, such as having to order them against each other. functionS works equally well with a pair where both components are integers as when one is an integer and the other is a string. It doesn't matter. And Haskell always picks the most general type that works. So if they are not forced to be the same, they will be different.
There are more technical ways to explain all this, but I felt an intuitive explanation was in order. I hope it's helpful!

Are there useful applications for the Divisible Type Class?

I've lately been working on an API in Elm where one of the main types is contravariant. So, I've googled around to see what one can do with contravariant types and found that the Contravariant package in Haskell defines the Divisible type class.
It is defined as follows:
class Contravariant f => Divisible f where
divide :: (a -> (b, c)) -> f b -> f c -> f a
conquer :: f a
It turns out that my particular type does suit the definition of the Divisible type class. While Elm does not support type classes, I do look at Haskell from time to time for some inspiration.
My question: Are there any practical uses for this type class? Are there known APIs out there in Haskell (or other languages) that benefit from this divide-conquer pattern? Are there any gotchas I should be aware of?
Thank you very much for your help.
One example:
Applicative is useful for parsing, because you can turn Applicative parsers of parts into a parser of wholes, needing only a pure function for combining the parts into a whole.
Divisible is useful for serializing (should we call this coparsing now?), because you can turn Divisible serializers of parts into a serializer of wholes, needing only a pure function for splitting the whole into parts.
I haven't actually seen a project that worked this way, but I'm (slowly) working on an Avro implementation for Haskell that does.
When I first came across Divisible I wanted it for divide, and had no idea what possible use conquer could be other than cheating (an f a out of nowhere, for any a?). But to make the Divisible laws check out for my serializers conquer became a "serializer" that encodes anything to zero bytes, which makes a lot of sense.
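To make the serializer idea concrete, here is a toy sketch (not code from the actual Avro work), using the Divisible class from the contravariant package:
import Data.Functor.Contravariant (Contravariant (..))
import Data.Functor.Contravariant.Divisible (Divisible (..))

newtype Serializer a = Serializer { runSerializer :: a -> String }

instance Contravariant Serializer where
  contramap f (Serializer g) = Serializer (g . f)

instance Divisible Serializer where
  -- "encode anything to zero bytes", as described above
  conquer = Serializer (const "")
  -- split the whole into parts, serialize each part, concatenate
  divide split (Serializer sb) (Serializer sc) =
    Serializer (\a -> let (b, c) = split a in sb b ++ sc c)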
Here's a possible use case.
In streaming libraries, one can have fold-like constructs like the ones from the foldl package, that are fed a sequence of inputs and return a summary value when the sequence is exhausted.
These folds are contravariant on their inputs, and can be made Divisible. This means that if you have a stream of elements where each element can be somehow decomposed into b and c parts, and you also happen to have a fold that consumes bs and another fold that consumes cs, then you can build a fold that consumes the original stream.
The actual folds from foldl don't implement Divisible, but they could, using a newtype wrapper. In my process-streaming package I have a fold-like type that does implement Divisible.
divide requires the return values of the constituent folds to be of the same type, and that type must be an instance of Monoid. If the folds return different, unrelated monoids, a workaround is to put each return value in a separate field of a tuple, leaving the other field as mempty. This works because a tuple of monoids is itself a Monoid.
I'll examine the example of the core data types in Fritz Henglein's generalized radix sort techniques as implemented by Edward Kmett in the discrimination package.
While there's a great deal going on there, it largely focuses around a type like this
data Group a = Group (forall b . [(a, b)] -> [[b]])
If you have a value of type Group a you essentially must have an equivalence relationship on a because if I give you an association between as and some type b completely unknown to you then you can give me "groupings" of b.
groupId :: Group a -> [a] -> [[a]]
groupId (Group grouper) = grouper . map (\a -> (a, a))
You can see this as a core type for writing a utility library of groupings. For instance, we might want to know that if we can Group a and Group b then we can Group (a, b) (more on this in a second). Henglein's core idea is that if you can start with some basic Groups on integers—we can write very fast Group Int32 implementations via radix sort—and then use combinators to extend them over all types, you will have generalized radix sort to algebraic data types.
So how might we build our combinator library?
Well, f :: Group a -> Group b -> Group (a, b) is pretty important in that it lets us make groups of product-like types. Normally, we'd get this from Applicative and liftA2, but Group, you'll notice, is Contravariant, not a Functor.
So instead we use Divisible
divided :: Group a -> Group b -> Group (a, b)
Notice that this arises in a strange way from
divide :: (a -> (b, c)) -> Group b -> Group c -> Group a
as it has the typical "reversed arrow" character of contravariant things. We can now understand things like divide and conquer in terms of their interpretation on Group.
Divide says that if I want to build a strategy for equating as using strategies for equating bs and cs, I can do the following for any type x:
- Take your partial relation [(a, x)] and map over it with a function f :: a -> (b, c), plus a little tuple manipulation, to get a new relation [(b, (c, x))].
- Use my Group b to discriminate [(b, (c, x))] into [[(c, x)]].
- Use my Group c to discriminate each [(c, x)] into [[x]], giving me [[[x]]].
- Flatten the inner layers to get [[x]] like we need.
instance Divisible Group where
  conquer = Group $ return . fmap snd
  divide k (Group l) (Group r) = Group $ \xs ->
    -- a bit more cleverly done here...
    l [ (b, (c, d)) | (a, d) <- xs, let (b, c) = k a ] >>= r
We also get interpretations of the more tricky Decidable refinement of Divisible
class Divisible f => Decidable f where
  lose :: (a -> Void) -> f a
  choose :: (a -> Either b c) -> f b -> f c -> f a

instance Decidable Group where
  lose :: (a -> Void) -> Group a
  choose :: (a -> Either b c) -> Group b -> Group c -> Group a
These read as saying that for any type a of which we can guarantee there are no values (we cannot produce values of Void by any means; a function a -> Void is a means of producing Void given an a, so we must not be able to produce values of a by any means either!), we immediately get a grouping of zero values:
lose _ = Group (\_ -> [])
We can also play a similar game to divide above, except that instead of sequencing our use of the input discriminators, we alternate between them.
Using these techniques we build up a library of "Groupable" things, namely Grouping
class Grouping a where
  grouping :: Group a
and note that nearly all the definitions arise from the basic definition atop groupingNat, which uses fast monadic vector manipulations to achieve an efficient radix sort.

Typed FP: Tuple Arguments and Curriable Arguments

In statically typed functional programming languages, like Standard ML, F#, OCaml and Haskell, a function will usually be written with the parameters separated from each other and from the function name simply by whitespace:
let add a b =
a + b
The type here is "int -> (int -> int)", i.e. a function that takes an int and returns a function which in turn takes an int and finally returns an int. Thus currying becomes possible.
It's also possible to define a similar function that takes a tuple as an argument:
let add(a, b) =
a + b
The type becomes "(int * int) -> int" in this case.
From the point of view of language design, is there any reason why one could not simply identify these two type patterns in the type algebra? In other words, so that "(a * b) -> c" reduces to "a -> (b -> c)", allowing both variants to be curried with equal ease.
I assume this question must have cropped up when languages like the four I mentioned were designed. So does anyone know any reason or research indicating why all four of these languages chose not to "unify" these two type patterns?
I think the consensus today is to handle multiple arguments by currying (the a -> b -> c form) and to reserve tuples for when you genuinely want values of tuple type (in lists and so on). This consensus is respected by every statically typed functional language since Standard ML, which (purely as a matter of convention) uses tuples for functions that take multiple arguments.
Why is this so? Standard ML is the oldest of these languages, and when people were first writing ML compilers, it was not known how to handle curried arguments efficiently. (At the root of the problem is the fact that any curried function could be partially applied by some other code you haven't seen yet, and you have to compile it with that possibility in mind.) Since Standard ML was designed, compilers have improved, and you can read about the state of the art in an excellent paper by Simon Marlow and Simon Peyton Jones.
LISP, which is the oldest functional language of them all, has a concrete syntax which is extremely unfriendly to currying and partial application. Hrmph.
At least one reason not to conflate curried and uncurried functional types is that it would be very confusing when tuples are used as return values (a convenient way in these typed languages to return multiple values). In the conflated type system, how can functions remain nicely composable? Would a -> (a * a) also transform to a -> a -> a? If so, are (a * a) -> a and a -> (a * a) the same type? If not, how would you compose a -> (a * a) with (a * a) -> a?
You propose an interesting solution to the composition issue. However, I don't find it satisfying because it doesn't mesh well with partial application, which is really a key convenience of curried functions. Let me attempt to illustrate the problem in Haskell:
apply f a b = f a b
vecSum (a1,a2) (b1,b2) = (a1+b1,a2+b2)
Now, perhaps your solution could understand map (vecSum (1,1)) [(0,1)], but what about the equivalent map (apply vecSum (1,1)) [(0,1)]? It becomes complicated! Does your fullest unpacking mean that the (1,1) is unpacked with apply's a & b arguments? That's not the intent... and in any case, reasoning becomes complicated.
In short, I think it would be very hard to conflate curried and uncurried functions while (1) preserving the semantics of code valid under the old system and (2) providing a reasonable intuition and algorithm for the conflated system. It's an interesting problem, though.
Possibly because it is useful to have a tuple as a separate type, and it is good to keep different types separate?
In Haskell at least, it is quite easy to go from one form to the other:
Prelude> let add1 a b = a+b
Prelude> let add2 (a,b) = a+b
Prelude> :t (uncurry add1)
(uncurry add1) :: (Num a) => (a, a) -> a
Prelude> :t (curry add2)
(curry add2) :: (Num a) => a -> a -> a
so uncurry add1 is the same as add2 and curry add2 is the same as add1. I guess similar things are possible in the other languages as well. What greater gains would there be from automatically currying every function defined on a tuple? (Since that's what you seem to be asking.)
Expanding on the comments under namin's good answer:
So assume a function which has type 'a -> ('a * 'a):
let gimme_tuple(a : int) =
(a*2, a*3)
Then assume a function which has type ('a * 'a) -> 'b:
let add(a : int, b) =
a + b
Then composition (assuming the conflation that I propose) wouldn't pose any problem as far as I can see:
let foo = add(gimme_tuple(5))
// foo gets value 5*2 + 5*3 = 25
But then you could conceive of a polymorphic function that takes the place of add in the last code snippet, for instance a little function that just takes a 2-tuple and makes a list of the two elements:
let gimme_list(a, b) =
[a, b]
This would have the type ('a * 'a) -> ('a list). Composition now would be problematic. Consider:
let bar = gimme_list(gimme_tuple(5))
Would bar now have the value [10, 15] : int list or would bar be a function of type (int * int) -> ((int * int) list), which would eventually return a list whose head would be the tuple (10, 15)? For this to work, I posited in a comment to namin's answer that one would need an additional rule in the type system that the binding of actual to formal parameters be the "fullest possible", i.e. that the system should attempt a non-partial binding first and only try a partial binding if a full binding is unattainable. In our example, it would mean that we get the value [10, 15] because a full binding is possible in this case.
Is such a concept of "fullest possible" inherently meaningless? I don't know, but I can't immediately see a reason it would be.
One problem, of course, is that if you want the second interpretation of the last code snippet, you'd need to jump through an extra hoop (typically an anonymous function) to get it.
Tuples are not present in these languages simply to be used as function parameters. They are a really convenient way to represent structured data, e.g., a 2D point (int * int), a list element ('a * 'a list), or a tree node ('a * 'a tree * 'a tree). Remember also that structures are just syntactic sugar for tuples. I.e.,
type info =
{ name : string;
age : int;
address : string; }
is a convenient way of handling a (string * int * string) tuple.
There's no fundamental reason you need tuples in a programming language (you can build up tuples in the lambda calculus, just as you can booleans and integers—indeed using currying*), but they are convenient and useful.
* E.g.,
tuple a b = λf.f a b
fst x = x (λa.λb.a)
snd x = x (λa.λb.b)
curry f = λa.λb.f (λg.g a b)
uncurry f = λx.x f
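The same encodings transliterate directly into Haskell, if you want to check them with the typechecker (names carry a C suffix here to avoid clashing with the Prelude):
tupleC :: a -> b -> (a -> b -> c) -> c
tupleC a b = \f -> f a b

fstC :: ((a -> b -> a) -> c) -> c
fstC x = x (\a _ -> a)

sndC :: ((a -> b -> b) -> c) -> c
sndC x = x (\_ b -> b)
-- fstC (tupleC 1 2) == 1, sndC (tupleC 1 2) == 2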
