In Haskell we can flatten a list of lists (see "Flatten a list of lists").
For simple cases of tuples, I can see how we would flatten certain tuples, as in the following examples:
flatten :: (a, (b, c)) -> (a, b, c)
flatten x = (fst x, fst(snd x), snd(snd x))
flatten2 :: ((a, b), c) -> (a, b, c)
flatten2 x = (fst(fst x), snd(fst x), snd x)
However, I'm after a function that accepts as input any nested tuple and which flattens that tuple.
Can such a function be created in Haskell?
If one cannot be created, why is this the case?
No, it's not really possible. There are two hurdles to clear.
The first is that all the different sizes of tuples are different type constructors. (,) and (,,) are not really related to each other at all, except in that they happen to be spelled with a similar sequence of characters. Since there are infinitely many such constructors in Haskell, having a function which did something interesting for all of them would require a typeclass with infinitely many instances. Whoops!
The second is that there are some very natural expectations we naively have about such a function, and these expectations conflict with each other. Suppose we managed to create such a function, named flatten. Any one of the following chunks of code seems very natural at first glance, if taken in isolation:
flattenA :: ((Int, Bool), Char) -> (Int, Bool, Char)
flattenA = flatten
flattenB :: ((a, b), c) -> (a, b, c)
flattenB = flatten
flattenC :: ((Int, Bool), (Char, String)) -> (Int, Bool, Char, String)
flattenC = flatten
But taken together, they seem a bit problematic: flattenB = flatten can't possibly be type-correct if both flattenA and flattenC are! Both of the input types for flattenA and flattenC unify with the input type to flattenB -- they are both pairs whose first component is itself a pair -- but flattenA and flattenC return outputs with differing numbers of components. In short, the core problem is that when we write (a, b), we don't yet know whether a or b is itself a tuple and should be "recursively" flattened.
With sufficient effort, it is possible to do enough type-level programming to put together something that sometimes works on limited-size tuples. But it is 1. a lot of up-front effort, 2. very little long-term programming efficiency payoff, and 3. even at use sites requires a fair amount of boilerplate. That's a bad combo; if there's use-site boilerplate, then you might as well just write the function you cared about in the first place, since it's generally so short to do so anyway.
I'm new to Haskell, I have a question regarding tuples. Is there not a way to traverse a tuple? I understand that traversal is very easy with lists but if the input is given as a tuple is there not a way to check the entire tuple as you do with a list? If that's not the case would it possible to just extract the values from the tuple into a list and perform traversal that way?
In Haskell, it’s not considered idiomatic (nor is it really possible) to use the tuple as a general-purpose traversable container. Any tuple you deal with is going to have a fixed number of elements, with the types of these elements also being fixed. (This is quite different from how tuples are idiomatically used in, for example, Python.) You ask about a situation where “the input is given as a tuple” but if the input is going to have a flexible number of elements then it definitely won’t be given as a tuple—a list is a much more likely choice.
This makes tuples seem less flexible than in some other languages. The upside is that you can examine them using pattern matching. For example, if you want to evaluate some predicate for each element of a tuple and return True if the predicate passes for all of them, you would write something like
all2 :: (a -> Bool) -> (a, a) -> Bool
all2 predicate (x, y) = predicate x && predicate y
Or, for three-element tuples,
all3 :: (a -> Bool) -> (a, a, a) -> Bool
all3 predicate (x, y, z) = predicate x && predicate y && predicate z
You might be thinking, “Wait, you need a separate function for each tuple size?!” Yes, you do, and you can start to see why there’s not a lot of overlap between the use cases for tuples and the use cases for lists. The advantages of tuples are exactly that they are kind of inflexible: you always know how many values they contain, and what type those values have. The former is not really true for lists.
Is there not a way to traverse a tuple?
As far as I know, there’s no built-in way to do this. It would be easy enough to write down instructions for traversing a 2-tuple, traversing a 3-tuple, and so on, but this would have the big limitation that you’d only be able to deal with tuples whose elements all have the same type.
Think about the map function as a simple example. You can apply map to a list of type [a] as long as you have a function a -> b. In this case map looks at each a value in turn, passes it to the function, and assembles the list of resulting b values. But with a tuple, you might have three elements whose values are all different types. Your function for converting as to bs isn’t sufficient if the tuple consists of two a values and a c! If you try to start writing down the Foldable instance or the Traversable instance even just for two-element tuples, you quickly realize that those typeclasses aren’t designed to handle containers whose values might have different types.
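For what it's worth, base does ship Functor, Foldable and Traversable instances for pairs, but they illustrate the limitation rather than escape it: the instances are for ((,) a), so only the second component counts as an element. A small sketch (the names pairLength, pairElems and pairMapped are made up for illustration):

```haskell
-- The Functor/Foldable instances for pairs are defined for ((,) a),
-- so the first component is treated as fixed context, not as an element.

-- Counts only the second component.
pairLength :: Int
pairLength = length ('x', True)

-- Collects only the second component.
pairElems :: [Bool]
pairElems = foldr (:) [] ('x', True)

-- Maps over only the second component; the Char is left alone.
pairMapped :: (Char, Bool)
pairMapped = fmap not ('x', True)
```

Here pairLength is 1 and pairElems is [True], which is rarely what someone asking to "traverse a tuple" has in mind.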
Would it be possible to just extract the values from the tuple into a list?
Yes, but you would need a separate function for each possible size of the input tuple. For example,
tupleToList2 :: (a, a) -> [a]
tupleToList2 (x, y) = [x, y]
tupleToList3 :: (a, a, a) -> [a]
tupleToList3 (x, y, z) = [x, y, z]
The good news, of course, is that you’re never going to get a situation where you have to deal with tuples of arbitrary size, because that isn’t a thing that can happen in Haskell. Think about the type signature of a function that accepted a tuple of any size: how could you write that?
In any situation where you’re accepting a tuple as input, it’s probably not necessary to convert the tuple to a list first, because the pattern-matching syntax means that you can just address each element of the tuple individually—and you always know exactly how many such elements there are going to be.
If your tuple is homogeneous and you don't mind using a third-party package, then lens provides functions to traverse each element of tuples of various sizes.
ghci> :m +Control.Lens
ghci> over each (*10) (1, 2, 3, 4, 5) --traverse each element
(10,20,30,40,50)
Control.Lens.Tuple provides lenses to get and set the nth element, up to the 19th.
You can explore the lens package for more information. If you want to learn the lens package, Optics By Example by Chris Penner is a good book.
I'm using esqueleto for making SQL queries, and I have one query which returns data with type (Value a, Value b, Value c). I want to extract (a, b, c) from it. I know that I can use pattern matching like that:
let (Value a, Value b, Value c) = queryResult
But I'd like to avoid repeating Value for every tuple element. This is particularly annoying when the tuple has many more elements (like 10). Is there any way to simplify this? Is there a function which I could use like that:
let (a, b, c) = someFunction queryResult
Data.Coerce from base provides coerce, which acts as your someFunction.
coerce "exchanges" newtypes for the underlying type they wrap (and vice versa). This works even if they are wrapped deeply within other types. This is also done with zero overhead, since newtypes have the exact same runtime representation as the type they wrap.
There is a little bit more complexity with type variable roles that you can read about on the Wiki page if you're interested, but an application like this turns out to be straightforward since the package uses the "default" role for Value's type variable argument.
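A minimal sketch of how that could look. To keep it compilable without the library, Value below is a hypothetical stand-in newtype with the same shape as esqueleto's; unwrap3 is also just an illustrative name. With the real library this assumes the Value constructor is in scope, which coerce requires.

```haskell
import Data.Coerce (coerce)

-- Hypothetical stand-in for esqueleto's Value newtype, so the sketch
-- compiles with base alone.
newtype Value a = Value a

-- coerce strips every Value wrapper at once; the type signature tells
-- it which type to produce.
unwrap3 :: (Value a, Value b, Value c) -> (a, b, c)
unwrap3 = coerce
```

If you call coerce directly in a let binding, the result type has to be determined by a signature or by how a, b and c are used later; otherwise the compiler may report an ambiguous type.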
The library appears to have an unValue function, so you just need to choose a way to map over arbitrary length tuples. Then someFunction can become
import Control.Lens (over, each)
someFunction = (over each) unValue
If you want to try some other ways to map tuples without a lens dependency, you could check out this question: Haskell: how to map a tuple?
edit: As danidiaz points out this only works for tuples which are max 8 fields long. I'm not sure if there's a better way to generalise it.
If your tuple has all the same element type:
all3 :: (a -> b) -> (a, a, a) -> (b, b, b)
all3 f (x, y, z) = (f x, f y, f z)
This case can be abstracted over with lenses, using over each as described in @Zpalmtree’s answer.
But if your tuple has different element types, you can make the f argument of this function polymorphic using the RankNTypes extension:
all3 :: (forall a. c a -> a) -> (c x, c y, c z) -> (x, y, z)
all3 f (x, y, z) = (f x, f y, f z)
Then assuming you have unValue :: Value a -> a, you can write:
(a, b, c) = all3 unValue queryResult
However, you would need to write separate functions all4, all5, …, all10 if you have large tuples. In that case you could cut down on the boilerplate by generating them with Template Haskell. This is part of the reason that large tuples are generally avoided in Haskell, since they’re awkward to work with and can’t be easily abstracted over.
I've lately been working on an API in Elm where one of the main types is contravariant. So, I've googled around to see what one can do with contravariant types and found that the Contravariant package in Haskell defines the Divisible type class.
It is defined as follows:
class Contravariant f => Divisible f where
    divide :: (a -> (b, c)) -> f b -> f c -> f a
    conquer :: f a
It turns out that my particular type does suit the definition of the Divisible type class. While Elm does not support type classes, I do look at Haskell from time to time for some inspiration.
My question: Are there any practical uses for this type class? Are there known APIs out there in Haskell (or other languages) that benefit from this divide-conquer pattern? Are there any gotchas I should be aware of?
Thank you very much for your help.
One example:
Applicative is useful for parsing, because you can turn Applicative parsers of parts into a parser of wholes, needing only a pure function for combining the parts into a whole.
Divisible is useful for serializing (should we call this coparsing now?), because you can turn Divisible serializers of parts into a serializer of wholes, needing only a pure function for splitting the whole into parts.
I haven't actually seen a project that worked this way, but I'm (slowly) working on an Avro implementation for Haskell that does.
When I first came across Divisible I wanted it for divide, and had no idea what possible use conquer could be other than cheating (an f a out of nowhere, for any a?). But to make the Divisible laws check out for my serializers conquer became a "serializer" that encodes anything to zero bytes, which makes a lot of sense.
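To make the serializer idea concrete, here is a rough sketch. It is not taken from my Avro code: Serializer, Person, stringS, intS and personS are all made up for illustration, the output format is arbitrary, and the Divisible class is redeclared locally so the example needs only base (Data.Functor.Contravariant).

```haskell
import Data.Functor.Contravariant (Contravariant (..))

-- Local copy of the class from the question, so only base is needed.
class Contravariant f => Divisible f where
  divide  :: (a -> (b, c)) -> f b -> f c -> f a
  conquer :: f a

-- Hypothetical serializer: turns a value into text.
newtype Serializer a = Serializer { runSerializer :: a -> String }

instance Contravariant Serializer where
  contramap f (Serializer g) = Serializer (g . f)

instance Divisible Serializer where
  -- Split the whole into parts, serialize each part, concatenate.
  divide split (Serializer sb) (Serializer sc) =
    Serializer (\a -> let (b, c) = split a in sb b ++ sc c)
  -- Encodes anything to zero characters.
  conquer = Serializer (const "")

data Person = Person { personName :: String, personAge :: Int }

-- Length-prefixed string, e.g. "3:Ada".
stringS :: Serializer String
stringS = Serializer (\s -> show (length s) ++ ":" ++ s)

-- Decimal number followed by a terminator, e.g. "36;".
intS :: Serializer Int
intS = Serializer (\n -> show n ++ ";")

-- A serializer of wholes built from serializers of parts.
personS :: Serializer Person
personS = divide (\p -> (personName p, personAge p)) stringS intS
```

With these definitions, runSerializer personS (Person "Ada" 36) yields "3:Ada36;", and dividing against conquer leaves the output unchanged.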
Here's a possible use case.
In streaming libraries, one can have fold-like constructs like the ones from the foldl package, that are fed a sequence of inputs and return a summary value when the sequence is exhausted.
These folds are contravariant on their inputs, and can be made Divisible. This means that if you have a stream of elements where each element can be somehow decomposed into b and c parts, and you also happen to have a fold that consumes bs and another fold that consumes cs, then you can build a fold that consumes the original stream.
The actual folds from foldl don't implement Divisible, but they could, using a newtype wrapper. In my process-streaming package I have a fold-like type that does implement Divisible.
divide requires the return values of the constituent folds to be of the same type, and that type must be an instance of Monoid. If the folds return different, unrelated monoids, a workaround is to put each return value in a separate field of a tuple, leaving the other field as mempty. This works because a tuple of monoids is itself a Monoid.
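A simplified sketch of the idea, not the actual type from foldl or process-streaming: here a Fold just consumes a whole list, and divideFold, sumFold, namesFold and both are illustrative names.

```haskell
import Data.Monoid (Sum (..))

-- Hypothetical, simplified fold: consumes a list and returns a summary.
newtype Fold r a = Fold { runFold :: [a] -> r }

-- divide specialised to this Fold; both folds must return the same monoid r.
divideFold :: Monoid r => (a -> (b, c)) -> Fold r b -> Fold r c -> Fold r a
divideFold split (Fold fb) (Fold fc) =
  Fold (\as -> let (bs, cs) = unzip (map split as) in fb bs <> fc cs)

-- Two folds with unrelated results, each filling one field of a pair
-- and leaving the other field as mempty.
sumFold :: Fold (Sum Int, [String]) Int
sumFold = Fold (\xs -> (Sum (sum xs), mempty))

namesFold :: Fold (Sum Int, [String]) String
namesFold = Fold (\xs -> (mempty, xs))

-- A fold over the original stream of (Int, String) elements.
both :: Fold (Sum Int, [String]) (Int, String)
both = divideFold id sumFold namesFold
```

Running runFold both [(1, "a"), (2, "b")] gives (Sum 3, ["a", "b"]): the pair's Monoid instance combines the two partial results field by field.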
I'll examine the example of the core data types in Fritz Henglein's generalized radix sort techniques as implemented by Edward Kmett in the discrimination package.
While there's a great deal going on there, it largely focuses around a type like this
data Group a = Group (forall b . [(a, b)] -> [[b]])
If you have a value of type Group a you essentially must have an equivalence relation on a, because if I give you an association between as and some type b completely unknown to you then you can give me "groupings" of b.
groupId :: Group a -> [a] -> [[a]]
groupId (Group grouper) = grouper . map (\a -> (a, a))
You can see this as a core type for writing a utility library of groupings. For instance, we might want to know that if we can Group a and Group b then we can Group (a, b) (more on this in a second). Henglein's core idea is that if you can start with some basic Groups on integers—we can write very fast Group Int32 implementations via radix sort—and then use combinators to extend them over all types then you will have generalized radix sort to algebraic data types.
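To get a feel for the type before worrying about speed, here is a naive sketch of a Group for any Ord type. It sorts and groups with Data.List rather than using a radix sort, so it only demonstrates the interface; groupOrd is a made-up name and is not part of the discrimination package.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Function (on)
import Data.List (groupBy, sortOn)

data Group a = Group (forall b. [(a, b)] -> [[b]])

groupId :: Group a -> [a] -> [[a]]
groupId (Group grouper) = grouper . map (\a -> (a, a))

-- Naive comparison-based grouping (an assumption for illustration,
-- NOT the radix-sort implementation): sort by key, group equal keys,
-- then drop the keys.
groupOrd :: Ord a => Group a
groupOrd = Group (map (map snd) . groupBy ((==) `on` fst) . sortOn fst)
```

For example, groupId groupOrd [3, 1, 3, 2, 1] evaluates to [[1,1],[2],[3,3]].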
So how might we build our combinator library?
Well, f :: Group a -> Group b -> Group (a, b) is pretty important in that it lets us make groups of product-like types. Normally, we'd get this from Applicative and liftA2 but Group, you'll notice, is Contravariant, not a Functor.
So instead we use Divisible
divided :: Group a -> Group b -> Group (a, b)
Notice that this arises in a strange way from
divide :: (a -> (b, c)) -> Group b -> Group c -> Group a
as it has the typical "reversed arrow" character of contravariant things. We can now understand things like divide and conquer in terms of their interpretation on Group.
Divide says that if I want to build a strategy for equating as using strategies for equating bs and cs, I can do the following for any type x
1. Take your partial relation [(a, x)] and map over it with a function f :: a -> (b, c), and a little tuple manipulation, to get a new relation [(b, (c, x))].
2. Use my Group b to discriminate [(b, (c, x))] into [[(c, x)]].
3. Use my Group c to discriminate each [(c, x)] into [[x]], giving me [[[x]]].
4. Flatten the inner layers to get [[x]], as we need.
instance Divisible Group where
    conquer = Group $ return . fmap snd
    divide k (Group l) (Group r) = Group $ \xs ->
        -- a bit more cleverly done here...
        l [ (b, (c, d)) | (a, d) <- xs, let (b, c) = k a ] >>= r
We also get interpretations of the more tricky Decidable refinement of Divisible
class Divisible f => Decidable f where
    lose :: (a -> Void) -> f a
    choose :: (a -> Either b c) -> f b -> f c -> f a
instance Decidable Group where
    lose :: (a -> Void) -> Group a
    choose :: (a -> Either b c) -> Group b -> Group c -> Group a
These read as saying that for any type a of which we can guarantee there are no values (we cannot produce values of Void by any means, a function a -> Void is a means of producing Void given a, thus we must not be able to produce values of a by any means either!) then we immediately get a grouping of zero values
lose _ = Group (\_ -> [])
We can also play a similar game to divide above, except that instead of sequencing our use of the input discriminators, we alternate between them.
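One way that alternation could be written is sketched below. This is my own reading of the idea rather than the code from the discrimination package: chooseGroup is a standalone function instead of a Decidable method, and groupOrd is a naive sort-based Group included only so the example can be run.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Either (partitionEithers)
import Data.Function (on)
import Data.List (groupBy, sortOn)

data Group a = Group (forall b. [(a, b)] -> [[b]])

groupId :: Group a -> [a] -> [[a]]
groupId (Group grouper) = grouper . map (\a -> (a, a))

-- Naive sort-based Group, an assumption used only to exercise chooseGroup.
groupOrd :: Ord a => Group a
groupOrd = Group (map (map snd) . groupBy ((==) `on` fst) . sortOn fst)

-- Route each pair to the left or the right discriminator according to f,
-- run both discriminators, and concatenate the resulting groups.
chooseGroup :: (a -> Either b c) -> Group b -> Group c -> Group a
chooseGroup f (Group l) (Group r) = Group (\xs ->
  let tag (a, d) = either (\b -> Left (b, d)) (\c -> Right (c, d)) (f a)
      (ls, rs)   = partitionEithers (map tag xs)
  in l ls ++ r rs)
```

With this sketch, grouping a list of Either Int Char values via chooseGroup id groupOrd groupOrd never puts a Left and a Right in the same group, since they are handled by different discriminators.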
Using these techniques we build up a library of "Groupable" things, namely Grouping
class Grouping a where
    grouping :: Group a
and note that nearly all the definitions arise from the basic definition atop groupingNat, which uses fast monadic vector manipulations to achieve an efficient radix sort.
I know that there are predefined Eq instances for tuples of lengths 2 to 15.
Why aren't tuples defined as some kind of recursive datatype such that they can be decomposed, allowing a definition of a function for a compare that works with arbitrary length tuples?
After all, the compiler does support arbitrary length tuples.
You might ask yourself what the type of that generalized comparison function would be. First of all we need a way to encode the component types:
data Tuple ??? = Nil | Cons a (Tuple ???)
There is really nothing valid we can replace the question marks with. The conclusion is that a regular ADT is not sufficient, so we need our first language extension, GADTs:
data Tuple :: ??? -> * where
    Nil :: Tuple ???
    Cons :: a -> Tuple ??? -> Tuple ???
Yet we end up with question marks. Filling in the holes requires another two extensions, DataKinds and TypeOperators:
data Tuple :: [*] -> * where
    Nil :: Tuple '[]
    Cons :: a -> Tuple as -> Tuple (a ': as)
As you see we needed three type system extensions just to encode the type. Can we compare now? Well, it's not that straightforward to answer, because it's actually far from obvious how to write a standalone comparison function. Luckily the type class mechanism allows us to take a simple recursive approach. However, this time we are not just recursing on the value level, but also on the type level. Obviously empty tuples are always equal:
instance Eq (Tuple '[]) where
    _ == _ = True
But the compiler complains again. Why? We need another extension, FlexibleInstances, because '[] is a concrete type. Now we can compare empty tuples, which isn't that compelling. What about non-empty tuples? We need to compare the heads as well as the rest of the tuple:
instance (Eq a, Eq (Tuple as)) => Eq (Tuple (a ': as)) where
    Cons x xs == Cons y ys = x == y && xs == ys
Seems to make sense, but boom! We get another complaint. Now the compiler wants FlexibleContexts, because we have a not-fully-polymorphic type in the context, Tuple as.
That's a total of five type system extensions, three of them just to express the tuple type, and they didn't exist before GHC 7.4. The other two are needed for comparison. Of course there is a payoff. We get a very powerful tuple type, but because of all those extensions, we obviously can't put such a tuple type into the base library.
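Putting the pieces together, a compilable sketch might look like the following. Two things here are my additions rather than part of the walkthrough above: I use Type from Data.Kind instead of * and list KindSignatures explicitly (recent GHC versions may enable it by default), and I add Ord instances in the same recursive style, since the question was about compare.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}
import Data.Kind (Type)

-- A tuple indexed by the list of its component types.
data Tuple :: [Type] -> Type where
  Nil  :: Tuple '[]
  Cons :: a -> Tuple as -> Tuple (a ': as)

-- Empty tuples are always equal.
instance Eq (Tuple '[]) where
  _ == _ = True

-- Compare the heads, then recurse on the tails.
instance (Eq a, Eq (Tuple as)) => Eq (Tuple (a ': as)) where
  Cons x xs == Cons y ys = x == y && xs == ys

-- Assumed extension of the same idea: lexicographic ordering.
instance Ord (Tuple '[]) where
  compare _ _ = EQ

instance (Ord a, Ord (Tuple as)) => Ord (Tuple (a ': as)) where
  compare (Cons x xs) (Cons y ys) = compare x y <> compare xs ys
```

A single pair of instances now covers every length, e.g. compare (Cons 1 (Cons 'a' Nil)) (Cons 1 (Cons 'b' Nil)) is LT.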
You can always rewrite any n-tuple in terms of binary tuples. For example, given the following 4-tuple:
(1, 'A', "Hello", 20)
You can rewrite it as:
(1, ('A', ("Hello", (20, ()))))
Think of it as a list, where (,) plays the role of (:) (i.e. "cons") and () plays the role of [] (i.e. "nil"). Using this trick, as long as you formulate your n-tuple in terms of a "list of binary tuples", then you can expand it indefinitely and it will automatically derive the correct Eq and Ord instances.
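A quick sketch of this in action, using only the standard Eq and Ord instances for pairs and unit (Row, rowA and rowB are illustrative names):

```haskell
-- A 4-tuple written as a right-nested chain of pairs ending in ().
type Row = (Int, (Char, (String, (Int, ()))))

rowA, rowB :: Row
rowA = (1, ('A', ("Hello", (20, ()))))
rowB = (1, ('A', ("Hello", (21, ()))))

-- Both work out of the box, however long the chain gets.
rowsEqual :: Bool
rowsEqual = rowA == rowB

rowOrder :: Ordering
rowOrder = compare rowA rowB
```

Here rowsEqual is False and rowOrder is LT, because the rows first differ at the last component.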
The type of compare is Ord a => a -> a -> Ordering, which means that both inputs must be of the same type. Tuples of different arities are by definition different types.
You can however solve your problem by approaching it either with HLists or GADTs.
I just wanted to add to ertes' answer that you don't need a single extension to do this. The following code should be Haskell 98 as well as Haskell 2010 compliant. The datatypes in it can be mapped one-to-one to tuples, with the exception of the singleton tuple. If you start the recursion after the two-tuple, you could also achieve that.
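module Tuple (
    TupleClass,
    TupleCons(..)
) where
-- TupleClassInternal is not exported, so users cannot add their own instances.
class (TupleClassInternal t) => TupleClass t
class TupleClassInternal t
-- () plays the role of the empty tuple.
instance TupleClassInternal ()
instance TupleClassInternal (TupleCons a b)
instance TupleClass ()
instance TupleClass (TupleCons a b)
-- Datatype contexts are part of Haskell 98/2010; newer GHC versions may need DatatypeContexts enabled.
data (TupleClassInternal b) => TupleCons a b = TupleCons a !b deriving (Show)
instance (Eq a, Eq b, TupleClass b) => Eq (TupleCons a b) where
    (TupleCons a1 b1) == (TupleCons a2 b2) = a1 == a2 && b1 == b2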
You could also just derive Eq. Of course it would look a bit cooler with TypeOperators, but then Haskell's list syntax relies on syntactic sugar too.
A common idiom in Haskell, difference lists, is to represent a list xs as the value (xs ++). Then (.) becomes "(++)" and id becomes "[]" (in fact this works for any monoid or category). Since we can compose functions in constant time, this gives us a nice way to efficiently build up lists by appending.
Unfortunately the type [a] -> [a] is way bigger than the type of functions of the form (xs ++) -- most functions on lists do something other than prepend to their argument.
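To make this concrete, here is a minimal sketch of the plain-function representation. DiffList, fromList, toList, append, empty and notAPrefix are names chosen for illustration, not taken from any library.

```haskell
-- A difference list represented directly as a function.
type DiffList a = [a] -> [a]

-- Represent xs as (xs ++).
fromList :: [a] -> DiffList a
fromList xs = (xs ++)

-- Recover the list by applying to the empty list.
toList :: DiffList a -> [a]
toList f = f []

-- Appending is function composition; the empty list is id.
append :: DiffList a -> DiffList a -> DiffList a
append = (.)

empty :: DiffList a
empty = id

-- Type-checks as a DiffList, but is not (xs ++) for any xs.
notAPrefix :: DiffList a
notAPrefix = reverse
```

Nothing in the type rules out notAPrefix, which is exactly the "too big" problem.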
One approach around this (as used in dlist) is to make a special DList type with a smart constructor. Another approach (as used in ShowS) is to not enforce the constraint anywhere and hope for the best. But is there a nice way of keeping all the desired properties of difference lists while using a type that's exactly the right size?
Yes!
We can view [a] as a free monad instance Free ((,) a) ().
Thus we can apply the scheme described by Edward Kmett in Free Monads for Less.
The type we'll get is
newtype F a = F { runF :: forall r. (() -> r) -> ((a, r) -> r) -> r }
or simply
newtype F a = F { runF :: forall r. r -> (a -> r -> r) -> r }
So runF is nothing else than the foldr function for our list!
This is called the Boehm-Berarducci encoding and it's isomorphic to the original data type (list) — so this is as small as you can possibly get.
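Here is a small sketch of how one might work with this encoding. The helper names (nil, cons, fromList, toList, append) are my own choices for illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A list represented by its own foldr.
newtype F a = F { runF :: forall r. r -> (a -> r -> r) -> r }

nil :: F a
nil = F (\z _ -> z)

cons :: a -> F a -> F a
cons x xs = F (\z f -> f x (runF xs z f))

-- Convert from and to ordinary lists.
fromList :: [a] -> F a
fromList l = F (\z f -> foldr f z l)

toList :: F a -> [a]
toList x = runF x [] (:)

-- Append: fold the first list, using the fold of the second as the base case.
append :: F a -> F a -> F a
append xs ys = F (\z f -> runF xs (runF ys z f) f)
```

Note that append does not inspect either list; it only wires the two folds together, which is what makes it cheap to build up long chains of appends.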
Will Ness says:
So this type is still too "wide", it allows more than just prefixing - doesn't constrain the g function argument.
If I understood his argument correctly, he points out that you can apply the foldr (or runF) function to something different from [] and (:).
But I never claimed that you can use foldr-encoding only for concatenation. Indeed, as this name implies, you can use it to calculate any fold — and that's what Will Ness demonstrated.
It may become more clear if you forget for a moment that there's one true list type, [a]. There may be lots of list types — e.g. I can define one by
data List a = Nil | Cons a (List a)
It would be different from, but isomorphic to, [a].
The foldr-encoding above is just yet another encoding of lists, like List a or [a]. It is also isomorphic to [a], as evidenced by the functions \l -> F (\z f -> foldr f z l) and \x -> runF x [] (:) and the fact that their compositions (in either order) are the identity. But you are not obliged to convert to [a]; you can convert to List directly, using \x -> runF x Nil Cons.
The important point is that F a doesn't contain an element that is not the foldr function for some list, nor does it contain an element that is the foldr function for more than one list (obviously).
Thus, it doesn't contain too few or too many elements — it contains precisely as many elements as needed to exactly represent all lists.
This is not true of the difference list encoding — for example, the reverse function is not an append operation for any list.