How can I get the union of an arbitrary number of lists in Haskell? For example, I would like a function that behaves like the one below:
example1 = union' [1,2,3] [1,4]
example2 = union' [1,2,3] [1,4] [2,6]
example1
[1,2,3,4]
example2
[1,2,3,4,6]
A function in Haskell only takes one argument. A "two"-argument function is really a function that returns another function that returns the ultimate return value. As such, there is no way for a function to take a variable number of arguments, because the return type of such a function wouldn't be well defined.
If you want to take the union of an arbitrary number of lists, your function should take a list of lists, since a list can contain an arbitrary number of elements.
union' :: Eq a => [[a]] -> [a]
union' = foldr unionOfTwo []
where unionOfTwo :: Eq a => [a] -> [a] -> [a]
unionOfTwo xs ys = ... -- left as an exercise
where unionOfTwo knows how to compute the union of exactly two lists. Effectively, union' sets aside the first list in the input, recursively computes the union of the remaining inputs, then computes the union of that result and the original first list. Put another way,
union' [] = []
union' (xs:xss) = unionOfTwo xs (union' xss)
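As a minimal sketch of the fold described above, assuming the standard `Data.List.union` is an acceptable stand-in for `unionOfTwo` (the name `unionAll` is only illustrative and not from the answer):

```haskell
import Data.List (union)

-- Union of any number of lists, passed as a single list of lists.
-- Data.List.union plays the role of unionOfTwo here (an assumption;
-- the answer leaves that helper as an exercise).
unionAll :: Eq a => [[a]] -> [a]
unionAll = foldr union []
```

In GHCi, `unionAll [[1,2,3],[1,4],[2,6]]` evaluates to `[1,2,3,4,6]`, which matches the desired `example2`.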
First a working code example:
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where
import Data.List (union)
class Unionable a t where
union' :: [a] -> t
instance Unionable a [a] where
union' = id
instance (Eq a, Unionable a t) => Unionable a ([a] -> t) where
union' xs ys = union' (union xs ys)
main = do
print $ (union' [1::Integer,2,3] [1::Integer,5,6] [1::Integer,7,3] :: [Integer])
Mimicked from here.
You probably want to use such a function with literals and sadly, as you can see here, it is not convenient to use with polymorphic literals, because you need to specify the type of every argument.
In other contexts, the types of the arguments have to be clear and the expected type of the result must be clear too, otherwise, you will need to add such type annotations.
For normal code it probably isn't worth the effort.
Let's explain what happens here. The compiler sees:
(union' [1::Integer,2,3] [1::Integer,5,6] [1::Integer,7,3] :: [Integer])
and it thinks, we need
union' :: [Integer] -> [Integer] -> [Integer] -> [Integer]
do we have such a union'? A candidate for that would be provided by the second instance declaration
a ~ Integer
t ~ [Integer] -> [Integer] -> [Integer]
but for that instance to be applicable, we need an instance of (Unionable a t) with those assignments. Do we have such an instance? Again the second instance declaration is a candidate, this time with
a ~ Integer
t ~ [Integer] -> [Integer]
but for that instance to be applicable, we need an instance of (Unionable a t) with those assignments. Do we have such an instance? Again the second instance declaration is a candidate, this time with
a ~ Integer
t ~ [Integer]
This time, we get such an instance from the first instance declaration, with no additional constraints needed.
This means (omitting the type annotations for clarity)
union' [1,2,3] [1,5,6] [1,7,3]
= union' (union [1,2,3] [1,5,6]) [1,7,3]
= union' (union (union [1,2,3] [1,5,6]) [1,7,3])
= id (union (union [1,2,3] [1,5,6]) [1,7,3])
= (union (union [1,2,3] [1,5,6]) [1,7,3])
= [1,2,3,5,6,7]
So I'm new to Haskell and learning it using WikiBooks. And in the higher order functions chapter, there is the following example used.
echoes = foldr (\ x xs -> (replicate x x) ++ xs) []
So I tried running it, but it gives me an error as follows :
* Ambiguous type variable `t0' arising from a use of `foldr'
prevents the constraint `(Foldable t0)' from being solved.
Relevant bindings include
echoes :: t0 Int -> [Int] (bound at HavingFun.hs:107:1)
Probable fix: use a type annotation to specify what `t0' should be.
These potential instances exist:
instance Foldable (Either a) -- Defined in `Data.Foldable'
instance Foldable Maybe -- Defined in `Data.Foldable'
instance Foldable ((,) a) -- Defined in `Data.Foldable'
...plus one other
...plus 29 instances involving out-of-scope types
(use -fprint-potential-instances to see them all)
* In the expression: foldr (\ x xs -> (replicate x x) ++ xs) []
In an equation for `echoes':
echoes = foldr (\ x xs -> (replicate x x) ++ xs) []
And then if I write it as follows, it works.
echoes lis = foldr (\ x xs -> (replicate x x) ++ xs) [] lis
I am confused about this, and I think it is somehow related to point-free definitions of functions?
Please clarify what the problem is here.
The link from where I'm learning - https://en.wikibooks.org/wiki/Haskell/Lists_III
tl;dr
just always write explicit type signatures, then you're safe(r) from weird problems like that.
The reason this used to work but now doesn't is that foldr formerly had the signature
foldr :: (a -> b -> b) -> b -> [a] -> b
which is what the WikiBooks assumes, but in newer GHC it actually has the strictly more general signature
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
The old version is a special case of this, by simply choosing t ~ []. The reason they changed it is that you can also fold over other containers, such as arrays or maps. In fact, in your code
echoes = foldr (\ x xs -> (replicate x x) ++ xs) []
there's nothing that requires the input container to be a list, either, so it would in fact work perfectly well with the signature
echoes :: Foldable t => t Int -> [Int]
...of which, again, [Int] -> [Int] is a special case, so that function could then be used as
> echoes [1,2,3]
[1,2,2,3,3,3]
but also as
> echoes $ Data.Map.fromList [('a',2), ('c',5), ('b',1)]
[2,2,1,5,5,5,5,5]
Or you could have given the function the list-specific signature
echoes' :: [Int] -> [Int]
echoes' = foldr (\x xs -> (replicate x x) ++ xs) []
That works just the same on [1,2,3] but can't accept a Map.
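To make the `Foldable` generality concrete, here is a small sketch that applies the generally typed `echoes` to a list and to `Maybe`, which is also a `Foldable` instance in base. The names `overList`, `overJust` and `overNothing` are hypothetical, chosen only for this illustration.

```haskell
-- The general signature: any Foldable container of Ints will do.
echoes :: Foldable t => t Int -> [Int]
echoes = foldr (\x xs -> replicate x x ++ xs) []

-- Folding over a list.
overList :: [Int]
overList = echoes [1, 2, 3]

-- Folding over a Maybe, which holds zero or one element.
overJust :: [Int]
overJust = echoes (Just 2)

overNothing :: [Int]
overNothing = echoes Nothing
```

Here `overList` is `[1,2,2,3,3,3]`, `overJust` is `[2,2]`, and `overNothing` is `[]`.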
The question is now, why does GHC not infer either of those signatures by itself? Well, if it had to choose one, it should be the more general Foldable version, because people might need to use this with other containers and wouldn't want to keep repeating the Foldable t => constraint. However, this contradicts another Haskell rule, the monomorphism restriction. Because your echoes implementation doesn't explicitly accept any parameters (it only does that point-freely), it is a constant applicative form, and a standalone CAF is supposed to have a monomorphic type unless explicitly specified to be polymorphic. Thus the error message you ran into: GHC really wants this to be monomorphic, but it has no information that restricts what concrete Foldable container to pick.
There are four ways around this:
As you noticed, by bringing the argument explicitly in scope, echoes is not a CAF anymore and therefore GHC infers the polymorphic type:
echoes'' l = foldr (\x xs -> (replicate x x) ++ xs) [] l
> :t echoes''
echoes'' :: Foldable t => t Int -> [Int]
By disabling the monomorphism restriction, GHC won't care anymore whether it's CAF and just give it the more general type regardless:
{-# LANGUAGE NoMonomorphismRestriction #-}
echoes''' = foldr (\x xs -> (replicate x x) ++ xs) []
> :t echoes'''
echoes''' :: Foldable t => t Int -> [Int]
Discouraged: if you turn on the ExtendedDefaultRules extension (-XExtendedDefaultRules), GHC will automatically choose [] as the concrete monomorphic container for the CAF:
{-# LANGUAGE ExtendedDefaultRules #-}
echoes'''' = foldr (\x xs -> (replicate x x) ++ xs) []
> :t echoes''''
echoes'''' :: [Int] -> [Int]
GHCi has -XExtendedDefaultRules enabled by default, so that's also what happens if you just declare the function at the GHCi prompt.
Strongly recommended: if you explicitly specify the signature, you and GHC both know exactly what's intended and behave accordingly, without requiring any special GHC extensions.
echoes :: Foldable t => t Int -> [Int]
echoes = foldr (\x xs -> (replicate x x) ++ xs) []
echoes' :: [Int] -> [Int]
echoes' = foldr (\x xs -> (replicate x x) ++ xs) []
> :t echoes
echoes :: Foldable t => t Int -> [Int]
> :t echoes'
echoes' :: [Int] -> [Int]
Consider this Vect type:
{-# LANGUAGE GADTs, DataKinds, KindSignatures, RankNTypes #-}
import Data.Kind (Type)
data Nat = Zero | Succ Nat
data Vect :: Nat -> Type -> Type where
Nil :: Vect 'Zero a
Cons :: a -> Vect n a -> Vect ('Succ n) a
I'd like to write a function that takes a list and lazily transforms it to this type. Since the length obviously isn't known, we need to use an existential. My first attempt at this used CPS and a rank-2 type to emulate one:
listToVect1 :: [a] -> (forall n. Vect n a -> b) -> b
listToVect1 [] f = f Nil
listToVect1 (x:xs) f = listToVect1 xs (f . Cons x)
However, this isn't lazy. This is just a foldl (though written as explicit recursion because actually using foldl would require impredicativity), so f won't start running until it's gone all the way to the end of the list.
I can't think of a way to do CPS in a right-associative manner, so I tried again, this time with a wrapper type instead:
data ArbitraryVect a = forall n. ArbitraryVect (Vect n a)
nil :: ArbitraryVect a
nil = ArbitraryVect Nil
cons :: a -> ArbitraryVect a -> ArbitraryVect a
cons x (ArbitraryVect xs) = ArbitraryVect (Cons x xs)
listToVect2 :: [a] -> ArbitraryVect a
listToVect2 = foldr cons nil
The problem with this attempt is that I have to use data rather than newtype (or I get A newtype constructor cannot have existential type variables), and I need to pattern-match strictly in cons (or I get An existential or GADT data constructor cannot be used inside a lazy (~) pattern), so it doesn't return anything until it's through the entire list.
I don't have any other ideas for how to do this. Is this even possible? If so, how? If not, why not?
This is going to be an unsatisfying answer, but I'm going to post it anyway.
I don't think this is technically possible, because the head of the result should contain the type n, and types cannot be lazily constructed. They're sort of "ephemeral", don't really exist at runtime as such, but logically they have to be fully known.
(I am not 100% sure the above is correct though)
However, if you really need to do this, you can cheat. Observe that the polymorphic continuation doesn't actually know anything about n. The only thing it can possibly find out about n is whether it's a Zero or Succ, but even that is not actually encoded at runtime in any way. Instead, it gets inferred from whichever constructor of Vect matches during pattern match. This means that, at runtime, we don't actually have to pass in the correct type for n. Sure, the compiler will get anxious during compilation if it can't prove the type is right, but we can convince it to shut up with unsafeCoerce:
import Unsafe.Coerce (unsafeCoerce)
listToVect1 :: [a] -> (forall n. Vect n a -> b) -> b
listToVect1 xs f = f $ go xs
where
go :: [a] -> Vect Zero a
go [] = Nil
go (x:xs) = unsafeCoerce $ Cons x (go xs)
Here, the second line of go constructs a Vect (Succ Zero) a, but then erases its n component to Zero. This happens at every step, so the ultimate result is always Vect Zero a. This result is then passed to the continuation, which is none the wiser, because it doesn't care.
When the continuation later tries to match on Vect's constructors, it works fine, because the constructors have been instantiated correctly in the right order, reflecting the correct shape of the vector, and by extension, the correct shape of n.
This works, try it:
vectLen :: Vect n a -> Int
vectLen Nil = 0
vectLen (Cons _ xs) = 1 + vectLen xs
toList :: Vect n a -> [a]
toList Nil = []
toList (Cons a xs) = a : toList xs
main :: IO ()
main = do
print $ listToVect1 [1,2,3] vectLen -- prints 3
print $ listToVect1 [] vectLen -- prints 0
print $ listToVect1 [1,2,3,4,5] vectLen -- prints 5
print $ listToVect1 [1,2,3] toList -- prints [1,2,3]
print $ listToVect1 ([] :: [Int]) toList -- prints []
print $ listToVect1 [1,2,3,4,5] toList -- prints [1,2,3,4,5]
Of course, the above is fragile. It relies (to an extent) on some lower-level knowledge. If this is not just a curiosity exercise, I would rather go back and rethink the original problem that led you to this.
But for what it's worth, this technique of "hiding ugliness by hand-waving" is relatively common in lower-level libraries.
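One way to check that this version really is lazy is to feed it an infinite list and consume only a prefix. The sketch below repeats the definitions so that it is self-contained; `firstThree` is a hypothetical name, and the whole thing inherits the fragility of `unsafeCoerce` described above.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, RankNTypes #-}
import Data.Kind (Type)
import Unsafe.Coerce (unsafeCoerce)

data Nat = Zero | Succ Nat

data Vect :: Nat -> Type -> Type where
  Nil  :: Vect 'Zero a
  Cons :: a -> Vect n a -> Vect ('Succ n) a

-- Same trick as above: the length index is erased to 'Zero at every step.
listToVect1 :: [a] -> (forall n. Vect n a -> b) -> b
listToVect1 xs f = f (go xs)
  where
    go :: [a] -> Vect 'Zero a
    go []       = Nil
    go (y : ys) = unsafeCoerce (Cons y (go ys))

toList :: Vect n a -> [a]
toList Nil         = []
toList (Cons a xs) = a : toList xs

-- Only a prefix of the infinite input is ever forced.
firstThree :: [Integer]
firstThree = listToVect1 [1 ..] (take 3 . toList)
```

If the conversion were strict, `firstThree` would never terminate; with the lazy `go` it should evaluate to `[1,2,3]`.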
Apologies for the vague title, here's some context: http://themonadreader.files.wordpress.com/2013/08/issue221.pdf
The GADTs article in the above issue introduces a Nat type, and a NatSing type for use in various type-level list functions (concat, !!, head, repeat, etc). For a couple of these functions it's necessary to create type families for defining + and < on the Nat type.
data Nat = Zero | Succ Nat
data NatSing (n :: Nat) where
ZeroSing :: NatSing Zero
SuccSing :: NatSing n -> NatSing (Succ n)
data List (n :: Nat) a where
Nil :: List Zero a
Cons :: a -> List n a -> List (Succ n) a
Anyways, I have written a function "list" that converts an ordinary [a] into a List n a, for convenience to the caller. This requires the length of the list as input, much like repeat (from the linked article):
list :: [a] -> NatSing n -> List n a
list [] ZeroSing = Nil
list (x:xs) (SuccSing n) = Cons x (list xs n)
list _ _ = error "length mismatch"
It would be nice to utilize a helper function toNatSing :: Int -> NatSing n, so the above can be written as
list :: [a] -> List n a
list xs = list' xs (toNatSing $ length xs)
where
list' :: [a] -> NatSing n -> List n a
list' = ... -- same as before, but this time I simply "know" the function is no longer partial
Is it possible to write such a function toNatSing? I've been wrestling with types and haven't come up with anything yet.
Thanks a lot!
No, you cannot write such a function.
A function of type
Int -> NatSing n
says that it can transform any integer into a polymorphic NatSing. But there is no polymorphic NatSing.
What you seem to want here is to have n determined by the incoming Int. That'd be a dependent type:
(n :: Int) -> NatSing n
Such a thing isn't possible in Haskell. You'd have to use Agda or Idris or another dependently typed language. The hack with singletons is exactly Haskell's way to get around this. If you want to make a distinction based on a value, you have to lift the value to the type level, which is what NatSing is.
You could write a function that returns a NatSing for some n, by wrapping the n up in an existential type:
data ExNatSing where
ExNatSing :: NatSing n -> ExNatSing
But this wouldn't give you much benefit in practice. By wrapping the n up, you lose all information about it, and cannot make decisions based on it later.
By exactly the same argument, you can also not hope to define a function
list :: [a] -> List n a
The only approach you can use to save yourself some typing work is to define a type class that constructs the NatSing value automatically:
class CNatSing (n :: Nat) where
natSing :: NatSing n
instance CNatSing Zero where
natSing = ZeroSing
instance CNatSing n => CNatSing (Succ n) where
natSing = SuccSing natSing
Then, you can say:
list :: CNatSing n => [a] -> List n a
list xs = list' xs natSing
where
list' :: [a] -> NatSing n -> List n a
list' = ... -- same as before, but still partial
Here, the type context you use this in makes GHC fill in the right NatSing. However, this function is still partial, because the caller can still choose which n to use it at. If I try to use an [Int] of length 3 as a List (Succ Zero) Int, it's going to crash.
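A self-contained sketch of the class-based approach follows. To avoid the crash on a length mismatch it returns `Maybe` instead of calling `error`; that change, and the names `listMaybe` and `toPlain`, are assumptions made for this example and not part of the answer. `Nil` is assumed to have type `List 'Zero a`.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

data Nat = Zero | Succ Nat

data NatSing (n :: Nat) where
  ZeroSing :: NatSing 'Zero
  SuccSing :: NatSing n -> NatSing ('Succ n)

data List (n :: Nat) a where
  Nil  :: List 'Zero a
  Cons :: a -> List n a -> List ('Succ n) a

-- The class lets GHC build the singleton from the expected type.
class CNatSing (n :: Nat) where
  natSing :: NatSing n

instance CNatSing 'Zero where
  natSing = ZeroSing

instance CNatSing n => CNatSing ('Succ n) where
  natSing = SuccSing natSing

-- Hypothetical total variant: Nothing signals a length mismatch.
listMaybe :: CNatSing n => [a] -> Maybe (List n a)
listMaybe xs = go xs natSing
  where
    go :: [b] -> NatSing m -> Maybe (List m b)
    go []       ZeroSing     = Just Nil
    go (y : ys) (SuccSing m) = Cons y <$> go ys m
    go _        _            = Nothing

-- Forget the length index again, for inspection.
toPlain :: List n a -> [a]
toPlain Nil         = []
toPlain (Cons x xs) = x : toPlain xs
```

The caller still picks `n` through the type annotation, but a wrong choice now yields `Nothing` rather than a runtime error.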
Again, you could wrap this up in an existential instead:
data SomeList a where
SomeList :: NatSing n -> List n a -> SomeList a
Then you could write
list :: [a] -> SomeList a
list [] = SomeList ZeroSing Nil
list (x : xs) = case list xs of
  SomeList n xs' -> SomeList (SuccSing n) (Cons x xs')
Again, the benefit is small, but in contrast to ExNatSing, there at least is one: you now can temporarily unwrap a SomeList and pass it to functions that operate on a List n a, getting type-system guarantees on how the length of the list is transformed by these functions.
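The unwrap-and-rewrap pattern can be sketched as one compilable module. This assumes `Nil :: List 'Zero a`; the helpers `fromPlain`, `mapList`, `mapSome` and `toPlain` are hypothetical names introduced only for the illustration.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

data Nat = Zero | Succ Nat

data NatSing (n :: Nat) where
  ZeroSing :: NatSing 'Zero
  SuccSing :: NatSing n -> NatSing ('Succ n)

data List (n :: Nat) a where
  Nil  :: List 'Zero a
  Cons :: a -> List n a -> List ('Succ n) a

data SomeList a where
  SomeList :: NatSing n -> List n a -> SomeList a

-- Build the singleton and the indexed list side by side.
fromPlain :: [a] -> SomeList a
fromPlain []       = SomeList ZeroSing Nil
fromPlain (x : xs) = case fromPlain xs of
  SomeList n ys -> SomeList (SuccSing n) (Cons x ys)

-- A length-indexed function: the type promises the length is unchanged.
mapList :: (a -> b) -> List n a -> List n b
mapList _ Nil         = Nil
mapList f (Cons x xs) = Cons (f x) (mapList f xs)

-- Unwrap, apply mapList, and re-wrap with the same singleton.
mapSome :: (a -> b) -> SomeList a -> SomeList b
mapSome f (SomeList n xs) = SomeList n (mapList f xs)

-- Forget the index again to inspect the result.
toPlain :: SomeList a -> [a]
toPlain (SomeList _ xs) = go xs
  where
    go :: List m b -> [b]
    go Nil         = []
    go (Cons y ys) = y : go ys
```

Because `mapList` keeps the index `n`, `mapSome` can reuse the old singleton unchanged; a function that altered the length would not typecheck there.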
What you want looks something like (Int -> (exists n . NatSing n)), where n is unknown ahead of time. You could do that with something like (untested):
data AnyNatSing where
AnyNatSing :: NatSing n -> AnyNatSing
toNatSing :: Int -> AnyNatSing
toNatSing n
| n > 0 = case toNatSing (n - 1) of
AnyNatSing n -> AnyNatSing (SuccSing n)
| otherwise = AnyNatSing ZeroSing
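Since the snippet above is marked as untested, here is a self-contained sketch that compiles it together with a round-trip check. The helpers `fromNatSing` and `roundTrip` are hypothetical additions, and the inner variable is renamed to avoid shadowing.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

data Nat = Zero | Succ Nat

data NatSing (n :: Nat) where
  ZeroSing :: NatSing 'Zero
  SuccSing :: NatSing n -> NatSing ('Succ n)

data AnyNatSing where
  AnyNatSing :: NatSing n -> AnyNatSing

-- Non-positive inputs are mapped to ZeroSing, as in the snippet above.
toNatSing :: Int -> AnyNatSing
toNatSing k
  | k > 0 = case toNatSing (k - 1) of
      AnyNatSing s -> AnyNatSing (SuccSing s)
  | otherwise = AnyNatSing ZeroSing

-- Hypothetical helper: count the SuccSing layers back into an Int.
fromNatSing :: NatSing n -> Int
fromNatSing ZeroSing     = 0
fromNatSing (SuccSing s) = 1 + fromNatSing s

-- Convert to a wrapped singleton and back again.
roundTrip :: Int -> Int
roundTrip k = case toNatSing k of
  AnyNatSing s -> fromNatSing s
```

`roundTrip 3` gives `3`, while `roundTrip (-2)` gives `0` because of the `otherwise` branch.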
As said by Louis Wassermann, the closest match to what you want is an existential wrapper that makes NatSing monomorphic from the outside.
I consider that pretty useless though, since it basically just throws away the type-checking of lengths and you're left with a dumb standard integer type. There are easier ways to do that, like using a dumb standard integer type and ordinary Haskell lists...
But there's one alternative that's perhaps not quite so useless. Remember that it's pretty much equivalent if you return some value x from a function, or instead pass a higher-order function and call that with x; I think Lisps particularly like such continuation-passing tricks.
For your example, you need a function that's able to accept any length-type of lists. Well, such functions certainly exist, e.g. a scalar product that requires two lists of equal length, but doesn't care what the length is. And it then returns a simple monomorphic number. For simplicity's sake, let's consider an even simpler sum on your list type:
sumSing :: Num a => List (n::Nat) a -> a
sumSing Nil = 0
sumSing (Cons x ls) = x + sumSing ls
Then you can do:
{-# LANGUAGE RankNTypes #-}
onList :: (forall n . CNatSing n => List n a -> b) -> [a] -> b
f `onList` l = f (list l)
(list being kosmicus' variant with the CNatSing class constraint) and call it like
sumSing `onList` [1,2,3]
...which is of course in itself no more useful than the existential solution (which, I think, actually desugars to something similar to this RankN stuff). But you can do more here, like the scalar product example – providing two lists and actually ensuring through the type system they have the same length. This would be far uglier with existentials: you'd basically need a separate TwoEqualLenLists type.
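A minimal sketch of the continuation-passing idea, under the assumption that the list is built directly inside the helper so that no `CNatSing` constraint has to be resolved. The name `withList` is hypothetical, and `Nil` is assumed to have type `List 'Zero a`.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, RankNTypes #-}

data Nat = Zero | Succ Nat

data List (n :: Nat) a where
  Nil  :: List 'Zero a
  Cons :: a -> List n a -> List ('Succ n) a

-- Hypothetical CPS helper: the continuation must work for every length n,
-- so it cannot learn anything about n beyond what pattern matching reveals.
withList :: [a] -> (forall n. List n a -> r) -> r
withList []       k = k Nil
withList (x : xs) k = withList xs (\ys -> k (Cons x ys))

sumSing :: Num a => List n a -> a
sumSing Nil         = 0
sumSing (Cons x ls) = x + sumSing ls

-- The length-polymorphic sumSing is a valid continuation.
total :: Int
total = withList [1, 2, 3] sumSing
```

Here `total` is `6`. The explicit lambda in the recursive case is deliberate: the continuation has a rank-2 type, so it is written out instead of being composed with `(.)`.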
I've just started programming in Haskell, and I am solving 99 Haskell problems, and when I was nearly done with 10th, I've encountered this problem:
-- Exercise 9
pack :: Eq a => [a] -> [[a]]
pack [] = []
pack list = let (left,right) = span (== head list) list in
left : pack right
-- Exercise 10
encode :: (Eq a, Integral c) => [a] -> [(c, a)]
encode [] = []
encode list = map (\x -> (length x, head x)) (pack list)
-- this doesn't work ^^^^^^^^
The error produced told me that
Could not deduce (c ~ Int)
from the context (Eq a, Integral c)
bound by the type signature for
encode :: (Eq a, Integral c) => [a] -> [(c, a)]
at C:\fakepath\ex.hs:6:11-47
`c' is a rigid type variable bound by
the type signature for
encode :: (Eq a, Integral c) => [a] -> [(c, a)]
at C:\fakepath\ex.hs:6:11
In the return type of a call of `length'
In the expression: length x
In the expression: (length x, head x)
I've managed to fix that by inserting a function I've read about in Learn you a Haskell: fromIntegral.
encode list = map (\x -> (fromIntegral $ length x, head x)) (pack list)
So, my question is, why is that needed?
I've run :t length and got [a] -> Int, which is a pretty defined type for me, which should satisfy Integral c constraint.
The type signature (Eq a, Integral c) => [a] -> [(c, a)] means the function works for any types a and c in the appropriate typeclasses. The actual type used is specified at the call site.
As a simple example, let's take a look at the type of the empty list:
:t []
[a]
What this means is that [] represents an empty list of String, an empty list of Int, an empty list of Maybe [Maybe Bool] and whatever other types you can imagine. We can imagine wrapping this in a normal identifier:
empty :: [a]
empty = []
empty obviously works the same way as []. So you can see that the following definition would make no sense:
empty :: [a]
empty = [True]
after all, [True] can never be a [Int] or [String] or whatever other empty list you want.
The idea here is the same, except we have typeclass constraints on the variables as well. For example, you can use encode to return a [(Integer, String)] list because Integer is also in the Integral class.
So you have to return something polymorphic that could be any Integral type, which is just what fromIntegral does. If you just returned Int, encode would only be usable at Int and not at any other Integral type.
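To see the caller choosing the count type, here is a self-contained sketch of the two exercises. It pattern-matches on the head instead of calling `head`, which is a stylistic choice of this example; `asInteger` and `asInt` are hypothetical names.

```haskell
-- Group consecutive equal elements (exercise 9).
pack :: Eq a => [a] -> [[a]]
pack []           = []
pack list@(x : _) = let (left, right) = span (== x) list
                    in  left : pack right

-- Run-length encoding (exercise 10), polymorphic in the count type.
-- fromIntegral converts the Int from length into whatever c the caller wants.
encode :: (Eq a, Integral c) => [a] -> [(c, a)]
encode list = [ (fromIntegral (length run), x) | run@(x : _) <- pack list ]

-- The call site, not encode, decides what c is.
asInteger :: [(Integer, Char)]
asInteger = encode "aaabcc"

asInt :: [(Int, Char)]
asInt = encode "aaabcc"
```

Both values contain the pairs `(3,'a')`, `(1,'b')`, `(2,'c')`, once with `Integer` counts and once with `Int` counts.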
I’m trying to learn Haskell and I was trying to create a function that takes a list of lists and groups the sublist by equivalent sums. This is not homework.
import Data.List
let x = [[1,2],[2,1],[5,0],[0,3],[1,9]]
let groups = groupBy (\i j -> sum i == sum j) x
I get this output in GHCi:
[[[1,2],[2,1]],[[5,0]],[[0,3]],[[1,9]]]
I get [[1,2],[2,1]] grouping together, but not with [0,3]. Why is this?
I suspect I need to use map, but I can’t seem to make it work.
The groupBy function preserves the input order and is thus invertible. If you’re willing to throw away that information, you could use code along the lines of
import Data.List (foldl')
import Data.Map (elems,empty,insertWith')
bucketBy :: Ord b => (a -> b) -> [a] -> [[a]]
bucketBy eq = elems . foldl' go empty
where go m l = insertWith' (++) (eq l) [l] m
In action:
*Main> bucketBy sum x
[[[0,3],[2,1],[1,2]],[[5,0]],[[1,9]]]
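If your version of the containers package no longer exports `insertWith'` from `Data.Map` (as far as I know it was deprecated and later removed), the same idea can be sketched with `insertWith` from `Data.Map.Strict`. Treat this as an adaptation under that assumption, not as the code the answer was written against; the parameter name `key` is also a choice of this example.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Bucket elements by a key function; buckets come out in ascending key order.
bucketBy :: Ord k => (a -> k) -> [a] -> [[a]]
bucketBy key = Map.elems . foldl' go Map.empty
  where
    -- insertWith applies (++) to the new singleton and the existing bucket,
    -- so later elements end up at the front of their bucket.
    go m x = Map.insertWith (++) (key x) [x] m
```

With `x = [[1,2],[2,1],[5,0],[0,3],[1,9]]`, `bucketBy sum x` gives `[[[0,3],[2,1],[1,2]],[[5,0]],[[1,9]]]`, the same result as shown above.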
How it works
The application of elems from Data.Map gives a clue for what’s happening.
elems :: Map κ α -> [α]
O(n). Return all elements of the map in the ascending order of their keys.
elems (fromList [(5,"a"), (3,"b")]) == ["b","a"]
elems empty == []
Mapping
A Map associates values of some type κ with values of another possibly distinct type α. In the example from your question, you start with x whose type is
*Main> :type x
x :: [[Integer]]
That is, x is a list of integer lists. The type of the resulting partition of x you want is
*Main> :t [[[0,3],[2,1],[1,2]],[[5,0]],[[1,9]]]
[[[0,3],[2,1],[1,2]],[[5,0]],[[1,9]]] :: Num τ => [[[τ]]]
or a list of lists where each of the latter lists are themselves lists that all have the same sum. The Num τ => bit is a context that constrains the type τ to be an instance of the typeclass Num. Happily for us, Integer is such a type:
*Main> :info Integer
data Integer
…
instance Num Integer -- Defined in GHC.Num
…
We know then that the type of the partition is [[[Integer]]]. This typeclass nonsense may seem unnecessarily fussy, but we’ll need the concept again in just a moment. (To give you an idea of what’s going on, the typechecker doesn’t have enough information to decide whether the literal 0, for example, is of type Int or Integer.)
Each sublist contains lists with the same sum. In other words, there exists a mapping from a sum to a list of integer lists. Therefore, the type of the Map used in bucketBy must resemble
Map Integer [[Integer]]
For example, with the sum 3 we associate the list
[ [0,3]
, [2,1]
, [1,2]
]
The fold recursion pattern
Folding is a highly general pattern. Left folds, foldl and friends in Haskell, let you “insert” an operator between elements of a list, beginning with the zero value at the left end of the list. For example, the sum of [5,3,9,1] expressed as a left fold is
((((0 + 5) + 3) + 9) + 1)
or
foldl (+) 0 [5,3,9,1]
That is, beginning with a base value of zero, we successively add elements of the list and accumulate the sum.
Recall the definition of bucketBy contains
elems . foldl' go empty
This means the result of the left fold must be of type Map Integer [[Integer]], the zero value for our fold is the empty Map of that type, and go is somehow adding each successive value of a list into the map.
Note that foldl' is the strict cousin of foldl, but strictness is beyond the scope of this answer. (See also “Stack overflow” on HaskellWiki.)
Dude, where’s my list?
Given the type of foldl'
*Main> :t foldl'
foldl' :: (a -> b -> a) -> a -> [b] -> a
we should have three arguments in the application, but only two are present in the code above. This is because the code is written in point-free style. Your list is there implicitly due to partial application of foldl'.
Think back to the sum-as-fold example above. The type of that application without the final argument is
*Main> :t foldl (+) 0
foldl (+) 0 :: Num b => [b] -> b
Partial application allows us to create new functions. Here we defined a function that computes a number from some list of numbers. Hmm, sounds familiar.
*Main> :t sum
sum :: Num a => [a] -> a
The . combinator expresses function composition. Its name is chosen to resemble the notation g∘f as commonly seen in mathematics textbooks to mean “do f first and then compute g from the result.” This is exactly what’s happening in the definition of bucketBy: fold the list of values into a Map and then get the values out of the Map.
If ya gotta go, go with a smile
In your comment, you asked about the purpose of m. With an explicit type annotation, we might define go as
...
where go :: Map Integer [[Integer]] -> [Integer] -> Map Integer [[Integer]]
go m l = insertWith' (++) (eq l) [l] m
Matching variables with types, m is the Map we’ve accumulated so far, and l is the next Integer list that we want to toss into the appropriate bucket. Recall that eq is an argument to the outer bucketBy.
We can control how a new item goes into the map using insertWith'. (By convention, functions whose names end with trailing quotes are strict variants.)
The (++) combinator appends lists. The application eq l determines the appropriate bucket for l.
Had we written l rather than [l], the result would want to be
*Main> bucketBy sum x
[[0,3,2,1,1,2],[5,0],[1,9]]
but then we lose the structure of the innermost lists.
We’ve already constrained the type of bucketBy's result to be [[[α]]] and thus the type of the Map's elements. Say the next item l to fold is [1,2]. We want to append, (++), it to some other list of type [[Integer]], but the types don’t match.
*Main> [[0,3],[2,1]] ++ [1,2]
<interactive>:1:21:
No instance for (Num [t0])
arising from the literal `2'
Possible fix: add an instance declaration for (Num [t0])
In the expression: 2
In the second argument of `(++)', namely `[1, 2]'
In the expression: [[0, 3], [2, 1]] ++ [1, 2]
Wrapping l gets us
*Main> [[0,3],[2,1]] ++ [[1,2]]
[[0,3],[2,1],[1,2]]
Generalizing further
You might stop with
bucketBy :: ([Integer] -> Integer) -> [[Integer]] -> [[[Integer]]]
bucketBy eq = elems . foldl' go empty
where go m l = insertWith' (++) (eq l) [l] m
or even
bucketBy :: ([Integer] -> Integer) -> [[Integer]] -> [[[Integer]]]
bucketBy eq = elems . foldl' go empty
where go :: Map Integer [[Integer]] -> [Integer] -> Map Integer [[Integer]]
go m l = insertWith' (++) (eq l) [l] m
and be perfectly happy because it handles the case from your question.
Suppose down the road you have a different list y defined as
y :: [[Int]]
y = [[1,2],[2,1],[5,0],[0,3],[1,9]]
Even though the definition is very nearly identical to x, bucketBy is of no use with y.
*Main> bucketBy sum y
<interactive>:1:15:
Couldn't match expected type `Integer' with actual type `Int'
Expected type: [[Integer]]
Actual type: [[Int]]
In the second argument of `bucketBy', namely `y'
In the expression: bucketBy sum y
Let’s assume you can’t change the type of y for some reason. You might copy-and-paste to create another function, say bucketByInt, where the only change is replacing Integer with Int in the type annotations.
This would be highly, highly unsatisfying.
Maybe later you have some list of strings that you want to bucket according to the length of the longest string in each. In this imaginary paradise you could
*Main> bucketBy (maximum . map length) [["a","bc"],["d"],["ef","g"],["hijk"]]
[[["d"]],[["ef","g"],["a","bc"]],[["hijk"]]]
What you want is entirely reasonable: bucket some list of things using the given criterion. But alas
*Main> bucketBy (maximum . map length) [["a","bc"],["d"],["ef","g"],["hijk"]]
<interactive>:1:26:
Couldn't match expected type `Integer' with actual type `[a0]'
Expected type: Integer -> Integer
Actual type: [a0] -> Int
In the first argument of `map', namely `length'
In the second argument of `(.)', namely `map length'
Again, you may be tempted to write bucketByString, but by this point, you’re ready to move away and become a shoe cobbler.
The typechecker is your friend. Go back to your definition of bucketBy that’s specific to Integer lists, simply comment out the type annotation and ask its type.
*Main> :t bucketBy
bucketBy :: Ord k => (b -> k) -> [b] -> [[b]]
Now you can apply bucketBy for the different cases above and get the expected results. You were already in paradise but didn’t know it!
Now, in keeping with good style, you provide annotations for the toplevel definition of bucketBy to help the poor reader, perhaps yourself. Note that you must provide the Ord constraint due to the use of insertWith', whose type is
insertWith' :: Ord k => (a -> a -> a) -> k -> a -> Map k a -> Map k a
You may want to be really explicit and give an annotation for the inner go, but this requires use of the scoped type variables extension.
{-# LANGUAGE ScopedTypeVariables #-}
import Data.List (foldl')
import Data.Map (Map,elems,empty,insertWith')
bucketBy :: forall a b. Ord b => (a -> b) -> [a] -> [[a]]
bucketBy eq = elems . foldl' go empty
where go :: Map b [a] -> a -> Map b [a]
go m l = insertWith' (++) (eq l) [l] m
Without the extension and with a type annotation of
bucketBy :: Ord b => (a -> b) -> [a] -> [[a]]
the typechecker will fail with errors of the form
Could not deduce (b ~ b1)
from the context (Ord b)
bound by the type signature for
bucketBy :: Ord b => (a -> b) -> [a] -> [[a]]
at prog.hs:(10,1)-(12,46)
`b' is a rigid type variable bound by
the type signature for
bucketBy :: Ord b => (a -> b) -> [a] -> [[a]]
at prog.hs:10:1
`b1' is a rigid type variable bound by
the type signature for go :: Map b1 [a1] -> a1 -> Map b1 [a1]
at prog.hs:12:9
In the return type of a call of `eq'
In the second argument of `insertWith'', namely `(eq l)'
In the expression: insertWith' (++) (eq l) [l] m
This is because the typechecker treats the b on the inner type annotation as a distinct and entirely unrelated type b1 even though a human reader plainly sees the intent that they be the same type.
Read the scoped type variables documentation for details.
One last small surprise
You may wonder where the outer layer of brackets went. Notice that the type annotation generalized from
bucketBy :: ([Integer] -> Integer) -> [[Integer]] -> [[[Integer]]]
to
bucketBy :: forall a b. Ord b => (a -> b) -> [a] -> [[a]]
Note that [Integer] is itself another type, represented here as a.
groupBy splits the list into chunks of adjacent elements satisfying the given predicate. Since in your case, the [0,3] is separated from the [1,2] and [2,1], the first group includes only these. To collect all elements of the list having the same sum into one group, you need some preprocessing, e.g. with sortBy.
import Data.List
import Data.Function
import Data.Ord
groupBySum :: (Ord a, Num a) => [[a]] -> [[[a]]]
groupBySum xss = groups
where
ys = map (\xs -> (sum xs,xs)) xss
sortedSums = sortBy (comparing fst) ys
groupedSums = groupBy ((==) `on` fst) sortedSums
groups = map (map snd) groupedSums
From hackage:
The group function takes a list and returns a list of lists such that the concatenation of the result is equal to the argument.
groupBy is the same, except that you can specify your equality test. Thus, since in your input list [0,3] is not adjacent to [1,2] or [2,1], it is put on its own.
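The adjacency behaviour can be seen side by side in a short sketch. It uses `sortOn` from `Data.List` as the preprocessing step; the names `adjacentOnly` and `groupAllBySum` are hypothetical, and the element type is fixed to `Int` for simplicity.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- groupBy alone only merges neighbours whose sums are equal.
adjacentOnly :: [[Int]] -> [[[Int]]]
adjacentOnly = groupBy ((==) `on` sum)

-- Sorting by sum first makes all lists with equal sums adjacent.
-- sortOn is stable, so the original order is kept within each group.
groupAllBySum :: [[Int]] -> [[[Int]]]
groupAllBySum = groupBy ((==) `on` sum) . sortOn sum
```

For `[[1,2],[2,1],[5,0],[0,3],[1,9]]`, `adjacentOnly` gives `[[[1,2],[2,1]],[[5,0]],[[0,3]],[[1,9]]]`, while `groupAllBySum` gives `[[[1,2],[2,1],[0,3]],[[5,0]],[[1,9]]]`.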