Haskell: Sort using Monoid and Foldable - haskell

I am trying to implement sorting using Monoid and Foldable. This is what I have so far. It is really slow. However, when I write the same functions without Monoid or Foldable, it is reasonably fast. Any pointers as to what I am doing wrong here would be greatly appreciated.
newtype MergeL a = MergeL { getMergeL :: [a] } deriving (Eq, Show)
instance Ord a => Monoid (MergeL a) where
  mempty = MergeL []
  mappend l r = MergeL $ merge (getMergeL l) (getMergeL r)
comp :: a -> MergeL a
comp a = MergeL [a]
instance Foldable MergeL where
  foldMap f xs =
    case divide xs of
      (MergeL [], MergeL []) -> mempty
      (MergeL l , MergeL []) -> foldMap f l
      (MergeL [], MergeL r) -> foldMap f r
      (MergeL l , MergeL r) -> foldMap f l <> foldMap f r
divide :: MergeL a -> (MergeL a, MergeL a)
-- now uses leftHalf and rightHalf
divide xs = (MergeL $ leftHalf ls, MergeL $ rightHalf ls)
  where
    ls = getMergeL xs
foldSort :: (Ord a, Foldable t) => t a -> [a]
foldSort = getMergeL . foldMap comp
mon :: Integer -> IO ()
mon n = (print . last . getMergeL . foldMap comp) $ MergeL [n,n - 1 ..0]
Shared helper functions:
leftHalf :: [a] -> [a]
leftHalf xs = take (length xs `div` 2) xs
rightHalf :: [a] -> [a]
rightHalf xs = drop (length xs `div` 2) xs
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys)
  | (x <= y) = x:(merge xs (y:ys))
  | otherwise = y:(merge (x:xs) ys)
Here is the implementation of the sort function without Monoid. It uses the same leftHalf and rightHalf for splitting the list and the same merge for merging the lists:
mergesort :: Ord a => [a] -> [a]
mergesort [] = []
mergesort [x] = [x]
mergesort xs = merge (mergesort (leftHalf xs)) (mergesort (rightHalf xs))
plain :: Integer -> IO ()
plain n = (print . last . mergesort) [n,n - 1 ..0]
The difference in performance is:
λ> mon 4000
4000
(2.20 secs, 1,328,105,368 bytes)
λ> plain 4000
4000
(0.03 secs, 11,130,816 bytes)

The main problem here is quite easy to miss (in fact, I overlooked it until I threw in a trace in divide). One of your foldMap cases is:
(MergeL l , MergeL r) -> foldMap f l <> foldMap f r
There, foldMap is being called on l and r, which are plain lists, as opposed to MergeL-wrapped lists. That being so, l and r are not divided; rather, they are merged element by element. As a consequence, the sorting becomes quadratic.
In addition to using the MergeL foldMap recursively, fixing the instance also requires adding extra cases for single element lists, as dividing them is as problematic as dividing empty lists:
instance Foldable MergeL where
  foldMap f xs =
    case divide xs of
      (MergeL [], MergeL []) -> mempty
      (ml, MergeL [y]) -> foldMap f ml <> f y
      (MergeL [x], mr) -> f x <> foldMap f mr
      (ml, MergeL []) -> foldMap f ml
      (MergeL [], mr) -> foldMap f mr
      (ml, mr) -> foldMap f ml <> foldMap f mr
This gives acceptable performance: the same complexity and order of magnitude of timings as the plain implementation without optimisations, and about the same performance with optimisations.
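For readers on a recent GHC, here is a minimal self-contained sketch of the same idea. It is a variant, not the exact instance above: it assumes base 4.11 or later (where Monoid requires a Semigroup instance), matches on the empty and singleton cases directly instead of inspecting the result of divide, and uses splitAt so that length is computed once per level. The name foldSort is reused from the question with a list-specialised type.

```haskell
newtype MergeL a = MergeL { getMergeL :: [a] } deriving (Eq, Show)

-- The merge step is the semigroup operation.
instance Ord a => Semigroup (MergeL a) where
  MergeL l <> MergeL r = MergeL (merge l r)

instance Ord a => Monoid (MergeL a) where
  mempty = MergeL []

merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys)
  | x <= y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- Split a wrapped list into two wrapped halves.
divide :: MergeL a -> (MergeL a, MergeL a)
divide (MergeL xs) = (MergeL l, MergeL r)
  where (l, r) = splitAt (length xs `div` 2) xs

instance Foldable MergeL where
  foldMap _ (MergeL [])  = mempty
  foldMap f (MergeL [x]) = f x
  -- Both recursive calls are on MergeL values, so they keep dividing.
  foldMap f xs = foldMap f l <> foldMap f r
    where (l, r) = divide xs

foldSort :: Ord a => [a] -> [a]
foldSort = getMergeL . foldMap (\x -> MergeL [x]) . MergeL
```

The important detail is that the recursive foldMap calls receive MergeL values rather than plain lists, so the balanced divide step is applied at every level.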

Related

Can `foldr` and `foldl` be defined in terms of each other?

Programming in Haskell by Hutton says
What do we need to define manually? The minimal complete definition for an instance of the
Foldable class is to define either foldMap or foldr, as all other functions in the class can be derived
from either of these two using the default definitions and the instance for lists.
So how can foldl be defined in terms of foldr?
Can foldr be defined in terms of foldl, so that we can define a Foldable type by defining foldl?
Why is it that in Foldable, fold is defined in terms of foldMap which is defined in terms of foldr, while in list foldable, some specializations of fold are defined in terms of foldl as:
maximum :: Ord a => [a] -> a
maximum = foldl1 max
minimum :: Ord a => [a] -> a
minimum = foldl1 min
sum :: Num a => [a] -> a
sum = foldl (+) 0
product :: Num a => [a] -> a
product = foldl (*) 1
? Can they be rewritten as
maximum :: Ord a => [a] -> a
maximum = foldr1 max
minimum :: Ord a => [a] -> a
minimum = foldr1 min
sum :: Num a => [a] -> a
sum = foldr (+) 0
product :: Num a => [a] -> a
product = foldr (*) 1
Thanks.
In general, neither foldr nor foldl can be implemented in terms of each other. The core operation of Foldable is foldMap, from which all the other operations may be derived. Neither foldr nor foldl are enough. However, the difference only shines through in the case of infinite or (partially) undefined structures, so there's a tendency to gloss over this fact.
@DamianLattenero has shown the "implementations" of foldl and foldr in terms of one another:
foldl' c = foldr (flip c)
foldr' c = foldl (flip c)
But they do not always have the correct behavior. Consider lists. Then, foldr (:) [] xs = xs for all xs :: [a]. However, foldr' (:) [] xs = xs does not hold for all xs, because foldr' (:) [] xs = foldl (flip (:)) [] xs, and foldl (in the case of lists) has to walk the entire spine of the list before it can produce an output. But, if xs is infinite, foldl can't walk the entire infinite list, so foldr' (:) [] xs loops forever for infinite xs, while foldr (:) [] xs just produces xs. foldl' = foldl as desired, however. Essentially, for [], foldr is "natural" and foldl is "unnatural". Implementing foldl with foldr works because you're just losing "naturalness", but implementing foldr in terms of foldl doesn't work, because you cannot recover that "natural" behavior.
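The productivity difference can be observed directly. The following is a small sketch; the names firstThree and foldrViaFoldl are made up for illustration.

```haskell
-- foldr is productive on an infinite list: only the demanded prefix is built.
firstThree :: [Int]
firstThree = take 3 (foldr (:) [] [1 ..])

-- The foldl-based stand-in from the text. On an infinite list it never
-- returns; on a finite list it visits the elements in the opposite order.
foldrViaFoldl :: (a -> b -> b) -> b -> [a] -> b
foldrViaFoldl c = foldl (flip c)
```

Evaluating take 3 (foldrViaFoldl (:) [] [1 ..]) in GHCi does not terminate, which is the behaviour described above.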
On the flipside, consider
data Tsil a = Lin | Snoc (Tsil a) a
-- backwards version of data [a] = [] | (:) a [a]
In this case, foldl is natural:
foldl c n Lin = n
foldl c n (Snoc xs x) = c (foldl c n xs) x
And foldr is unnatural:
foldr c = foldl (flip c)
Now, foldl has the good, "productive" behavior on infinite/partially undefined Tsils, while foldr does not. Implementing foldr in terms of foldl works (as I just did above), but you cannot implement foldl in terms of foldr, because you cannot recover that productivity.
foldMap avoids this issue. For []:
foldMap f [] = mempty
foldMap f (x : xs) = f x <> foldMap f xs
-- foldMap f = foldr (\x r -> f x <> r) mempty
And for Tsil:
foldMap f Lin = mempty
foldMap f (Snoc xs x) = foldMap f xs <> f x
-- foldMap f = foldl (\r x -> r <> f x) mempty
Now,
instance Semigroup [a] where
[] <> ys = ys
(x : xs) <> ys = x : (xs <> ys)
-- (<>) = (++)
instance Monoid [a] where mempty = []
instance Semigroup (Tsil a) where
ys <> Lin = ys
ys <> (Snoc xs x) = Snoc (ys <> xs) x
instance Monoid (Tsil a) where mempty = Lin
And we have
foldMap (: []) xs = xs -- even for infinite xs
foldMap (Snoc Lin) xs = xs -- even for infinite xs
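As a minimal sketch, Tsil can be given a Foldable instance from foldMap alone, leaving foldr, foldl, toList and sum to the class defaults. The value example is hypothetical and only there to exercise the instance.

```haskell
import Data.Foldable (toList)

data Tsil a = Lin | Snoc (Tsil a) a

instance Foldable Tsil where
  foldMap _ Lin         = mempty
  foldMap f (Snoc xs x) = foldMap f xs <> f x

-- The snoc-list holding 1, 2, 3 in that order.
example :: Tsil Int
example = Snoc (Snoc (Snoc Lin 1) 2) 3
```

With this instance, toList example yields the elements front to back, and the derived foldl and foldr agree with that order.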
Implementations for foldl and foldr are actually given in the documentation
foldr f z t = appEndo (foldMap (Endo . f) t ) z
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z
f is used to turn each a in the t a into a b -> b (Endo b), and then all the b -> bs are composed together (foldr does it one way, while foldl composes them backwards with Dual (Endo b)) and the final b -> b is then applied to the initial value z :: b.
foldr is replaced with foldl in specializations sum, minimum, etc. in the instance Foldable [], for performance reasons. The idea is that you can't take the sum of an infinite list anyway (this assumption is false, but it's generally true enough), so we don't need foldr to handle it. Using foldl is, in some cases, more performant than foldr, so foldr is changed to foldl. I would expect, for Tsil, that foldr is sometimes more performant than foldl, and therefore sum, minimum, etc. can be reimplemented in terms of foldr, instead of fold in order to get that performance improvement. Note that the documentation says that sum, minimum, etc. should be equivalent to the forms using foldMap/fold, but may be less defined, which is exactly what would happen.
Bit of an appendix, but I think it's worth noticing that:
genFoldr c n [] = n; genFoldr c n (x : xs) = c x (genFoldr c n xs)
instance Foldable [] where
  foldl c = genFoldr (flip c)
  foldr c = foldl (flip c)
-- similarly for Tsil
is actually a valid, lawful Foldable instance, where both foldr and foldl are unnatural and neither can handle infinite structures (foldMap is defaulted in terms of foldr, and thus won't handle infinite lists either). In this case, foldr and foldl can be written in terms of each other (foldl c = foldr (flip c), though it is implemented with genFoldr). However, this instance is undesirable, because we would really like a foldr that can handle infinite lists, so we instead implement
instance Foldable [] where
  foldr = genFoldr
  foldl c = foldr (flip c)
where the equality foldr c = foldl (flip c) no longer holds.
Here's a type for which neither foldl nor foldr can be implemented in terms of the other:
import Data.Functor.Reverse
import Data.Monoid
data DL a = DL [a] (Reverse [] a)
  deriving Foldable
The Foldable implementation looks like
instance Foldable DL where
  foldMap f (DL front rear) = foldMap f front <> foldMap f rear
Inlining the Foldable instance for Reverse [], and adding the corresponding foldr and foldl,
foldMap f (DL front rear) = foldMap f front <> getDual (foldMap (Dual . f) (getReverse rear))
foldr c n (DL xs (Reverse ys)) =
  foldr c (foldl (flip c) n ys) xs
foldl f b (DL xs (Reverse ys)) =
  foldr (flip f) (foldl f b xs) ys
If the front list is infinite, then foldr defined using foldl won't work. If the rear list is infinite, then foldl defined using foldr won't work.
In the case of lists: foldl can be defined in terms of foldr but not vice-versa.
foldl f a l = foldr (\b e c -> e (f c b)) id l a
For other types which implement Foldable: the opposite may be true.
Edit 2:
There is also another definition of foldl in terms of foldr that works (based on this article):
foldl f a list = (foldr construct (\acc -> acc) list) a
  where
    construct x r = \acc -> r (f acc x)
Edit 1
Flipping the arguments of the function does not turn foldr into foldl or vice versa, so the following definitions do not satisfy the foldr/foldl equality:
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
and foldl in terms of foldr:
foldl' :: Foldable t => (b -> a -> b) -> b -> t a -> b
foldl' f b = foldr (flip f) b
and foldr:
foldr' :: Foldable t => (a -> b -> b) -> b -> t a -> b
foldr' f b = foldl (flip f) b
The converse is not true, since foldr may work on infinite lists, which foldl variants can never do. However, for finite lists, foldr can also be written in terms of foldl, although laziness is lost in the process.
This example also fails the equality:
foldr (-) 2 [8,10] = 8 - (10 - 2) == 0
foldl (flip (-)) 2 [8,10] = (flip (-) (flip (-) 2 8) 10) == 4

List Nested Data Type Sum

I have this type
data List a = EmptyL | ConsL a (List (a,a))
and I wrote this function
lenL :: List a -> Int
lenL EmptyL = 0
lenL (ConsL x xs) = 1 + lenL xs
Can I write a function like this?
sumL :: List Int -> Int
How?
Sure:
data List a = EmptyL | ConsL a (List (a,a))
pair f (x, y) = (f x, f y)
nest :: (a -> b) -> List a -> List b
nest f EmptyL = EmptyL
nest f (ConsL x xs) = ConsL (f x) (nest (pair f) xs)
sumL :: List Int -> Int
sumL EmptyL = 0
sumL (ConsL x xs) = x + sumL (nest (uncurry (+)) xs)
We have:
*Main> sumL EmptyL
0
*Main> sumL (ConsL 1 EmptyL)
1
*Main> sumL (ConsL 1 (ConsL (2, 3) EmptyL))
6
The "magic" is explained in: http://www.cs.ox.ac.uk/jeremy.gibbons/publications/efolds.pdf
For completeness, here's a full definition in terms of the generalized fold as described in the paper:
import Prelude hiding (sum, fold)
data List a = EmptyL | ConsL (a, List (a, a))
nest :: (a -> b) -> List a -> List b
nest f EmptyL = EmptyL
nest f (ConsL (x, xs)) = ConsL (f x, nest (pair f) xs)
pair :: (a -> b) -> (a, a) -> (b, b)
pair f (x, y) = (f x, f y)
fold :: a -> ((b, a) -> a) -> ((b, b) -> b) -> List b -> a
fold e f g EmptyL = e
fold e f g (ConsL (x, xs)) = f (x, fold e f g (nest g xs))
sum :: List Int -> Int
sum = fold 0 (uncurry (+)) (uncurry (+))
The data type you have is not really for lists, more like complete binary trees. You can convert the trees you have to ordinary lists like this:
toList :: List a -> [a]
toList EmptyL = []
toList (ConsL x xs) = x:uncurry (++) (unzip (toList xs))
Not the most efficient code and the ordering is a bit arbitrary, but it should work. If you want the sum or anything else you can just use sum . toList.
Note that your lenL function does not compute the length of the resulting list, but rather the depth of the original tree. If you want the number of elements in the tree you can use length . toList.
Since sum is a method of Foldable, let's see how we'd implement foldMap:
data List a = EmptyL | ConsL a (List (a,a))
instance Foldable List where
foldMap _ EmptyL = mempty
foldMap f (ConsL a as) = f a <> foldMap (\(x,y) -> f x <> f y) as
We can write sumL = getSum . foldMap Sum.
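Put together as a self-contained sketch (Sum and getSum come from Data.Monoid; the rest is as above):

```haskell
import Data.Monoid (Sum (..))

data List a = EmptyL | ConsL a (List (a, a))

instance Foldable List where
  foldMap _ EmptyL       = mempty
  -- The tail holds pairs, so the mapping function is lifted to pairs.
  foldMap f (ConsL a as) = f a <> foldMap (\(x, y) -> f x <> f y) as

sumL :: List Int -> Int
sumL = getSum . foldMap Sum
```

The recursive call is at type List (a, a), which is accepted because the type of foldMap is fixed by the class. The derived length then counts elements rather than levels.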

Recursion scheme in Haskell for repeatedly breaking datatypes into "head" and "tail" and yielding a structure of results

In Haskell, I recently found the following function useful:
listCase :: (a -> [a] -> b) -> [a] -> [b]
listCase f [] = []
listCase f (x:xs) = f x xs : listCase f xs
I used it to generate sliding windows of size 3 from a list, like this:
*Main> listCase (\_ -> take 3) [1..5]
[[2,3,4],[3,4,5],[4,5],[5],[]]
Is there a more general recursion scheme which captures this pattern? More specifically, one that allows you to generate some structure of results by repeatedly breaking data into a "head" and "tail"?
What you are asking for is a comonad. This may sound scarier than monad, but is a simpler concept (YMMV).
Comonads are Functors with additional structure:
class Functor w => Comonad w where
  extract :: w a -> a
  duplicate :: w a -> w (w a)
  extend :: (w a -> b) -> w a -> w b
(extend and duplicate can be defined in terms of each other)
and laws similar to the monad laws:
extract . duplicate = id
fmap extract . duplicate = id
duplicate . duplicate = fmap duplicate . duplicate
Specifically, the signature (a -> [a] -> b) takes non-empty lists of type a. The usual type [a] is not an instance of Comonad, but the non-empty lists are:
data NE a = T a | a :. NE a deriving Functor
instance Comonad NE where
  extract (T x) = x
  extract (x :. _) = x
  duplicate z@(T _) = T z
  duplicate z@(_ :. xs) = z :. duplicate xs
The comonad laws allow only this instance for non-empty lists (actually a second one).
Your function then becomes
extend (take 3 . drop 1 . toList)
Where toList :: NE a -> [a] is obvious.
This is worse than the original, but extend can be written as =>> which is simpler if applied repeatedly.
For further information, you may start at What is the Comonad typeclass in Haskell?.
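A self-contained sketch of this answer follows. It defines the Comonad class locally (with a default for extend) instead of importing it from the comonad package. The names neToList and windows, and the fixity declaration for (:.), are assumptions added for illustration.

```haskell
{-# LANGUAGE DeriveFunctor #-}

class Functor w => Comonad w where
  extract   :: w a -> a
  duplicate :: w a -> w (w a)
  extend    :: (w a -> b) -> w a -> w b
  extend f = fmap f . duplicate

infixr 5 :.
data NE a = T a | a :. NE a deriving Functor

instance Comonad NE where
  extract (T x)    = x
  extract (x :. _) = x
  -- duplicate replaces every position by the suffix starting there.
  duplicate z@(T _)     = T z
  duplicate z@(_ :. xs) = z :. duplicate xs

neToList :: NE a -> [a]
neToList (T x)     = [x]
neToList (x :. xs) = x : neToList xs

-- At every position, look at up to three elements after the current one.
windows :: NE a -> NE [a]
windows = extend (take 3 . drop 1 . neToList)
```

Applied to the non-empty list 1 :. 2 :. 3 :. 4 :. T 5, windows reproduces the listCase output from the question.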
This looks like a special case of a (jargon here but it can help with googling) paramorphism, a generalisation of primitive recursion to all initial algebras.
Reimplementing ListCase
Let's have a look at how to reimplement your function using such a combinator. First we define the notion of paramorphism: a recursion principle where not only the result of the recursive call is available but also the entire substructure this call was performed on:
The type of paraList tells me that in the (:) case, I will have access to the head, the tail and the value of the recursive call on the tail and that I need to provide a value for the base case.
module ListCase where
paraList :: (a -> [a] -> b -> b) -- cons
         -> b -- nil
         -> [a] -> b -- resulting function on lists
paraList c n [] = n
paraList c n (x : xs) = c x xs $ paraList c n xs
We can now give an alternative definition of listCase:
listCase' :: (a -> [a] -> b) -> [a] -> [b]
listCase' c = paraList (\ x xs tl -> c x xs : tl) []
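To see why access to the untouched tail matters, here is another small paramorphism sketch. The function insertSorted is a hypothetical example, not part of the original answer; paraList is repeated so the block stands alone.

```haskell
paraList :: (a -> [a] -> b -> b) -> b -> [a] -> b
paraList _ n []       = n
paraList c n (x : xs) = c x xs (paraList c n xs)

-- Insert y into an already sorted list.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted y = paraList step [y]
  where
    step x xs rest
      | y <= x    = y : x : xs   -- stop here and reuse the original tail
      | otherwise = x : rest     -- keep going with the recursive result
```

In the first branch the original tail xs is returned unchanged, which a plain foldr cannot express because it only sees the recursive result.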
Considering the general case
In the general case, we are interested in building a definition of paramorphism for all data structures defined as the fixpoint of a (strictly positive) functor. We use the traditional fixpoint operator:
newtype Fix f = Fix { unFix :: f (Fix f) }
This builds an inductive structure layer by layer. The layers have an f shape, which may be better grasped by recalling the definition of List using this formalism. A layer is either Nothing (we're done!) or Just (head, tail):
newtype ListF a as = ListF { unListF :: Maybe (a, as) }
type List a = Fix (ListF a)
nil :: List a
nil = Fix $ ListF $ Nothing
cons :: a -> List a -> List a
cons = curry $ Fix . ListF .Just
Now that we have this general framework, we can define para generically for all Fix f where f is a functor:
para :: Functor f => (f (Fix f, b) -> b) -> Fix f -> b
para alg = alg . fmap (\ rec -> (rec, para alg rec)) . unFix
Of course, ListF a is a functor. Meaning we could use para to reimplement paraList and listCase.
instance Functor (ListF a) where fmap f = ListF . fmap (fmap f) . unListF
paraList' :: (a -> List a -> b -> b) -> b -> List a -> b
paraList' c n = para $ maybe n (\ (a, (as, b)) -> c a as b) . unListF
listCase'' :: (a -> List a -> b) -> List a -> List b
listCase'' c = paraList' (\ x xs tl -> cons (c x xs) tl) nil
You can implement a simple bijection toList, fromList to test it if you want. I could not be bothered to reimplement take so it's pretty ugly:
toList :: [a] -> List a
toList = foldr cons nil
fromList :: List a -> [a]
fromList = paraList' (\ x _ tl -> x : tl) []
*ListCase> fmap fromList . fromList . listCase'' (\ _ as -> toList $ take 3 $ fromList as). toList $ [1..5]
[[2,3,4],[3,4,5],[4,5],[5],[]]

Why can you reverse list with foldl, but not with foldr in Haskell

Why can you reverse a list with the foldl?
reverse' :: [a] -> [a]
reverse' xs = foldl (\acc x-> x : acc) [] xs
But this one gives me a compile error.
reverse' :: [a] -> [a]
reverse' xs = foldr (\acc x-> x : acc) [] xs
Error
Couldn't match expected type `a' with actual type `[a]'
`a' is a rigid type variable bound by
the type signature for reverse' :: [a] -> [a] at foldl.hs:33:13
Relevant bindings include
x :: [a] (bound at foldl.hs:34:27)
acc :: [a] (bound at foldl.hs:34:23)
xs :: [a] (bound at foldl.hs:34:10)
reverse' :: [a] -> [a] (bound at foldl.hs:34:1)
In the first argument of `(:)', namely `x'
In the expression: x : acc
Every foldl is a foldr.
Let's remember the definitions.
foldr :: (a -> s -> s) -> s -> [a] -> s
foldr f s [] = s
foldr f s (a : as) = f a (foldr f s as)
That's the standard issue one-step iterator for lists. I used to get my students to bang on the tables and chant "What do you do with the empty list? What do you do with a : as"? And that's how you figure out what s and f are, respectively.
If you think about what's happening, you see that foldr effectively computes a big composition of f a functions, then applies that composition to s.
foldr f s [1, 2, 3]
= f 1 . f 2 . f 3 . id $ s
Now, let's check out foldl
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t [] = t
foldl g t (a : as) = foldl g (g t a) as
That's also a one-step iteration over a list, but with an accumulator which changes as we go. Let's move it last, so that everything to the left of the list argument stays the same.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) [] t = t
flip (foldl g) (a : as) t = flip (foldl g) as (g t a)
Now we can see the one-step iteration if we move the = one place leftward.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) [] = \ t -> t
flip (foldl g) (a : as) = \ t -> flip (foldl g) as (g t a)
In each case, we compute what we would do if we knew the accumulator, abstracted with \ t ->. For [], we would return t. For a : as, we would process the tail with g t a as the accumulator.
But now we can transform flip (foldl g) into a foldr. Abstract out the recursive call.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) [] = \ t -> t
flip (foldl g) (a : as) = \ t -> s (g t a)
  where s = flip (foldl g) as
And now we're good to turn it into a foldr where type s is instantiated with t -> t.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) = foldr (\ a s -> \ t -> s (g t a)) (\ t -> t)
So s says "what as would do with the accumulator" and we give back \ t -> s (g t a) which is "what a : as does with the accumulator". Flip back.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g = flip (foldr (\ a s -> \ t -> s (g t a)) (\ t -> t))
Eta-expand.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = flip (foldr (\ a s -> \ t -> s (g t a)) (\ t -> t)) t as
Reduce the flip.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = foldr (\ a s -> \ t -> s (g t a)) (\ t -> t) as t
So we compute "what we'd do if we knew the accumulator", and then we feed it the initial accumulator.
It's moderately instructive to golf that down a little. We can get rid of \ t ->.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = foldr (\ a s -> s . (`g` a)) id as t
Now let me reverse that composition using >>> from Control.Arrow.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = foldr (\ a s -> (`g` a) >>> s) id as t
That is, foldl computes a big reverse composition. So, for example, given [1,2,3], we get
foldr (\ a s -> (`g` a) >>> s) id [1,2,3] t
= ((`g` 1) >>> (`g` 2) >>> (`g` 3) >>> id) t
where the "pipeline" feeds its argument in from the left, so we get
((`g` 1) >>> (`g` 2) >>> (`g` 3) >>> id) t
= ((`g` 2) >>> (`g` 3) >>> id) (g t 1)
= ((`g` 3) >>> id) (g (g t 1) 2)
= id (g (g (g t 1) 2) 3)
= g (g (g t 1) 2) 3
and if you take g = flip (:) and t = [] you get
flip (:) (flip (:) (flip (:) [] 1) 2) 3
= flip (:) (flip (:) (1 : []) 2) 3
= flip (:) (2 : 1 : []) 3
= 3 : 2 : 1 : []
= [3, 2, 1]
That is,
reverse as = foldr (\ a s -> (a :) >>> s) id as []
by instantiating the general transformation of foldl to foldr.
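The end result of the derivation can be checked as ordinary code. This is a small sketch; the names foldlViaFoldr and reverseViaFoldr are chosen here to avoid clashing with the Prelude.

```haskell
import Control.Arrow ((>>>))

-- foldl as a foldr that builds a left-to-right pipeline of accumulator updates.
foldlViaFoldr :: (t -> a -> t) -> t -> [a] -> t
foldlViaFoldr g t as = foldr (\a s -> (`g` a) >>> s) id as t

-- The instance with g = flip (:) and t = [].
reverseViaFoldr :: [a] -> [a]
reverseViaFoldr as = foldr (\a s -> (a :) >>> s) id as []
```

For functions, f >>> g is g . f, so each list element contributes one step that runs before the steps of the elements after it.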
For mathochists only. Do cabal install newtype and import Data.Monoid, Data.Foldable and Control.Newtype. Add the tragically missing instance:
instance Newtype (Dual o) o where
pack = Dual
unpack = getDual
Observe that, on the one hand, we can implement foldMap by foldr
foldMap :: Monoid x => (a -> x) -> [a] -> x
foldMap f = foldr (mappend . f) mempty
but also vice versa
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f = flip (ala' Endo foldMap f)
so that foldr accumulates in the monoid of composing endofunctions, but now to get foldl, we tell foldMap to work in the Dual monoid.
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl g = flip (ala' Endo (ala' Dual foldMap) (flip g))
What is mappend for Dual (Endo b)? Modulo wrapping, it's exactly the reverse composition, >>>.
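The same idea can be sketched without the newtype package, using only Endo and Dual from Data.Monoid. The names plainCompose, dualCompose and foldlViaDual are made up for this example.

```haskell
import Data.Monoid (Dual (..), Endo (..))

-- Plain Endo composes right to left, like (.): multiply first, then add.
plainCompose :: Int -> Int
plainCompose = appEndo (Endo (+ 1) <> Endo (* 2))

-- Dual flips mappend, so the same pieces run left to right, like (>>>).
dualCompose :: Int -> Int
dualCompose = appEndo (getDual (Dual (Endo (+ 1)) <> Dual (Endo (* 2))))

-- foldl obtained by running foldMap in the Dual (Endo b) monoid.
foldlViaDual :: (b -> a -> b) -> b -> [a] -> b
foldlViaDual g z as = appEndo (getDual (foldMap (Dual . Endo . flip g) as)) z
```

Here plainCompose 3 computes 3 * 2 + 1, while dualCompose 3 computes (3 + 1) * 2.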
For a start, the type signatures don't line up:
foldl :: (o -> i -> o) -> o -> [i] -> o
foldr :: (i -> o -> o) -> o -> [i] -> o
So if you swap your argument names:
reverse' xs = foldr (\ x acc -> x : acc) [] xs
Now it compiles. It won't work, but it compiles now.
The thing is, foldl works from left to right (i.e., backwards), whereas foldr works right to left (i.e., forwards). And that's kind of why foldl lets you reverse a list; it hands you stuff in reverse order.
Having said all that, you can do
reverse' xs = foldr (\ x acc -> acc ++ [x]) [] xs
It'll be really slow, however. (Quadratic complexity rather than linear complexity.)
You can use foldr to reverse a list efficiently (well, most of the time in GHC 7.9—it relies on some compiler optimizations), but it's a little weird:
reverse xs = foldr (\x k -> \acc -> k (x:acc)) id xs []
I wrote an explanation of how this works on the Haskell Wiki.
foldr basically deconstructs a list, in the canonical way: foldr f initial is the same as a function with patterns (this is basically the definition of foldr):
ff [] = initial
ff (x:xs) = f x $ ff xs
i.e. it un-conses the elements one by one and feeds them to f. Well, if all f does is cons them back again, then you get the list you originally had! (Another way to say that: foldr (:) [] ≡ id.)
foldl "deconstructs" the list in inverse order, so if you cons back the elements you get the reverse list. To achieve the same result with foldr, you need to append to the "wrong" end – either as MathematicalOrchid showed, inefficiently with ++, or by using a difference list:
reverse'' :: [a] -> [a]
reverse'' l = dl2list $ foldr (\x accDL -> accDL ++. (x:)) emptyDL l
type DList a = [a]->[a]
(++.) :: DList a -> DList a -> DList a
(++.) = (.)
emptyDL :: DList a
emptyDL = id
dl2list :: DList a -> [a]
dl2list = ($[])
Which can be compactly written as
reverse''' l = foldr (flip(.) . (:)) id l []
This is what foldl op acc does with a list with, say, 6 elements:
(((((acc `op` x1) `op` x2) `op` x3) `op` x4) `op` x5 ) `op` x6
while foldr op acc does this:
x1 `op` (x2 `op` (x3 `op` (x4 `op` (x5 `op` (x6 `op` acc)))))
When you look at this, it becomes clear that if you want foldl to reverse the list, op should be a "stick the right operand to the beginning of the left operand" operator. Which is just (:) with arguments reversed, i.e.
reverse' = foldl (flip (:)) []
(this is the same as your version but using built-in functions).
When you want foldr to reverse the list, you need a "stick the left operand to the end of the right operand" operator. I don't know of a built-in function that does that; if you want you can write it as flip (++) . return.
reverse'' = foldr (flip (++) . return) []
or if you prefer to write it yourself
reverse'' = foldr (\x acc -> acc ++ [x]) []
This would be slow though.
A slight but significant generalization of several of these answers is that you can implement foldl with foldr, which I think is a clearer way of explaining what's going on in them:
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr step []
  where step a bs = f a : bs
-- To fold from the left, we:
--
-- 1. Map each list element to an *endomorphism* (a function from one
-- type to itself; in this case, the type is `b`);
--
-- 2. Take the "flipped" (left-to-right) composition of these
-- functions;
--
-- 3. Apply the resulting function to the `z` argument.
--
myfoldl :: (b -> a -> b) -> b -> [a] -> b
myfoldl f z as = foldr (flip (.)) id (toEndos f as) z
  where
    toEndos :: (b -> a -> b) -> [a] -> [b -> b]
    toEndos f = myMap (flip f)
myReverse :: [a] -> [a]
myReverse = myfoldl (flip (:)) []
For more explanation of the ideas here, I'd recommend reading Tom Ellis' "What is foldr made of?" and Brent Yorgey's "foldr is made of monoids".

How to partition a list in Haskell?

I want to take a list (or a string) and split it into sub-lists of N elements. How do I do it in Haskell?
Example:
mysteryFunction 2 "abcdefgh"
["ab", "cd", "ef", "gh"]
cabal update
cabal install split
And then use chunksOf from Data.List.Split
Here's one option:
partition :: Int -> [a] -> [[a]]
partition _ [] = []
partition n xs = (take n xs) : (partition n (drop n xs))
And here's a tail recursive version of that function:
partition :: Int -> [a] -> [[a]]
partition n xs = partition' n xs []
  where
    partition' _ [] acc = reverse acc
    partition' n xs acc = partition' n (drop n xs) ((take n xs) : acc)
You could use:
mysteryFunction :: Int -> [a] -> [[a]]
mysteryFunction n list = unfoldr takeList list
  where takeList [] = Nothing
        takeList l = Just $ splitAt n l
or alternatively:
mysteryFunction :: Int -> [a] -> [[a]]
mysteryFunction n list = unfoldr (\l -> if null l then Nothing else Just $ splitAt n l) list
Note this puts any remaining elements in the last list, for example
mysteryFunction 2 "abcdefg" = ["ab", "cd", "ef", "g"]
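A self-contained variant of the unfoldr approach is sketched below. The name chunksOf' and the handling of non-positive sizes are assumptions made for this example; with n <= 0 the definitions above would produce an endless stream of empty lists.

```haskell
import Data.List (unfoldr)

chunksOf' :: Int -> [a] -> [[a]]
chunksOf' n
  | n <= 0    = const []        -- assumption: non-positive sizes give no chunks
  | otherwise = unfoldr step
  where
    step [] = Nothing           -- nothing left: stop unfolding
    step xs = Just (splitAt n xs)
```

Because unfoldr is lazy, this also works on infinite input as long as only a finite prefix of the result is demanded.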
import Data.List
import Data.Function
mysteryFunction n = map (map snd) . groupBy ((==) `on` fst) . zip ([0..] >>= replicate n)
... just kidding...
mysteryFunction x "" = []
mysteryFunction x s = take x s : mysteryFunction x (drop x s)
Probably not the elegant solution you had in mind.
There's already
Prelude Data.List> :t either
either :: (a -> c) -> (b -> c) -> Either a b -> c
and
Prelude Data.List> :t maybe
maybe :: b -> (a -> b) -> Maybe a -> b
so there really should be
list :: t -> ([a] -> t) -> [a] -> t
list n _ [] = n
list _ c xs = c xs
as well. With it,
import Data.List (unfoldr)
g n = unfoldr $ list Nothing (Just . splitAt n)
without it,
g n = takeWhile (not.null) . unfoldr (Just . splitAt n)
A fancy answer.
In the answers above you have to use splitAt, which is recursive, too. Let's see how we can build a recursive solution from scratch.
Functor L(X)=1+A*X can map X into a 1 or split it into a pair of A and X, and has List(A) as its minimal fixed point: List(A) can be mapped into 1+A*List(A) and back using an isomorphism; in other words, we have one way to decompose a non-empty list, and only one way to represent an empty list.
Functor F(X)=List(A)+A*X is similar, but the tail of the list is no longer an empty list - "1" - so the functor is able to extract a value A or turn X into a list of As. Then List(A) is its fixed point (but no longer the minimal fixed point), and the functor can represent any given list as a List, or as a pair of an element and a list. In effect, any coalgebra can "stop" decomposing the list "at will".
{-# LANGUAGE DeriveFunctor #-}
import Data.Functor.Foldable
data N a x = Z [a] | S a x deriving (Functor)
(which is the same as adding the following trivial instance):
instance Functor (N a) where
  fmap f (Z xs) = Z xs
  fmap f (S x y) = S x $ f y
Consider the definition of hylomorphism:
hylo :: Functor f => (f b -> b) -> (c -> f c) -> c -> b
hylo psi phi = psi . fmap (hylo psi phi) . phi
Given a seed value, it uses phi to produce f c, to which fmap applies hylo psi phi recursively, and psi then extracts b from the fmapped structure f b.
A hylomorphism for the pair of (co)algebras for this functor is a splitAt:
splitAt :: Int -> [a] -> ([a],[a])
splitAt n xs = hylo psi phi (n, xs) where
    phi (n, []) = Z []
    phi (0, xs) = Z xs
    phi (n, (x:xs)) = S x (n-1, xs)
This coalgebra extracts a head, as long as there is a head to extract and the counter of extracted elements is not zero. This is because of how the functor was defined: as long as phi produces S x y, hylo will feed y into phi as the next seed; once Z xs is produced, functor no longer applies hylo psi phi to it, and the recursion stops.
At the same time hylo will re-map the structure into a pair of lists:
    psi (Z ys) = ([], ys)
    psi (S h (t, b)) = (h:t, b)
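Assembled into one module, the pieces might look like the following sketch. It defines hylo locally rather than importing it from recursion-schemes, names the result splitAt' to avoid the Prelude clash, and stops on k <= 0 instead of exactly 0; that last change is an assumption made here so that negative counts behave like zero.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data N a x = Z [a] | S a x deriving Functor

hylo :: Functor f => (f b -> b) -> (c -> f c) -> c -> b
hylo psi phi = psi . fmap (hylo psi phi) . phi

splitAt' :: Int -> [a] -> ([a], [a])
splitAt' n xs = hylo psi phi (n, xs)
  where
    -- Coalgebra: peel off one element while the counter is positive.
    phi (_, [])     = Z []
    phi (k, ys) | k <= 0 = Z ys
    phi (k, y : ys) = S y (k - 1, ys)
    -- Algebra: rebuild the prefix and carry the remainder through.
    psi (Z ys)       = ([], ys)
    psi (S h (t, b)) = (h : t, b)
```

The recursion stops as soon as phi returns a Z layer, because fmap has nothing to recurse into there.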
So now we know how splitAt works. We can extend that to splitList using apomorphism:
splitList :: Int -> [a] -> [[a]]
splitList n xs = apo (hylo psi phi) (n, xs) where
    phi (n, []) = Z []
    phi (0, xs) = Z xs
    phi (n, (x:xs)) = S x (n-1, xs)
    psi (Z []) = Cons [] $ Left []
    psi (Z ys) = Cons [] $ Right (n, ys)
    psi (S h (Cons t b)) = Cons (h:t) b
This time the re-mapping is fitted for use with apomorphism: as long as it is Right, apomorphism will keep using hylo psi phi to produce the next element of the list; if it is Left, it produces the rest of the list in one step (in this case, just finishes off the list with []).
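The Left/Right behaviour of an apomorphism can be illustrated without the recursion-schemes library. The sketch below uses a hand-written, list-specialised apoList (a hypothetical helper, not the library's apo) and Prelude's splitAt instead of the hylomorphism; it assumes n > 0.

```haskell
-- An apomorphism specialised to lists: each step yields one element and
-- either the whole remaining result (Left) or a new seed (Right).
apoList :: (s -> Maybe (a, Either [a] s)) -> s -> [a]
apoList g s = case g s of
  Nothing            -> []
  Just (x, Left r)   -> x : r
  Just (x, Right s') -> x : apoList g s'

-- Assumes n > 0; a non-positive n would yield empty chunks forever.
chunks :: Int -> [a] -> [[a]]
chunks n = apoList step
  where
    step [] = Nothing
    step xs = case splitAt n xs of
      (c, [])   -> Just (c, Left [])      -- last chunk: finish in one step
      (c, rest) -> Just (c, Right rest)   -- more input: continue from rest
```

The Left branch is what distinguishes this from a plain unfold: the final step hands back the rest of the result directly instead of producing another seed.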
