List Nested Data Type Sum - haskell

I have this type
data List a = EmptyL | ConsL a (List (a,a))
and I wrote this function
lenL :: List a -> Int
lenL EmptyL = 0
lenL (ConsL x xs) = 1 + lenL xs
Can I write a function like this?
sumL :: List Int -> Int
How?

Sure:
data List a = EmptyL | ConsL a (List (a,a))
pair f (x, y) = (f x, f y)
nest :: (a -> b) -> List a -> List b
nest f EmptyL = EmptyL
nest f (ConsL x xs) = ConsL (f x) (nest (pair f) xs)
sumL :: List Int -> Int
sumL EmptyL = 0
sumL (ConsL x xs) = x + sumL (nest (uncurry (+)) xs)
We have:
*Main> sumL EmptyL
0
*Main> sumL (ConsL 1 EmptyL)
1
*Main> sumL (ConsL 1 (ConsL (2, 3) EmptyL))
6
The "magic" is explained in: http://www.cs.ox.ac.uk/jeremy.gibbons/publications/efolds.pdf
For completeness, here's a full definition in terms of the generalized fold as described in the paper:
import Prelude hiding (sum, fold)
data List a = EmptyL | ConsL (a, List (a, a))
nest :: (a -> b) -> List a -> List b
nest f EmptyL = EmptyL
nest f (ConsL (x, xs)) = ConsL (f x, nest (pair f) xs)
pair :: (a -> b) -> (a, a) -> (b, b)
pair f (x, y) = (f x, f y)
fold :: a -> ((b, a) -> a) -> ((b, b) -> b) -> List b -> a
fold e f g EmptyL = e
fold e f g (ConsL (x, xs)) = f (x, fold e f g (nest g xs))
sum :: List Int -> Int
sum = fold 0 (uncurry (+)) (uncurry (+))

The data type you have is not really for lists, more like complete binary trees. You can convert the trees you have to ordinary lists like this:
toList :: List a -> [a]
toList EmptyL = []
toList (ConsL x xs) = x:uncurry (++) (unzip (toList xs))
Not the most efficient code and the ordering is a bit arbitrary, but it should work. If you want the sum or anything else you can just use sum . toList.
Note that your lenL function does not compute the length of the resulting list, but rather the depth of the original tree. If you want the number of elements in the tree you can use length . toList.

Since sum is a method of Foldable, let's see how we'd implement foldMap:
data List a = EmptyL | ConsL a (List (a,a))
instance Foldable List where
foldMap _ EmptyL = mempty
foldMap f (ConsL a as) = f a <> foldMap (\(x,y) -> f x <> f y) as
We can write sumL = getSum . foldMap Sum.

Related

Haskell: Sort using Monoid and Foldable

I am trying to implement sorting using Monoid and Foldable. This is what I have so far. It is really slow. However, when I write the same functions without Monoid or Foldable, it is reasonably fast. Any pointers as to what I am doing wrong here would be greatly appreciated.
newtype MergeL a = MergeL { getMergeL :: [a] } deriving (Eq, Show)
instance Ord a => Monoid (MergeL a) where
mempty = MergeL []
mappend l r = MergeL $ merge (getMergeL l) (getMergeL r)
comp :: a -> MergeL a
comp a = MergeL [a]
instance Foldable MergeL where
foldMap f xs =
case divide xs of
(MergeL [], MergeL []) -> mempty
(MergeL l , MergeL []) -> foldMap f l
(MergeL [], MergeL r) -> foldMap f r
(MergeL l , MergeL r) -> foldMap f l <> foldMap f r
divide :: MergeL a -> (MergeL a, MergeL a)
-- now uses leftHalf and rightHalf
divide xs = (MergeL $ leftHalf ls, MergeL $ rightHalf ls)
where
ls = getMergeL xs
foldSort :: (Ord a, Foldable t) => t a -> [a]
foldSort = getMergeL . foldMap comp
mon :: Integer -> IO ()
mon n = (print . last . getMergeL . foldMap comp) $ MergeL [n,n - 1 ..0]
Shared helper functions:
leftHalf :: [a] -> [a]
leftHalf xs = take (length xs `div` 2) xs
rightHalf :: [a] -> [a]
rightHalf xs = drop (length xs `div` 2) xs
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys)
| (x <= y) = x:(merge xs (y:ys))
| otherwise = y:(merge (x:xs) ys)
Here is the implementation of the the sort function without Monoid. It uses the same leftHalf and rightHalf for spliting the list and the same merge for merging the lists:
mergesort :: Ord a => [a] -> [a]
mergesort [] = []
mergesort [x] = [x]
mergesort xs = merge (mergesort (leftHalf xs)) (mergesort (rightHalf xs))
plain :: Integer -> IO ()
plain n = (print . last . mergesort) [n,n - 1 ..0]
The difference in performance is:
λ> mon 4000
4000
(2.20 secs, 1,328,105,368 bytes)
λ> plain 4000
4000
(0.03 secs, 11,130,816 bytes)
The main problem here is quite easy to miss (in fact, I overlooked it until I threw in a trace in divide). One of your foldMap cases is:
(MergeL l , MergeL r) -> foldMap f l <> foldMap f r
There, foldMap is being called on l and r, which are plain lists, as opposed to MergeL-wrapped lists. That being so, l and r are not divided; rather, they are merged element by element. As a consequence, the sorting becomes quadratic.
In addition to using the MergeL foldMap recursively, fixing the instance also requires adding extra cases for single element lists, as dividing them is as problematic as dividing empty lists:
instance Foldable MergeL where
foldMap f xs =
case divide xs of
(MergeL [], MergeL []) -> mempty
(ml, MergeL [y]) -> foldMap f ml <> f y
(MergeL [x], mr) -> f x <> foldMap f mr
(ml, MergeL []) -> foldMap f ml
(MergeL [], mr) -> foldMap f mr
(ml, mr) -> foldMap f ml <> foldMap f mr
This gives acceptable performance -- same complexity and order of magnitude of timings than the plain implementation without optimisations, and about the same performance with optimisations.

Why can you reverse list with foldl, but not with foldr in Haskell

Why can you reverse a list with the foldl?
reverse' :: [a] -> [a]
reverse' xs = foldl (\acc x-> x : acc) [] xs
But this one gives me a compile error.
reverse' :: [a] -> [a]
reverse' xs = foldr (\acc x-> x : acc) [] xs
Error
Couldn't match expected type `a' with actual type `[a]'
`a' is a rigid type variable bound by
the type signature for reverse' :: [a] -> [a] at foldl.hs:33:13
Relevant bindings include
x :: [a] (bound at foldl.hs:34:27)
acc :: [a] (bound at foldl.hs:34:23)
xs :: [a] (bound at foldl.hs:34:10)
reverse' :: [a] -> [a] (bound at foldl.hs:34:1)
In the first argument of `(:)', namely `x'
In the expression: x : acc
Every foldl is a foldr.
Let's remember the definitions.
foldr :: (a -> s -> s) -> s -> [a] -> s
foldr f s [] = s
foldr f s (a : as) = f a (foldr f s as)
That's the standard issue one-step iterator for lists. I used to get my students to bang on the tables and chant "What do you do with the empty list? What do you do with a : as"? And that's how you figure out what s and f are, respectively.
If you think about what's happening, you see that foldr effectively computes a big composition of f a functions, then applies that composition to s.
foldr f s [1, 2, 3]
= f 1 . f 2 . f 3 . id $ s
Now, let's check out foldl
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t [] = t
foldl g t (a : as) = foldl g (g t a) as
That's also a one-step iteration over a list, but with an accumulator which changes as we go. Let's move it last, so that everything to the left of the list argument stays the same.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) [] t = t
flip (foldl g) (a : as) t = flip (foldl g) as (g t a)
Now we can see the one-step iteration if we move the = one place leftward.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) [] = \ t -> t
flip (foldl g) (a : as) = \ t -> flip (foldl g) as (g t a)
In each case, we compute what we would do if we knew the accumulator, abstracted with \ t ->. For [], we would return t. For a : as, we would process the tail with g t a as the accumulator.
But now we can transform flip (foldl g) into a foldr. Abstract out the recursive call.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) [] = \ t -> t
flip (foldl g) (a : as) = \ t -> s (g t a)
where s = flip (foldl g) as
And now we're good to turn it into a foldr where type s is instantiated with t -> t.
flip . foldl :: (t -> a -> t) -> [a] -> t -> t
flip (foldl g) = foldr (\ a s -> \ t -> s (g t a)) (\ t -> t)
So s says "what as would do with the accumulator" and we give back \ t -> s (g t a) which is "what a : as does with the accumulator". Flip back.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g = flip (foldr (\ a s -> \ t -> s (g t a)) (\ t -> t))
Eta-expand.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = flip (foldr (\ a s -> \ t -> s (g t a)) (\ t -> t)) t as
Reduce the flip.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = foldr (\ a s -> \ t -> s (g t a)) (\ t -> t) as t
So we compute "what we'd do if we knew the accumulator", and then we feed it the initial accumulator.
It's moderately instructive to golf that down a little. We can get rid of \ t ->.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = foldr (\ a s -> s . (`g` a)) id as t
Now let me reverse that composition using >>> from Control.Arrow.
foldl :: (t -> a -> t) -> t -> [a] -> t
foldl g t as = foldr (\ a s -> (`g` a) >>> s) id as t
That is, foldl computes a big reverse composition. So, for example, given [1,2,3], we get
foldr (\ a s -> (`g` a) >>> s) id [1,2,3] t
= ((`g` 1) >>> (`g` 2) >>> (`g` 3) >>> id) t
where the "pipeline" feeds its argument in from the left, so we get
((`g` 1) >>> (`g` 2) >>> (`g` 3) >>> id) t
= ((`g` 2) >>> (`g` 3) >>> id) (g t 1)
= ((`g` 3) >>> id) (g (g t 1) 2)
= id (g (g (g t 1) 2) 3)
= g (g (g t 1) 2) 3
and if you take g = flip (:) and t = [] you get
flip (:) (flip (:) (flip (:) [] 1) 2) 3
= flip (:) (flip (:) (1 : []) 2) 3
= flip (:) (2 : 1 : []) 3
= 3 : 2 : 1 : []
= [3, 2, 1]
That is,
reverse as = foldr (\ a s -> (a :) >>> s) id as []
by instantiating the general transformation of foldl to foldr.
For mathochists only. Do cabal install newtype and import Data.Monoid, Data.Foldable and Control.Newtype. Add the tragically missing instance:
instance Newtype (Dual o) o where
pack = Dual
unpack = getDual
Observe that, on the one hand, we can implement foldMap by foldr
foldMap :: Monoid x => (a -> x) -> [a] -> x
foldMap f = foldr (mappend . f) mempty
but also vice versa
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f = flip (ala' Endo foldMap f)
so that foldr accumulates in the monoid of composing endofunctions, but now to get foldl, we tell foldMap to work in the Dual monoid.
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl g = flip (ala' Endo (ala' Dual foldMap) (flip g))
What is mappend for Dual (Endo b)? Modulo wrapping, it's exactly the reverse composition, >>>.
For a start, the type signatures don't line up:
foldl :: (o -> i -> o) -> o -> [i] -> o
foldr :: (i -> o -> o) -> o -> [i] -> o
So if you swap your argument names:
reverse' xs = foldr (\ x acc -> x : acc) [] xs
Now it compiles. It won't work, but it compiles now.
The thing is, foldl, works from left to right (i.e., backwards), whereas foldr works right to left (i.e., forwards). And that's kind of why foldl lets you reverse a list; it hands you stuff in reverse order.
Having said all that, you can do
reverse' xs = foldr (\ x acc -> acc ++ [x]) [] xs
It'll be really slow, however. (Quadratic complexity rather than linear complexity.)
You can use foldr to reverse a list efficiently (well, most of the time in GHC 7.9—it relies on some compiler optimizations), but it's a little weird:
reverse xs = foldr (\x k -> \acc -> k (x:acc)) id xs []
I wrote an explanation of how this works on the Haskell Wiki.
foldr basically deconstructs a list, in the canonical way: foldr f initial is the same as a function with patterns:(this is basically the definition of foldr)
ff [] = initial
ff (x:xs) = f x $ ff xs
i.e. it un-conses the elements one by one and feeds them to f. Well, if all f does is cons them back again, then you get the list you originally had! (Another way to say that: foldr (:) [] ≡ id.
foldl "deconstructs" the list in inverse order, so if you cons back the elements you get the reverse list. To achieve the same result with foldr, you need to append to the "wrong" end – either as MathematicalOrchid showed, inefficiently with ++, or by using a difference list:
reverse'' :: [a] -> [a]
reverse'' l = dl2list $ foldr (\x accDL -> accDL ++. (x:)) empty l
type DList a = [a]->[a]
(++.) :: DList a -> DList a -> DList a
(++.) = (.)
emptyDL :: DList a
emptyDL = id
dl2list :: DLList a -> [a]
dl2list = ($[])
Which can be compactly written as
reverse''' l = foldr (flip(.) . (:)) id l []
This is what foldl op acc does with a list with, say, 6 elements:
(((((acc `op` x1) `op` x2) `op` x3) `op` x4) `op` x5 ) `op` x6
while foldr op acc does this:
x1 `op` (x2 `op` (x3 `op` (x4 `op` (x5 `op` (x6 `op` acc)))))
When you look at this, it becomes clear that if you want foldl to reverse the list, op should be a "stick the right operand to the beginning of the left operand" operator. Which is just (:) with arguments reversed, i.e.
reverse' = foldl (flip (:)) []
(this is the same as your version but using built-in functions).
When you want foldr to reverse the list, you need a "stick the left operand to the end of the right operand" operator. I don't know of a built-in function that does that; if you want you can write it as flip (++) . return.
reverse'' = foldr (flip (++) . return) []
or if you prefer to write it yourself
reverse'' = foldr (\x acc -> acc ++ [x]) []
This would be slow though.
A slight but significant generalization of several of these answers is that you can implement foldl with foldr, which I think is a clearer way of explaining what's going on in them:
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr step []
where step a bs = f a : bs
-- To fold from the left, we:
--
-- 1. Map each list element to an *endomorphism* (a function from one
-- type to itself; in this case, the type is `b`);
--
-- 2. Take the "flipped" (left-to-right) composition of these
-- functions;
--
-- 3. Apply the resulting function to the `z` argument.
--
myfoldl :: (b -> a -> b) -> b -> [a] -> b
myfoldl f z as = foldr (flip (.)) id (toEndos f as) z
where
toEndos :: (b -> a -> b) -> [a] -> [b -> b]
toEndos f = myMap (flip f)
myReverse :: [a] -> [a]
myReverse = myfoldl (flip (:)) []
For more explanation of the ideas here, I'd recommend reading Tom Ellis' "What is foldr made of?" and Brent Yorgey's "foldr is made of monoids".

Haskell: List Comprehensions and higher-order functions

I've tried to transform the following list comprehension:
f xs = [ x+8 | (x,_) <- xs ]
using higher-order functions.
My first solution was:
f' xs = map (\(x,_) -> x+8) xs
After I tried various other approaches, I found out that the following also works:
f' xs = map((+8).fst) xs
Both versions of f' give the same (correct) output, but I don't understand why (+8).fst is equal to \(x,_) -> x+8 when using map on a list of tuples.
The definition of fst is
fst :: (a, b) -> a
fst (a, _) = a
and the definition of (.) is
(.) :: (b -> c) -> (a -> b) -> a -> c
(f . g) = \x -> f (g x)
If we use these definitions to expand your function, we get
f' xs = map ((+8) . fst) xs
f' xs = map (\x -> (+8) (fst x)) xs -- definition of (.)
f' xs = map (\x -> (+8) ((\(a, _) -> a) x)) -- definition of fst
f' xs = map (\(a, _) -> (+8) a) -- we can move the pattern matching
f' xs = map (\(a, _) -> a + 8) -- expand section
Both versions of f' give the same (correct) output, but I don't understand why (+8).fst is equal to (x,_) -> x+8 when using map on a list of tuples.
The type of fst is:
fst :: (a, b) -> a
and what it does is it takes the first element of a pair (a tuple of two elements).
The type of (+8) is:
(+8) :: Num a => a -> a
and what it does is it takes as input a Num, applies + 8 to it and returns the result.
Now, the type of (+8) . fst is:
((+8).fst) :: Num c => (c, b) -> c
which is the composition of fst and (+8). Specifically it's the function that takes as input a pair, extracts the first element and adds 8 to it.
This can be easily seen by seen an example:
((+8).fst) (3, 'a')
-- 11
The same thing happens with \ (x, _) -> x + 8. You take a pair as input (in the lambda), pattern match the first argument to x, increment it by 8 and return it:
(\ (x, _) -> x + 8) (3, 'a')
-- 11

"generalised" scanl

I am trying to write a sort of scanl like function of type:
general_scanl' :: (a->b->a)->(a->b->[c])->a->[b]->[c]
The function is intended to output the same as the following two monstrosities:
general_scanl' f g x y = snd $ foldl' (\(p,q) r -> (f p r,q ++ g p r)) (x,[]) y
or,
general_scanl' f g x y = concat $ zipWith g (scanl f x y) y
The disadvantage of the first definition is that it contains a handwritten lambda.
The disadvantage of the second definition is that it accumulates a list of lists (scanl f x y) which isn't necessary...
My question: is there a cleaner way to define this function?
Many thanks,
You have
Prelude> let general_scanl2 f g z xs = concat $ zipWith g (scanl f z xs) xs
-- :: [a]
Prelude> :t general_scanl2
general_scanl2 :: (a -> b -> a) -- f
-> (a -> b -> [c]) -- g
-> a -- z
-> [b] -- xs
-> [c]
Prelude Data.List> :t mapAccumL
mapAccumL :: (a -> b -> (a, y)) -> a -> [b] -> (a, [y])
So, another way to write this seems to be
import Data.List
g_scanl3 :: (a -> b -> a) -> (a -> b -> [c]) -> a -> [b] -> [c]
g_scanl3 f g z xs = concat . snd $
mapAccumL (\a b-> (f a b, g a b)) z xs

Zipping with padding in Haskell

A couple of times I've found myself wanting a zip in Haskell that adds padding to the shorter list instead of truncating the longer one. This is easy enough to write. (Monoid works for me here, but you could also just pass in the elements that you want to use for padding.)
zipPad :: (Monoid a, Monoid b) => [a] -> [b] -> [(a, b)]
zipPad xs [] = zip xs (repeat mempty)
zipPad [] ys = zip (repeat mempty) ys
zipPad (x:xs) (y:ys) = (x, y) : zipPad xs ys
This approach gets ugly when trying to define zipPad3. I typed up the following and then realized that of course it doesn't work:
zipPad3 :: (Monoid a, Monoid b, Monoid c) => [a] -> [b] -> [c] -> [(a, b, c)]
zipPad3 xs [] [] = zip3 xs (repeat mempty) (repeat mempty)
zipPad3 [] ys [] = zip3 (repeat mempty) ys (repeat mempty)
zipPad3 [] [] zs = zip3 (repeat mempty) (repeat mempty) zs
zipPad3 xs ys [] = zip3 xs ys (repeat mempty)
zipPad3 xs [] zs = zip3 xs (repeat mempty) zs
zipPad3 [] ys zs = zip3 (repeat mempty) ys zs
zipPad3 (x:xs) (y:ys) (z:zs) = (x, y, z) : zipPad3 xs ys zs
At this point I cheated and just used length to pick the longest list and pad the others.
Am I overlooking a more elegant way to do this, or is something like zipPad3 already defined somewhere?
How about custom head and tail functions (named next and rest in my example below)?
import Data.Monoid
zipPad :: (Monoid a, Monoid b) => [a] -> [b] -> [(a,b)]
zipPad [] [] = []
zipPad xs ys = (next xs, next ys) : zipPad (rest xs) (rest ys)
zipPad3 :: (Monoid a, Monoid b, Monoid c) => [a] -> [b] -> [c] -> [(a,b,c)]
zipPad3 [] [] [] = []
zipPad3 xs ys zs = (next xs, next ys, next zs) : zipPad3 (rest xs) (rest ys) (rest zs)
next :: (Monoid a) => [a] -> a
next [] = mempty
next xs = head xs
rest :: (Monoid a) => [a] -> [a]
rest [] = []
rest xs = tail xs
Test snippet:
instance Monoid Int where
mempty = 0
mappend = (+)
main = do
print $ zipPad [1,2,3,4 :: Int] [1,2 :: Int]
print $ zipPad3 [1,2,3,4 :: Int] [9 :: Int] [1,2 :: Int]
Its output:
[(1,1),(2,2),(3,0),(4,0)]
[(1,9,1),(2,0,2),(3,0,0),(4,0,0)]
This pattern comes up quite a lot. A solution I learned from Paul Chiusano is as follows:
data These a b = This a | That b | These a b
class Align f where
align :: (These a b -> c) -> f a -> f b -> f c
instance Align [] where
align f [] [] = []
align f (x:xs) [] = f (This x) : align f xs []
align f [] (y:ys) = f (That y) : align f [] ys
align f (x:xs) (y:ys) = f (These x y) : align f xs ys
liftAlign2 f a b = align t
where t (This l) = f l b
t (That r) = f a r
t (These l r) = f l r
zipPad a b = liftAlign2 (,) a b
liftAlign3 f a b c xs ys = align t (zipPad a b xs ys)
where t (This (x,y)) = f x y c
t (That r) = f a b r
t (These (x,y) r) = f x y r
zipPad3 a b c = liftAlign3 (,,) a b c
A little test in ghci:
*Main> zipPad3 ["foo", "bar", "baz"] [2, 4, 6, 8] [True, False] "" 0 False
[("foo",2,True),("bar",4,False),("baz",6,False),("",8,False)]
A simpler way to do this is with Maybe. I will illustrate with Edward's
more general formulation:
import Data.Maybe
import Control.Applicative
zipWithTails l r f as bs = catMaybes . takeWhile isJust $
zipWith fMaybe (extend as) (extend bs)
where
extend xs = map Just xs ++ repeat Nothing
fMaybe a b = liftA2 f a b <|> fmap l a <|> fmap r b
There are times when you want to be able to apply a different function to either tail rather than just supply mempty or manual zeroes as well:
zipWithTail :: (a -> a -> a) -> [a] -> [a] -> [a]
zipWithTail f (a:as) (b:bs) = f a b : zipWithTails f as bs
zipWithTail f [] bs = bs
zipWithTail f as _ = as
zipWithTails :: (a -> c) -> (b -> c) -> (a -> b -> c) -> [a] -> [b] -> [c]
zipWithTails l r f (a:as) (b:bs) = f a b : zipWithTails l r f as bs
zipWithTails _ r _ [] bs = fmap r bs
zipWithTails l _ _ as _ = fmap l as
I use the former when I'm doing something like zipWithTail (+)
and the former when I need to do something like zipWithTail (*b) (a*) (\da db -> a*db+b*da) since the former can be much more efficient than feeding a default into a function, and the latter a little bit so.
However, if you just wanted to make a more succinct version of what you have, you could probably turn to mapAccumL ,but its not any clearer, and the ++ can be expensive.
zipPad as bs = done $ mapAccumL go as bs
where go (a:as) b = (as,(a,b))
go [] b = ([],(mempty,b))
done (cs, both) = both ++ fmap (\x -> (x, mempty)) cs

Resources