Multiplying Streams (representing polynomial coefficients) - haskell

For this 2013 homework, I'm trying to multiply 2 Streams.
xStream :: Stream Integer
xStream = Cons 0 (Cons 1 $ streamRepeat 0)
instance Num (Stream Integer) where
    fromInteger x = Cons x $ streamRepeat 0
    negate = streamMap (* (-1))
    (+) xs ys = combineStreams (+) xs ys
    (*) xs ys = multStreams xs ys
    abs = streamMap abs
Here's the professor's help for how to implement multiplication of the above Stream:
Multiplication is a bit trickier. Suppose A = a0 + xA' and B = b0 + xB' are two generating functions we wish to multiply. We reason as follows:
AB = (a0 + xA')B
   = a0B + xA'B
   = a0(b0 + xB') + xA'B
   = a0b0 + x(a0B' + A'B)
Here's my attempt:
multStreams :: Stream Integer -> Stream Integer -> Stream Integer
multStreams (Cons x xs) b@(Cons y ys) = addXY + rest
  where addXY = Cons (x + y) $ streamRepeat 0
        rest = (xStream *) $ (streamMap (*x) ys + (xs * b))
with the following definitions:
data Stream a = Cons a (Stream a)
streamRepeat :: a -> Stream a
streamRepeat x = Cons x (streamRepeat x)
streamMap :: (a -> b) -> Stream a -> Stream b
streamMap f (Cons x xs) = Cons (f x) rest
  where rest = streamMap f xs
combineStreams :: (a -> b -> c) -> Stream a -> Stream b -> Stream c
combineStreams f (Cons x xs) (Cons y ys) = Cons (f x y) rest
  where rest = combineStreams f xs ys
Note that xStream is the same as x per this related question.
When I tried the above implementation, my call to multStreams does not terminate.
Please help me understand what's wrong with my multStreams function above, both in its implementation and in whether I implemented the professor's explanation of multiplication correctly at all.

The fundamental problem is that your definition of multStreams directly uses (*) on Stream in the definition of rest, which isn't what was intended by the given reasoning.
If you consider the equation AB = a0b0 + x(a0B' + A'B), it tells you precisely what the first term of AB should be: a0b0 is a constant, i.e. the first term, and everything else is multiplied by x, i.e. not part of the first term.
It also tells you that the remaining terms of AB come from a0B' + A'B, because shifting a stream along by one with a Cons is equivalent to multiplying by x.
The key difference with what you've done is that the first element of the output stream can be constructed without any recursive calls to (*), even though the remaining elements use one.
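So something like this should work (note also that the constant term is the product a0b0, so the head is x * y rather than the x + y in your attempt):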
multStreams :: Stream Integer -> Stream Integer -> Stream Integer
multStreams (Cons x xs) b@(Cons y ys) =
    Cons (x * y) (streamMap (*x) ys + multStreams xs b)
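To try the corrected definition end to end, here is a minimal self-contained sketch. It assumes the FlexibleInstances extension (needed for an instance on Stream Integer), and it adds two things that are not in the question: a signum method to complete the Num instance, and a hypothetical helper streamTake for inspecting the first few coefficients.

```haskell
{-# LANGUAGE FlexibleInstances #-}

data Stream a = Cons a (Stream a)

streamRepeat :: a -> Stream a
streamRepeat x = Cons x (streamRepeat x)

streamMap :: (a -> b) -> Stream a -> Stream b
streamMap f (Cons x xs) = Cons (f x) (streamMap f xs)

combineStreams :: (a -> b -> c) -> Stream a -> Stream b -> Stream c
combineStreams f (Cons x xs) (Cons y ys) = Cons (f x y) (combineStreams f xs ys)

-- Hypothetical helper (not in the question): the first n coefficients.
streamTake :: Int -> Stream a -> [a]
streamTake n (Cons x xs)
  | n <= 0    = []
  | otherwise = x : streamTake (n - 1) xs

-- The polynomial "x": coefficients 0, 1, 0, 0, ...
xStream :: Stream Integer
xStream = Cons 0 (Cons 1 (streamRepeat 0))

instance Num (Stream Integer) where
  fromInteger n = Cons n (streamRepeat 0)
  negate        = streamMap negate
  (+)           = combineStreams (+)
  (*)           = multStreams
  abs           = streamMap abs
  signum        = streamMap signum   -- assumption: added to complete the instance

-- AB = a0b0 + x(a0B' + A'B): the head is produced before any recursive call,
-- so each coefficient can be demanded lazily.
multStreams :: Stream Integer -> Stream Integer -> Stream Integer
multStreams (Cons x xs) b@(Cons y ys) =
  Cons (x * y) (streamMap (* x) ys + multStreams xs b)
```

In GHCi, streamTake 4 ((xStream + 1) * (xStream + 1)) should give the coefficients of (1 + x)^2, that is [1,2,1,0].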

Related

foldr with 4 arguments?

I am struggling to understand why this code taken from the haskell.org exercise page typechecks (and works as a list reversal function):
myReverse :: [a] -> [a]
myReverse xs = foldr (\x fId empty -> fId (x : empty)) id xs []
My first point of confusion is that foldr accepts 3 arguments, not 4 :
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
so I am guessing that myReverse is equivalent to:
myReverse xs = foldr ((\x fId empty -> fId (x : empty)) id) xs []
but then this should not work either since in the lambda, x is a list element rather than a function ...
Think of it this way. Every function accepts exactly one argument. It may return another function (that accepts one argument). The thing that looks like a multi-argument call
f a b c
is actually parsed as
((f a) b) c
that is, a chain of single-argument function applications. A function type
f :: a -> b -> c -> d
can be decomposed to
f :: a -> (b -> (c -> d))
i.e. a function returning a function returning a function. We usually regard it as a function of three arguments. But can it accept more than three? Yes, if d happens to be another function type.
This is exactly what happens with your fold example. The function that you pass as the first argument to foldr accepts three arguments, which is exactly the same as accepting two arguments and returning another function. Now the (simplified) type of foldr is
(a -> b -> b) -> b -> [a] -> b
but if you look at the first argument of it, you see it's a function of three arguments. Which is, as we have seen, exactly the same as a function that accepts two arguments and returns a function. So the b happens to be a function type. Since b is also the return type of foldr when applied to three arguments
foldr (\x fId empty -> fId (x : empty)) id xs
and it's a function, it can now be applied to another argument
(foldr (\x fId empty -> fId (x : empty)) id xs) []
I let you figure out what b actually is.
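For readers who want to check their answer, one way is to pull the lambda out into a named function and write its type down. The sketch below does that; the name step and the variable rest are illustrative choices, not part of the original code.

```haskell
-- The reducer with its type spelled out: foldr's `b` is instantiated
-- to the function type [a] -> [a].
step :: a -> ([a] -> [a]) -> ([a] -> [a])
step x fId = \rest -> fId (x : rest)

-- foldr step id xs is itself a function [a] -> [a],
-- which is then applied to the fourth "argument", [].
myReverse :: [a] -> [a]
myReverse xs = foldr step id xs []
```

If the compiler accepts the signature of step, the guess for b was right.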
First of all, the variable naming is atrocious. I always use r for the second argument to a foldr's reducer function, as a mnemonic for the "recursive result". "empty" is too overloaded with meaning; it is better to use some neutral name so it is easier to see what it is without any preconceived notions:
myReverse :: [a] -> [a]
myReverse xs = foldr (\x r n -> r (x : n)) id xs []
By virtue of foldr's definition,
foldr f z (x:xs) === f x (foldr f z xs)
i.e.
myReverse [a,b,c,...,z]
= foldr (\x r n -> r (x : n)) id [a,b,c,...,z] []
= (\x r n -> r (x : n)) a (foldr (\x r n -> r (x : n)) id [b,c,...,z]) []
= (\x r n -> r (x : n))
      a
      (foldr (\x r n -> r (x : n)) id [b,c,...,z])
      []
= let { x = a
      ; r = foldr (\x r n -> r (x : n)) id [b,c,...,z]
      ; n = []
      }
  in r (x : n)
= foldr (\x r n -> r (x : n)) id [b,c,...,z] (a : [])
= foldr (\x r n -> r (x : n)) id [b,c,...,z] [a]
= ....
= foldr (\x r n -> r (x : n)) id [c,...,z] (b : [a])
= foldr (\x r n -> r (x : n)) id [c,...,z] [b,a]
= ....
= foldr (\x r n -> r (x : n)) id [] [z,...,c,b,a]
= id [z,...,c,b,a]
I hope this illustration makes it clearer what is going on there. The extra argument is expected by the reducer function, which is pushed into action by foldr ... resulting in the operational equivalent of
= foldl (\n x -> (x : n)) [] [a,b,c,...,z]
As it turns out, the myReverse implementation uses the equivalence
foldl (flip f) n xs === foldr (\x r -> r . f x) id xs n
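The equivalence can be spot-checked on concrete inputs. In this sketch the names viaFoldl and viaFoldr are hypothetical wrappers for the two sides; with f = (:) and n = [] both sides should reverse the list.

```haskell
-- Left-hand side of the equivalence.
viaFoldl :: (a -> b -> b) -> b -> [a] -> b
viaFoldl f n xs = foldl (flip f) n xs

-- Right-hand side: each element contributes `f x`, composed so that
-- the first element's function is applied first.
viaFoldr :: (a -> b -> b) -> b -> [a] -> b
viaFoldr f n xs = foldr (\x r -> r . f x) id xs n
```

This is only a check on examples, not a proof; the expansion above is what justifies the equivalence for finite lists.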

Haskell - Expressing the Depth First Traversal of a Rose Tree as an instance of unfold, deriving it algebraically

Suppose we have a Rose Tree defined, along with the corresponding fold over the datatype.
data RTree a = Node a [RTree a]
foldRTree :: (a -> [b] -> b) -> RTree a -> b
foldRTree f (Node x xs) = f x (map (foldRTree f) xs)
A recursive definition of a depth first traversal of such a structure would be:
dft :: RTree a -> [a]
dft (Node x xs) = x : concat (map dft xs)
We can express dft as a fold over Rose Trees, and in particular we can derive such a fold algebraically.
// Suppose dft = foldRTree f
// Then foldRTree f (Node x xs) = f x (map (foldRTree f) xs) (definition of foldRTree)
// But also foldRTree f (Node x xs) = dft (Node x xs) (by assumption)
// = x : concat (map dft xs) (definition of dft)
// So we deduce that f x (map (foldRTree f) xs) = x : concat (map dft xs)
// Hence f x (map dft xs) = x : concat (map dft xs) (by assumption)
// So we now see that f x y = x : concat y
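The result of this calculation can be checked directly. The sketch below repeats the definitions from the question and adds dftFold, built from the derived f; exampleTree is a made-up test value.

```haskell
data RTree a = Node a [RTree a]

foldRTree :: (a -> [b] -> b) -> RTree a -> b
foldRTree f (Node x xs) = f x (map (foldRTree f) xs)

-- Direct recursive definition.
dft :: RTree a -> [a]
dft (Node x xs) = x : concat (map dft xs)

-- The fold obtained from the derivation: f x y = x : concat y.
dftFold :: RTree a -> [a]
dftFold = foldRTree (\x ys -> x : concat ys)

-- Hypothetical example tree: 'a' with children 'b' (child 'd') and 'c'.
exampleTree :: RTree Char
exampleTree = Node 'a' [Node 'b' [Node 'd' []], Node 'c' []]
```

Both dft exampleTree and dftFold exampleTree should visit the nodes in the order "abdc".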
I suppose the reason we can do this is because foldRTree captures the general recursion structure over RTrees which brings me to my query about unfold.
We define unfold as follows:
unfold :: (a -> Bool) -> (a -> b) -> (a -> a) -> a -> [b]
unfold n h t x | n x = []
               | otherwise = h x : unfold n h t (t x)
-- Or equivalently
unfold' n h t = map h . takeWhile (not . n) . iterate t
We can express the depth first traversal as an unfold as follows:
dft (Node x xs) = x : unfold null h t xs
  where h ((Node a xs) : ys) = a
        t ((Node a xs) : ys) = xs ++ ys
I am struggling to find a way of algebraically calculating the functions n, h and t in the same way as cons. In particular there is an ingenious step in developing the unfold, which is to realise that the final argument to unfold needs to be of type [RTree a] and not just RTree a. Therefore the argument passed to dft is not passed straight to the unfold, and so we reach a hurdle with regards to reasoning about these two functions.
I would be extremely grateful to anyone who could provide a mathematical way of reasoning about unfold in such a way to calculate the required functions n h, and t when expressing a recursive function (that is naturally a fold) as an unfold (perhaps using some laws linking fold and unfold?). A natural question would then be what methods we have to prove such a relation correct.
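While this does not answer the question of how to calculate n, h and t, the unfold-based definition can at least be tested against the recursive one. The sketch below is a runnable version of the code above; the names dftRec and dftUnfold, the extra clauses for the empty forest, and sampleTree are additions for illustration.

```haskell
data RTree a = Node a [RTree a]

unfold :: (a -> Bool) -> (a -> b) -> (a -> a) -> a -> [b]
unfold n h t x
  | n x       = []
  | otherwise = h x : unfold n h t (t x)

-- Recursive depth-first traversal, as in the question.
dftRec :: RTree a -> [a]
dftRec (Node x xs) = x : concatMap dftRec xs

-- Unfold over a forest: the seed is the list of trees still to visit.
dftUnfold :: RTree a -> [a]
dftUnfold (Node x xs) = x : unfold null h t xs
  where
    h (Node a _ : _)   = a
    h []               = error "unreachable: unfold stops on the empty forest"
    t (Node _ cs : ys) = cs ++ ys   -- children go in front: depth first
    t []               = []

-- Hypothetical example tree.
sampleTree :: RTree Int
sampleTree = Node 1 [Node 2 [Node 4 []], Node 3 [Node 5 [], Node 6 []]]
```

Both traversals of sampleTree should produce [1,2,4,3,5,6]. Agreement on examples is of course weaker than the proof the question asks about.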

Writing foldl using foldr

In Real World Haskell, Chapter 4. on Functional Programming:
Write foldl with foldr:
-- file: ch04/Fold.hs
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
    where step x g a = g (f a x)
The above code confused me a lot, and somebody called dps rewrote it with more meaningful names to make it a bit clearer:
myFoldl stepL zeroL xs = (foldr stepR id xs) zeroL
    where stepR lastL accR accInitL = accR (stepL accInitL lastL)
Somebody else, Jef G, then did an excellent job by providing an example and showing the underlying mechanism step by step:
myFoldl (+) 0 [1, 2, 3]
= (foldr step id [1, 2, 3]) 0
= (step 1 (step 2 (step 3 id))) 0
= (step 1 (step 2 (\a3 -> id ((+) a3 3)))) 0
= (step 1 (\a2 -> (\a3 -> id ((+) a3 3)) ((+) a2 2))) 0
= (\a1 -> (\a2 -> (\a3 -> id ((+) a3 3)) ((+) a2 2)) ((+) a1 1)) 0
= (\a1 -> (\a2 -> (\a3 -> (+) a3 3) ((+) a2 2)) ((+) a1 1)) 0
= (\a1 -> (\a2 -> (+) ((+) a2 2) 3) ((+) a1 1)) 0
= (\a1 -> (+) ((+) ((+) a1 1) 2) 3) 0
= (+) ((+) ((+) 0 1) 2) 3
= ((0 + 1) + 2) + 3
But I still cannot fully understand that, here are my questions:
What is the id function for? What is the role of? Why should we need it here?
In the above example, id function is the accumulator in the lambda function?
foldr's prototype is foldr :: (a -> b -> b) -> b -> [a] -> b, and the first parameter is a function which needs two parameters, but the step function in myFoldl's implementation uses 3 parameters. I'm completely confused!
Some explanations are in order!
What is the id function for? What is the role of? Why should we need it here?
id is the identity function, id x = x, and is used as the equivalent of zero when building up a chain of functions with function composition, (.). You can find it defined in the Prelude.
In the above example, id function is the accumulator in the lambda function?
The accumulator is a function that is being built up via repeated function application. There's no explicit lambda, since we name the accumulator, step. You can write it with a lambda if you want:
foldl f a bs = foldr (\b g x -> g (f x b)) id bs a
Or as Graham Hutton would write:
5.1 The foldl operator
Now let us generalise from the suml example and consider the standard operator foldl that processes the elements of a list in left-to-right order by using a function f to combine values, and a value v as the starting value:
foldl :: (β → α → β) → β → ([α] → β)
foldl f v [ ] = v
foldl f v (x : xs) = foldl f (f v x) xs
Using this operator, suml can be redefined simply by suml = foldl (+) 0. Many other functions can be defined in a simple way using foldl. For example, the standard function reverse can be redefined using foldl as follows:
reverse :: [α] → [α]
reverse = foldl (λxs x → x : xs) [ ]
This definition is more efficient than our original definition using fold, because it avoids the use of the inefficient append operator (++) for lists.
A simple generalisation of the calculation in the previous section for the function suml shows how to redefine the function foldl in terms of fold:
foldl f v xs = fold (λx g → (λa → g (f a x))) id xs v
In contrast, it is not possible to redefine fold in terms of foldl, due to the fact that foldl is strict in the tail of its list argument but fold is not. There are a number of useful ‘duality theorems’ concerning fold and foldl, and also some guidelines for deciding which operator is best suited to particular applications (Bird, 1998).
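The strictness remark in the quoted passage can be illustrated with infinite lists. In this sketch (the function names are made up for the example) the combining function may ignore its second argument, so foldr can stop early; an equivalent foldl would have to walk the whole list first and would not terminate on [1 ..].

```haskell
-- True as soon as some element exceeds the limit; (||) never looks at
-- `rest` once its left operand is True.
anyAbove :: Integer -> [Integer] -> Bool
anyAbove limit = foldr (\x rest -> x > limit || rest) False

-- A takeWhile-style function: drops `rest` at the first failing element.
takeBelow :: Integer -> [Integer] -> [Integer]
takeBelow limit = foldr (\x rest -> if x < limit then x : rest else []) []
```

For instance, anyAbove 10 [1 ..] and takeBelow 5 [1 ..] both return despite the infinite input.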
foldr's prototype is foldr :: (a -> b -> b) -> b -> [a] -> b
A Haskell programmer would say that the type of foldr is (a -> b -> b) -> b -> [a] -> b.
and the first parameter is a function which needs two parameters, but the step function in myFoldl's implementation uses 3 parameters. I'm completely confused
This is confusing and magical! We play a trick and replace the accumulator with a function, which is in turn applied to the initial value to yield a result.
Graham Hutton explains the trick to turn foldl into foldr in the above article. We start by writing down a recursive definition of foldl:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v [] = v
foldl f v (x : xs) = foldl f (f v x) xs
And then refactor it via the static argument transformation on f:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v xs = g xs v
    where
        g []     v = v
        g (x:xs) v = g xs (f v x)
Let's now rewrite g so as to float the v inwards:
foldl f v xs = g xs v
    where
        g []     = \v -> v
        g (x:xs) = \v -> g xs (f v x)
Which is the same as thinking of g as a function of one argument, that returns a function:
foldl f v xs = g xs v
    where
        g []     = id
        g (x:xs) = \v -> g xs (f v x)
Now we have g, a function that recursively walks a list, applying some function f. The final value is the identity function, and each step results in a function as well.
But, we have handy already a very similar recursive function on lists, foldr!
2 The fold operator
The fold operator has its origins in recursion theory (Kleene, 1952), while the use of fold as a central concept in a programming language dates back to the reduction operator of APL (Iverson, 1962), and later to the insertion operator of FP (Backus, 1978). In Haskell, the fold operator for lists can be defined as follows:
fold :: (α → β → β) → β → ([α] → β)
fold f v [ ] = v
fold f v (x : xs) = f x (fold f v xs)
That is, given a function f of type α → β → β and a value v of type β, the function fold f v processes a list of type [α] to give a value of type β by replacing the nil constructor [] at the end of the list by the value v, and each cons constructor (:) within the list by the function f. In this manner, the fold operator encapsulates a simple pattern of recursion for processing lists, in which the two constructors for lists are simply replaced by other values and functions. A number of familiar functions on lists have a simple definition using fold.
This looks like a very similar recursive scheme to our g function. Now the trick: using all the available magic at hand (aka Bird, Meertens and Malcolm) we apply a special rule, the universal property of fold, which is an equivalence between two definitions for a function g that processes lists, stated as:
g [] = v
g (x:xs) = f x (g xs)
if and only if
g = fold f v
So, the universal property of folds states that:
g = foldr k v
where g must be equivalent to the two equations, for some k and v:
g [] = v
g (x:xs) = k x (g xs)
From our earlier foldl designs, we know v == id. For the second equation though, we need
to calculate the definition of k:
g (x:xs) = k x (g xs)
<=> g (x:xs) v = k x (g xs) v -- accumulator of functions
<=> g xs (f v x) = k x (g xs) v -- definition of foldl
<= g' (f v x) = k x g' v -- generalize (g xs) to g'
<=> k = \x g' -> (\a -> g' (f a x)) -- expand k. recursion captured in g'
Which, substituting our calculated definitions of k and v yields a
definition of foldl as:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v xs =
    foldr
        (\x g -> (\a -> g (f a x)))
        id
        xs
        v
The recursive g is replaced with the foldr combinator, and the accumulator becomes a function built via a chain of compositions of f at each element of the list, in reverse order (so we fold left instead of right).
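A definition of this shape is easy to test against the Prelude's foldl with a non-commutative operator, which would expose a wrong argument order or a dropped accumulator. In the sketch below the name foldlViaFoldr is chosen to avoid clashing with the Prelude. Note that the inner lambda applies f to its own argument a, the accumulator threaded through the chain, and not to the initial value v.

```haskell
-- foldl expressed through foldr: the fold builds a function a -> a,
-- which is finally applied to the initial value v.
foldlViaFoldr :: (a -> b -> a) -> a -> [b] -> a
foldlViaFoldr f v xs = foldr (\x g -> \a -> g (f a x)) id xs v
```

For example, foldlViaFoldr (-) 10 [1,2,3] should agree with foldl (-) 10 [1,2,3], that is ((10 - 1) - 2) - 3.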
This is definitely somewhat advanced. To understand this transformation deeply, along with the universal property of folds that makes it possible, I recommend Hutton's tutorial, linked below.
References
Haskell Wiki: Foldl as foldr
A tutorial on the universality and expressiveness of fold, Graham Hutton, J. Functional Programming 9 (4): 355–372, July 1999.
Malcolm, G. Algebraic data types and program transformation., PhD thesis, Groningen University.
Consider the type of foldr:
foldr :: (b -> a -> a) -> a -> [b] -> a
Whereas the type of step is something like b -> (a -> a) -> a -> a. Since step is getting passed to foldr, we can conclude that in this case the fold has a type like (b -> (a -> a) -> (a -> a)) -> (a -> a) -> [b] -> (a -> a).
Don't be confused by the different meanings of a in different signatures; it's just a type variable. Also, keep in mind that the function arrow is right associative, so a -> b -> c is the same thing as a -> (b -> c).
So, yes, the accumulator value for the foldr is a function of type a -> a, and the initial value is id. This makes some sense, because id is a function that doesn't do anything--it's the same reason you'd start with zero as the initial value when adding all the values in a list.
As for step taking three arguments, try rewriting it like this:
step :: b -> (a -> a) -> (a -> a)
step x g = \a -> g (f a x)
Does that make it easier to see what's going on? It takes an extra parameter because it's returning a function, and the two ways of writing it are equivalent. Note also the extra parameter after the foldr: (foldr step id xs) z. The part in parentheses is the fold itself, which returns a function, which is then applied to z.
(quickly skim through my answers [1], [2], [3], [4] to make sure you understand Haskell's syntax, higher-order functions, currying, function composition, $ operator, infix/prefix operators, sections and lambdas)
Universal property of fold
A fold is just a codification of certain kinds of recursion. And universality property simply states that, if your recursion conforms to a certain form, it can be transformed into fold according to some formal rules. And conversely, every fold can be transformed into a recursion of that kind. Once again, some recursions can be translated into folds that give exactly the same answer, and some recursions can't, and there is an exact procedure to do that.
Basically, if your recursive function works on lists and looks like the one on the left, you can transform it into the fold on the right, substituting for f and v what is actually there.
g [] = v ⇒
g (x:xs) = f x (g xs) ⇒ g = foldr f v
For example:
sum [] = 0 {- recursion becomes fold -}
sum (x:xs) = x + sum xs ⇒ sum = foldr (+) 0
Here v = 0 and sum (x:xs) = x + sum xs is equivalent to sum (x:xs) = (+) x (sum xs), therefore f = (+). 2 more examples
product [] = 1
product (x:xs) = x * product xs ⇒ product = foldr (*) 1
length [] = 0
length (x:xs) = 1 + length xs ⇒ length = foldr (\_ a -> 1 + a) 0
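The three translations can be loaded into GHCi as written below. The names carry an F suffix only to avoid clashing with the Prelude. Recall that foldr takes the combining function f first and the base value v second.

```haskell
-- f = (+), v = 0
sumF :: Num a => [a] -> a
sumF = foldr (+) 0

-- f = (*), v = 1
productF :: Num a => [a] -> a
productF = foldr (*) 1

-- f ignores the element and adds one to the length of the rest, v = 0
lengthF :: [a] -> Int
lengthF = foldr (\_ acc -> 1 + acc) 0
```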
Exercise:
Implement map, filter, reverse, concat and concatMap recursively, just like the above functions on the left side.
Convert these 5 functions to foldr according to a formula above, that is, substituting f and v in the fold formula on the right.
Foldl via foldr
How to write a recursive function that sums numbers up from left to right?
sum [] = 0 -- given `sum [1,2,3]` expands into `(1 + (2 + 3))`
sum (x:xs) = x + sum xs
The first recursive function that comes to mind fully expands before it even starts adding up, which is not what we need. One approach is to create a recursive function that has an accumulator and immediately adds up numbers on each step (read about tail recursion to learn more about recursion strategies):
suml :: Num a => [a] -> a
suml xs = suml' xs 0
  where suml' [] n = n -- auxiliary function
        suml' (x:xs) n = suml' xs (n+x)
Alright, stop! Run this code in GHCi and make sure you understand how it works, then carefully and thoughtfully proceed. suml can't be redefined with a fold, but suml' can be.
suml' [] = v -- equivalent: v n = n
suml' (x:xs) n = f x (suml' xs) n
suml' [] n = n from function definition, right? And v = suml' [] from the universal property formula. Together this gives v n = n, a function that immediately returns whatever it receives: v = id. Let's calculate f:
suml' (x:xs) n = f x (suml' xs) n
-- expand suml' definition
suml' xs (n+x) = f x (suml' xs) n
-- replace `suml' xs` with `g`
g (n+x) = f x g n
Thus, suml' = foldr (\x g n -> g (n+x)) id and, thus, suml xs = foldr (\x g n -> g (n+x)) id xs 0.
foldr (\x g n -> g (n + x)) id [1..10] 0 -- return 55
Now we just need to generalize, replace + by a variable function:
foldl f a xs = foldr (\x g n -> g (n `f` x)) id xs a
foldl (-) 10 [1..5] -- returns -5
Conclusion
Now read Graham Hutton's A tutorial on the universality and expressiveness of fold. Get some pen and paper, and try to work through everything he writes until you can derive most of the folds by yourself. Don't sweat if you don't understand something, you can always return later, but don't procrastinate much either.
Here's my proof that foldl can be expressed in terms of foldr, which I find pretty simple apart from the name spaghetti the step function introduces.
The proposition is that foldl f z xs is equivalent to
myfoldl f z xs = foldr step_f id xs z
    where step_f x g a = g (f a x)
The first important thing to notice here is that the right hand side of the first line is actually evaluated as
(foldr step_f id xs) z
since foldr only takes three parameters. This already hints that the foldr will calculate not a value but a curried function, which is then applied to z. There are two cases to investigate to find out whether myfoldl is foldl:
Base case: empty list
myfoldl f z []
= foldr step_f id [] z (by definition of myfoldl)
= id z (by definition of foldr)
= z
foldl f z []
= z (by definition of foldl)
Non-empty list
myfoldl f z (x:xs)
= foldr step_f id (x:xs) z (by definition of myfoldl)
= step_f x (foldr step_f id xs) z (-> apply step_f)
= (foldr step_f id xs) (f z x) (-> remove parentheses)
= foldr step_f id xs (f z x)
= myfoldl f (f z x) xs (definition of myfoldl)
foldl f z (x:xs)
= foldl f (f z x) xs
Since in the non-empty case the first and the last line have the same form for both functions, that step can be used to fold the list down until xs == [], at which point the base case guarantees the same result. So by induction, myfoldl == foldl.
There is no Royal Road to Mathematics, nor even through Haskell. Let
h z = (foldr step id xs) z where
    step x g = \a -> g (f a x)
What the heck is h z? Assume that xs = [x0, x1, x2].
Apply the definition of foldr:
h z = (step x0 (step x1 (step x2 id))) z
Apply the definition of step:
= (\a0 -> (\a1 -> (\a2 -> id (f a2 x2)) (f a1 x1)) (f a0 x0)) z
Substitute into the lambda functions:
= (\a1 -> (\a2 -> id (f a2 x2)) (f a1 x1)) (f z x0)
= (\a2 -> id (f a2 x2)) (f (f z x0) x1)
= id (f (f (f z x0) x1) x2)
Apply definition of id :
= f (f (f z x0) x1) x2
Apply definition of foldl :
= foldl f z [x0, x1, x2]
Is it a Royal Road or what?
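The same expansion can be watched in GHCi by using a "symbolic" f that builds a string describing how it was applied instead of computing anything. This is only an illustrative sketch: the string-building f and the argument order of h (list first) are choices made for the example.

```haskell
-- A symbolic f: records the application instead of evaluating it.
f :: String -> String -> String
f acc x = "(f " ++ acc ++ " " ++ x ++ ")"

-- h as in the text, with the list passed explicitly.
h :: [String] -> String -> String
h xs z = foldr step id xs z
  where step x g = \a -> g (f a x)
```

Evaluating h ["x0", "x1", "x2"] "z" should print the left-nested shape derived above, "(f (f (f z x0) x1) x2)", the same string that foldl f "z" ["x0", "x1", "x2"] produces.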
I'm posting the answer for those people who might find this approach better suited to their way of thinking. The answer possibly contains redundant information and thoughts, but it is what I needed in order to tackle the problem. Furthermore, since this is yet another answer to the same question, it's obvious that it has substantial overlaps with the other answers, however it tells the tale of how I could grasp this concept.
Indeed I started to write down these notes as a personal record of my thoughts while trying to understand this topic. It took me all day to get to the core of it, if I really have got it.
My long way to understanding this simple exercise
Easy part: what do we need to determine?
What happens with the following example call
foldl f z [1,2,3,4]
can be visualized with the following diagram (which is on Wikipedia, but I first saw it on another answer):
          _____results in a number
         /
        f          f (f (f (f z 1) 2) 3) 4
       / \
      f   4        f (f (f z 1) 2) 3
     / \
    f   3          f (f z 1) 2
   / \
  f   2            f z 1
 / \
z   1
(As a side note, when using foldl the applications of f are not performed immediately; the expressions are thunked just the way I wrote them above. In principle, they could be computed as you go from bottom to top, and that's exactly what foldl' does.)
The exercise essentially challenges us to use foldr instead of foldl by appropriately changing the step function (so we use s instead of f) and the initial accumulator (so we use ? instead of z); the list stays the same, otherwise what are we talking about?
The call to foldr has to look like this:
foldr s ? [1,2,3,4]
and the corresponding diagram is this:
          _____what does the last call return?
         /
        s
       / \
      1   s
         / \
        2   s
           / \
          3   s
             / \
            4   ? <--- what is the initial accumulator?
The call results in
s 1 (s 2 (s 3 (s 4 ?)))
What are s and ?? And what are their types? It looks like s is a two-argument function, much like f, but let's not jump to conclusions. Also, let's leave ? aside for a moment, and let's observe that z has to come into play as soon as 1 comes into play; however, how can z come into play in the call to the maybe-two-argument s function, namely in the call s 1 (…)? We can solve this part of the enigma by choosing an s which takes 3 arguments, rather than the 2 we mentioned earlier, so that the outermost call s 1 (…) will result in a function taking one argument, which we can pass z to!
This means that we want the original call, which expands to
f (f (f (f z 1) 2) 3) 4
to be equivalent to
s 1 (s 2 (s 3 (s 4 ?))) z
or, in other words, we want the partially applied function
s 1 (s 2 (s 3 (s 4 ?)))
to be equivalent to the following lambda function
(\z -> f (f (f (f z 1) 2) 3) 4)
Again, the "only" pieces we need are s and ?.
Turning point: recognize function composition
Let's redraw the previous diagram and write on the right what we want each call to s be equivalent to:
  s          s 1 (…) == (\z -> f (f (f (f z 1) 2) 3) 4)
 / \
1   s        s 2 (…) == (\z -> f (f (f z 2) 3) 4)
   / \
  2   s      s 3 (…) == (\z -> f (f z 3) 4)
     / \
    3   s    s 4 ?   == (\z -> f z 4)
       / \
      4   ?   <--- what is the initial accumulator?
I hope it's clear from the structure of the diagram that the (…) on each line is the right hand side of the line below it; better, it is the function returned from the previous (below) call to s.
It should be also clear that a call to s with arguments x and y is the (full) application of y to the partial application of f to the only argument x (as its second argument). Since the partial application of f to x can be written as the lambda (\z -> f z x), fully applying y to it results in the lambda (\z -> y (f z x)), which in this case I would rewrite as y . (\z -> f z x); translating the words into an expression for s we get
s x y = y . (\z -> f z x)
(This is the same as s x y z = y (f z x), which is the same as the book, if you rename the variables.)
The last bit is: what is the initial "value" ? of the accumulator? The above diagram can be rewritten by expanding the nested calls to make them composition chains:
  s          s 1 (…) == (\z -> f z 4) . (\z -> f z 3) . (\z -> f z 2) . (\z -> f z 1)
 / \
1   s        s 2 (…) == (\z -> f z 4) . (\z -> f z 3) . (\z -> f z 2)
   / \
  2   s      s 3 (…) == (\z -> f z 4) . (\z -> f z 3)
     / \
    3   s    s 4 ?   == (\z -> f z 4)
       / \
      4   ?   <--- what is the initial accumulator?
We here see that s simply "piles up" successive partial applications of f, but the y in s x y = y . (\z -> f z x) suggests that the interpretation of s 4 ? (and, in turn, all the others) misses a leading function to be composed with the leftmost lambda.
That's just our ? function: it's time to give it a reason for its existence, beside occupying a place in the call to foldr. What can we choose it to be, in order not to change the resulting functions? Answer: id, the identity function, which is also the identity element with respect to the composition operator (.).
  s          s 1 (…) == id . (\z -> f z 4) . (\z -> f z 3) . (\z -> f z 2) . (\z -> f z 1)
 / \
1   s        s 2 (…) == id . (\z -> f z 4) . (\z -> f z 3) . (\z -> f z 2)
   / \
  2   s      s 3 (…) == id . (\z -> f z 4) . (\z -> f z 3)
     / \
    3   s    s 4 id  == id . (\z -> f z 4)
       / \
      4   id
So the sought function is
myFoldl f z xs = foldr (\x g a -> g (f a x)) id xs z
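To connect the last diagram with the final definition, the composition chain for the list [1,2,3,4] can be written out by hand and compared with the fold. In this sketch chain is a hypothetical name and is hard-wired to that one list; myFoldl is the definition just given, with a type signature added.

```haskell
-- The top line of the last diagram, written out literally.
chain :: (a -> Integer -> a) -> a -> a
chain f = id . (\z -> f z 4) . (\z -> f z 3) . (\z -> f z 2) . (\z -> f z 1)

myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr (\x g a -> g (f a x)) id xs z
```

With f = (-) and z = 0, chain (-) 0, myFoldl (-) 0 [1,2,3,4] and foldl (-) 0 [1,2,3,4] should all evaluate (((0 - 1) - 2) - 3) - 4.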
foldr step zero (x:xs) = step x (foldr step zero xs)
foldr _ zero [] = zero
myFold f z xs = foldr step id xs z
  where step x g a = g (f a x)
myFold (+) 0 [1, 2, 3] =
foldr step id [1, 2, 3] 0
-- Expanding foldr function
step 1 (foldr step id [2, 3]) 0
step 1 (step 2 (foldr step id [3])) 0
step 1 (step 2 (step 3 (foldr step id []))) 0
-- Expanding step function if it is possible
step 1 (step 2 (step 3 id)) 0
step 2 (step 3 id) (0 + 1)
step 3 id ((0 + 1) + 2)
id (((0 + 1) + 2) + 3)
Well, at least, this helped me. Even if it is not quite right.
This might help, I tried expanding in a different way.
myFoldl (+) 0 [1,2,3] =
foldr step id [1,2,3] 0 =
foldr step (\a -> id (a+3)) [1,2] 0 =
foldr step (\b -> (\a -> id (a+3)) (b+2)) [1] 0 =
foldr step (\b -> id ((b+2)+3)) [1] 0 =
foldr step (\c -> (\b -> id ((b+2)+3)) (c+1)) [] 0 =
foldr step (\c -> id (((c+1)+2)+3)) [] 0 =
(\c -> id (((c+1)+2)+3)) 0 = ...
This answer makes the definition below easy to understand in three steps.
-- file: ch04/Fold.hs
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
    where step x g a = g (f a x)
Step 1. transform the fold of function evaluation to function combination
foldl f z [x1 .. xn] = z & f1 & .. & fn = (fn . .. . f1) z, in which fi = \z -> f z xi.
(By using z & f1 & f2 & .. & fn it means fn ( .. (f2 (f1 z)) .. ).)
Step 2. express the function combination in a foldr manner
foldr (.) id [f1 .. fn] = (.) f1 (foldr (.) id [f2 .. fn]) = f1 . (foldr (.) id [f2 .. fn]). Unfold the rest to get foldr (.) id [f1 .. fn] = f1 . .. . fn.
Noticing that the sequence is reversed, we should use the reversed form of (.). Define rc f1 f2 = (.) f2 f1 = f2 . f1, then foldr rc id [f1 .. fn] = rc f1 (foldr rc id [f2 .. fn]) = (foldr rc id [f2 .. fn]) . f1. Unfold the rest to get foldr rc id [f1 .. fn] = fn . .. . f1.
Step 3. transform the fold on function list to the fold on operand list
Find step that makes foldr step id [x1 .. xn] = foldr rc id [f1 .. fn]. It is easy to find step = \x g z -> g (f z x).
In 3 steps, the definition of foldl using foldr is clear:
foldl f z xs
= (fn . .. . f1) z
= foldr rc id fs z
= foldr step id xs z
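The middle line of this chain, the fold over a list of functions, can be run as it stands. In the sketch below rc is the reversed composition from Step 2 and fs is the list of fi from Step 1; the name foldlViaFunctions is made up for the example.

```haskell
-- Reversed composition: rc f1 f2 = f2 . f1
rc :: (a -> b) -> (b -> c) -> (a -> c)
rc f1 f2 = f2 . f1

-- Step 1 turns each element xi into fi = \z -> f z xi;
-- Step 2 composes them with foldr rc id so that f1 is applied first.
foldlViaFunctions :: (a -> b -> a) -> a -> [b] -> a
foldlViaFunctions f z xs = foldr rc id fs z
  where fs = map (\x acc -> f acc x) xs
```

Step 3 then fuses the map into the fold, which gives the step function shown above.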
Prove the correctness:
foldl f z xs = foldr (\x g z -> g (f z x)) id xs z
= step x1 (foldr step id [x2 .. xn]) z
= s1 (foldr step id [x2 .. xn]) z
= s1 (step x2 (foldr step id [x3 .. xn])) z
= s1 (s2 (foldr step id [x3 .. xn])) z
= ..
= s1 (s2 (.. (sn (foldr step id [])) .. )) z
= s1 (s2 (.. (sn id) .. )) z
= (s2 (.. (sn id) .. )) (f z x1)
= s2 (s3 (.. (sn id) .. )) (f z x1)
= (s3 (.. (sn id) .. )) (f (f z x1) x2)
= ..
= sn id (f (.. (f (f z x1) x2) .. ) xn-1)
= id (f (.. (f (f z x1) x2) .. ) xn)
= f (.. (f (f z x1) x2) .. ) xn
in which xs = [x1 .. xn], si = step xi = \g z -> g (f z xi)
If you find anything to be unclear, please add a comment. :)

Implement zip using foldr

I'm currently on chapter 4 of Real World Haskell, and I'm trying to wrap my head around implementing foldl in terms of foldr.
(Here's their code:)
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
    where step x g a = g (f a x)
I thought I'd try to implement zip using the same technique, but I don't seem to be making any progress. Is it even possible?
zip2 xs ys = foldr step done xs ys
  where done ys = []
        step x zipsfn [] = []
        step x zipsfn (y:ys) = (x, y) : (zipsfn ys)
How this works: (foldr step done xs) returns a function that consumes
ys; so we go down the xs list building up a nested composition of
functions that will each be applied to the corresponding part of ys.
How to come up with it: I started with the general idea (from similar
examples seen before), wrote
zip2 xs ys = foldr step done xs ys
then filled in each of the following lines in turn with what it had to
be to make the types and values come out right. It was easiest to
consider the simplest cases first before the harder ones.
The first line could be written more simply as
zip2 = foldr step done
as mattiast showed.
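Here is the same definition as a loadable sketch with a type signature, so that it can be compared against the Prelude's zip on lists of different lengths. The wildcard patterns and the variable name rest are cosmetic changes.

```haskell
zip2 :: [a] -> [b] -> [(a, b)]
zip2 xs ys = foldr step done xs ys
  where
    -- No more xs: ignore whatever is left of ys.
    done _ = []
    -- No more ys: stop, without looking at the rest of the fold.
    step _ _      []         = []
    -- Pair up the heads and hand the tail of ys to the rest of the fold.
    step x zipsfn (y : rest) = (x, y) : zipsfn rest
```

Because step never forces zipsfn when ys is empty, zip2 [1 ..] "abc" terminates even though the first list is infinite.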
The answer had already been given here, but not an (illustrative) derivation. So even after all these years, perhaps it's worth adding it.
It is actually quite simple. First,
foldr f z xs
= foldr f z [x1,x2,x3,...,xn] = f x1 (foldr f z [x2,x3,...,xn])
= ... = f x1 (f x2 (f x3 (... (f xn z) ...)))
hence by eta-expansion,
foldr f z xs ys
= foldr f z [x1,x2,x3,...,xn] ys = f x1 (foldr f z [x2,x3,...,xn]) ys
= ... = f x1 (f x2 (f x3 (... (f xn z) ...))) ys
As is apparent here, if f is non-forcing in its 2nd argument, it gets to work first on x1 and ys, as f x1 r1 ys, where r1 = (f x2 (f x3 (... (f xn z) ...))) = foldr f z [x2,x3,...,xn].
So, using
f x1 r1 [] = []
f x1 r1 (y1:ys1) = (x1,y1) : r1 ys1
we arrange for passage of information left-to-right along the list, by calling r1 with the rest of the input list ys1, foldr f z [x2,x3,...,xn] ys1 = f x2 r2 ys1, as the next step. And that's that.
When ys is shorter than xs (or the same length), the [] case for f fires and the processing stops. But if ys is longer than xs then f's [] case won't fire and we'll get to the final f xn z (yn:ysn) application,
f xn z (yn:ysn) = (xn,yn) : z ysn
Since we've reached the end of xs, the zip processing must stop:
z _ = []
And this means the definition z = const [] should be used:
zip xs ys = foldr f (const []) xs ys
  where
    f x r [] = []
    f x r (y:ys) = (x,y) : r ys
From the standpoint of f, r plays the role of a success continuation, which f calls when the processing is to continue, after having emitted the pair (x,y).
So r is "what is done with more ys when there are more xs", and z = const [], the nil-case in foldr, is "what is done with ys when there are no more xs". Or f can stop by itself, returning [] when ys is exhausted.
Notice how ys is used as a kind of accumulating value, which is passed from left to right along the list xs, from one invocation of f to the next ("accumulating" step being, here, stripping a head element from it).
Naturally this corresponds to the left fold, where an accumulating step is "applying the function", with z = id returning the final accumulated value when "there are no more xs":
foldl f a xs =~ foldr (\x r a-> r (f a x)) id xs a
Similarly, for finite lists,
foldr f a xs =~ foldl (\r x a-> r (f x a)) id xs a
And since the combining function gets to decide whether to continue or not, it is now possible to have left fold that can stop early:
foldlWhile t f a xs = foldr cons id xs a
  where
    cons x r a = if t x then r (f a x) else a
or a skipping left fold, foldlWhen t ..., with
cons x r a = if t x then r (f a x) else r a
etc.
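A minimal runnable sketch of the two variants, to make the early exit concrete. The type signatures and the names sumBelowFive and sumOfEvens are my additions for this illustration; the bodies follow the definitions above.

```haskell
-- Left fold that stops at the first element failing the test t.
foldlWhile :: (b -> Bool) -> (a -> b -> a) -> a -> [b] -> a
foldlWhile t f a xs = foldr cons id xs a
  where
    cons x r acc = if t x then r (f acc x) else acc

-- Left fold that skips elements failing the test t but keeps going.
foldlWhen :: (b -> Bool) -> (a -> b -> a) -> a -> [b] -> a
foldlWhen t f a xs = foldr cons id xs a
  where
    cons x r acc = if t x then r (f acc x) else r acc

-- Stops at 5, so it terminates even on an infinite list: 1 + 2 + 3 + 4.
sumBelowFive :: Integer
sumBelowFive = foldlWhile (< 5) (+) 0 [1 ..]

-- Skips the odd numbers: 2 + 4 + 6 + 8 + 10.
sumOfEvens :: Integer
sumOfEvens = foldlWhen even (+) 0 [1 .. 10]
```

Note that foldlWhile never calls the continuation r once the test fails, which is why the rest of the infinite list is never demanded.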
I found a way using a method quite similar to yours:
myzip = foldr step (const []) :: [a] -> [b] -> [(a,b)]
  where step a f (b:bs) = (a,b):(f bs)
        step a f [] = []
For the non-native Haskellers here, I've written a Scheme version of this algorithm to make it clearer what's actually happening:
> (define (zip lista listb)
((foldr (lambda (el func)
(lambda (a)
(if (empty? a)
empty
(cons (cons el (first a)) (func (rest a))))))
(lambda (a) empty)
lista) listb))
> (zip '(1 2 3 4) '(5 6 7 8))
(list (cons 1 5) (cons 2 6) (cons 3 7) (cons 4 8))
The foldr results in a function which, when applied to a list, returns the zip of the list that was folded over with the list given to the function. The Haskell version hides the inner lambda because its functions are curried: the folding function simply takes the second list as an extra argument.
To break it down further:
Take zip on input: '(1 2 3)
The foldr func gets called with
el->3, func->(lambda (a) empty)
This expands to:
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a))))
If we were to return this now, we'd have a function which takes a list of one element
and returns the pair (3 element):
> (define f (lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a)))))
> (f (list 9))
(list (cons 3 9))
Continuing, foldr now calls func with
el->2, func->f ;using f for shorthand
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
This is a func which takes a list with two elements, now, and zips them with (list 2 3):
> (define g (lambda (a) (cons (cons 2 (first a)) (f (rest a)))))
> (g (list 9 1))
(list (cons 2 9) (cons 3 1))
What's happening?
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
a, in this case, is (list 9 1)
(cons (cons 2 (first (list 9 1))) (f (rest (list 9 1))))
(cons (cons 2 9) (f (list 1)))
And, as you recall, f zips its argument with 3.
And this continues etc...
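For readers going the other way, here is a rough Haskell rendering of the same walkthrough, with the inner lambda written out explicitly. The names f and g mirror the Scheme definitions above; zipExplicit is a made-up name. Unlike the simplified Scheme f and g, these versions also handle the empty list.

```haskell
-- The function built for the last element, 3: pair it with the head of
-- the argument, then hand the tail to the base-case function.
f :: [b] -> [(Int, b)]
f []         = []
f (a : rest) = (3, a) : const [] rest

-- The function built for 2, which delegates the tail to f.
g :: [b] -> [(Int, b)]
g []         = []
g (a : rest) = (2, a) : f rest

-- The whole fold, returning an explicit lambda at every step.
zipExplicit :: [a] -> [b] -> [(a, b)]
zipExplicit xs = foldr build (const []) xs
  where
    build el func = \a -> case a of
      []       -> []
      (b : bs) -> (el, b) : func bs
```

Here f [9] and g [9, 1] correspond to the (f (list 9)) and (g (list 9 1)) calls in the Scheme session.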
The problem with all these solutions for zip is that they only fold over one list or the other, which can be a problem if both of them are "good producers", in the parlance of list fusion. What you actually need is a solution that folds over both lists. Fortunately, there is a paper about exactly that, called "Coroutining Folds with Hyperfunctions".
You need an auxiliary type, a hyperfunction, which is basically a function that takes another hyperfunction as its argument.
newtype H a b = H { invoke :: H b a -> b }
The hyperfunctions used here basically act like a "stack" of ordinary functions.
push :: (a -> b) -> H a b -> H a b
push f q = H $ \k -> f $ invoke k q
You also need a way to put two hyperfunctions together, end to end.
(.#.) :: H b c -> H a b -> H a c
f .#. g = H $ \k -> invoke f $ g .#. k
This is related to push by the law:
(push f x) .#. (push g y) = push (f . g) (x .#. y)
This turns out to be an associative operator, and this is the identity:
self :: H a a
self = H $ \k -> invoke k self
You also need something that disregards everything else on the "stack" and returns a specific value:
base :: b -> H a b
base b = H $ const b
And finally, you need a way to get a value out of a hyperfunction:
run :: H a a -> a
run q = invoke q self
run strings all of the pushed functions together, end to end, until it hits a base or loops infinitely.
So now you can fold both lists into hyperfunctions, using functions that pass information from one to the other, and assemble the final value.
zip xs ys = run $ foldr (\x h -> push (first x) h) (base []) xs .#. foldr (\y h -> push (second y) h) (base Nothing) ys
  where
    first _ Nothing = []
    first x (Just (y, xys)) = (x, y):xys
    second y xys = Just (y, xys)
The reason why folding over both lists matters is because of something GHC does called list fusion, which is talked about in the GHC.Base module, but probably should be much more well-known. Being a good list producer and using build with foldr can prevent lots of useless production and immediate consumption of list elements, and can expose further optimizations.
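The fragments above can be collected into one loadable module, which makes it easier to experiment with. This is only a sketch assembled from the definitions in this answer; the zip function is renamed zipH (a made-up name) so it does not clash with the Prelude's zip.

```haskell
-- Hyperfunction type and helpers, collected from the fragments above.
newtype H a b = H { invoke :: H b a -> b }

-- Put an ordinary function on top of the "stack".
push :: (a -> b) -> H a b -> H a b
push f q = H $ \k -> f $ invoke k q

-- Compose two hyperfunctions end to end.
(.#.) :: H b c -> H a b -> H a c
f .#. g = H $ \k -> invoke f $ g .#. k

-- Identity for (.#.).
self :: H a a
self = H $ \k -> invoke k self

-- Ignore the rest of the stack and return a fixed value.
base :: b -> H a b
base b = H $ const b

run :: H a a -> a
run q = invoke q self

-- zip that folds over both input lists.
zipH :: [a] -> [b] -> [(a, b)]
zipH xs ys = run $ foldr (\x h -> push (first x) h) (base []) xs
                   .#. foldr (\y h -> push (second y) h) (base Nothing) ys
  where
    first _ Nothing         = []
    first x (Just (y, xys)) = (x, y) : xys
    second y xys = Just (y, xys)
```

For example, zipH [1, 2, 3] "ab" should give the same result as zip [1, 2, 3] "ab".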
I tried to understand this elegant solution myself, so I tried to derive the types and evaluation myself. So, we need to write a function:
zip xs ys = foldr step done xs ys
Here we need to derive step and done, whatever they are. Recall foldr's type, instantiated to lists:
foldr :: (a -> state -> state) -> state -> [a] -> state
However our foldr invocation must be instantiated to something like below, because we must accept not one, but two list arguments:
foldr :: (a -> ? -> ?) -> ? -> [a] -> [b] -> [(a,b)]
Because -> is right-associative, this is equivalent to:
foldr :: (a -> ? -> ?) -> ? -> [a] -> ([b] -> [(a,b)])
Our ([b] -> [(a,b)]) corresponds to state type variable in the original foldr type signature, therefore we must replace every occurrence of state with it:
foldr :: (a -> ([b] -> [(a,b)]) -> ([b] -> [(a,b)]))
-> ([b] -> [(a,b)])
-> [a]
-> ([b] -> [(a,b)])
This means that arguments that we pass to foldr must have the following types:
step :: a -> ([b] -> [(a,b)]) -> [b] -> [(a,b)]
done :: [b] -> [(a,b)]
xs :: [a]
ys :: [b]
Recall that foldr (+) 0 [1,2,3] expands to:
1 + (2 + (3 + 0))
Therefore if xs = [1,2,3] and ys = [4,5,6,7], our foldr invocation would expand to:
1 `step` (2 `step` (3 `step` done)) $ [4,5,6,7]
This means that our 1 `step` (2 `step` (3 `step` done)) construct must create a recursive function that would go through [4,5,6,7] and zip up the elements. (Keep in mind that if one of the original lists is longer, the excess values are thrown away.) IOW, our construct must have the type [b] -> [(a,b)].
3 `step` done is our base case, where done is an initial value, like 0 in foldr (+) 0 [1..3]. We don't want to zip anything after 3, because 3 is the final value of xs, so we must terminate the recursion. How do you terminate the recursion over a list in the base case? You return the empty list []. But recall done's type signature:
done :: [b] -> [(a,b)]
Therefore we can't return just [], we must return a function that would ignore whatever it receives. Therefore use const:
done = const [] -- this is equivalent to done = \_ -> []
Now let's start figuring out what step should be. It combines a value of type a with a function of type [b] -> [(a,b)] and returns a function of type [b] -> [(a,b)].
In 3 `step` done, we know that the result value that would later go to our zipped list must be (3,6) (knowing from original xs and ys). Therefore 3 `step` done must evaluate into:
\(y:ys) -> (3,y) : done ys
Remember, we must return a function inside which we somehow zip up the elements. The above code is what makes sense and typechecks.
Now that we have assumed how exactly step should evaluate, let's continue the evaluation. Here's what all the reduction steps in our foldr evaluation look like:
3 `step` done -- becomes
(\(y:ys) -> (3,y) : done ys)
2 `step` (\(y:ys) -> (3,y) : done ys) -- becomes
(\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys)
1 `step` (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) -- becomes
(\(y:ys) -> (1,y) : (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) ys)
The evaluation gives rise to this implementation of step (note that we account for ys running out of elements early by returning an empty list):
step x f = \l -> case l of
  [] -> []
  (y:ys) -> (x,y) : f ys
Thus, the full function zip is implemented as follows:
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr step done xs ys
  where done = const []
        step x f [] = []
        step x f (y:ys) = (x,y) : f ys
P.S.: If you are inspired by elegance of folds, read Writing foldl using foldr and then Graham Hutton's A tutorial on the universality and expressiveness of fold.
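To double-check the expansion used in the derivation, here is a small sketch that compares the foldr call with the hand-expanded chain of step applications. step and done are the definitions derived above, lifted to the top level; viaFoldr and byHand are names made up for this illustration.

```haskell
-- done and step as derived above.
done :: [b] -> [(a, b)]
done = const []

step :: a -> ([b] -> [(a, b)]) -> [b] -> [(a, b)]
step _ _ []       = []
step x f (y : ys) = (x, y) : f ys

-- The fold, and the same computation with foldr expanded by hand.
viaFoldr, byHand :: [(Int, Int)]
viaFoldr = foldr step done [1, 2, 3] [4, 5, 6, 7]
byHand   = (1 `step` (2 `step` (3 `step` done))) [4, 5, 6, 7]
```

Both values should come out as [(1,4),(2,5),(3,6)], with the excess 7 dropped by done.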
A simple approach:
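lZip, rZip :: Foldable t => [b] -> t a -> [(a, b)]
-- Left fold: pair each element of ys with the next unused element of xs.
lZip xs ys = reverse . fst $ foldl f ([], xs) ys
  where f (zs, y:ys') x = ((x, y) : zs, ys')
        f (zs, [])    _ = (zs, [])  -- xs exhausted: ignore the rest of ys
-- Right fold: agrees with zip only when both inputs have the same length.
rZip xs ys = fst $ foldr f ([], reverse xs) ys
  where f x (zs, y:ys') = ((x, y) : zs, ys')
        f _ (zs, [])    = (zs, [])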

Where do these values come from in this Haskell function?

Let's say I have the following function:
sumAll :: [(Int,Int)] -> Int
sumAll xs = foldr (+) 0 (map f xs)
where f (x,y) = x+y
The result of sumAll [(1,1),(2,2),(3,3)] will be 12.
What I don't understand is where the (x,y) values are coming from. Well, I know they come from the xs variable but I don't understand how. I mean, doing the code above directly without the where keyword, it would be something like this:
sumAll xs = foldr (+) 0 (map (\(x,y) -> x+y) xs)
And I can't understand how, in the top code, the f variable and the (x,y) variables represent the (\(x,y) -> x+y) lambda expression.
Hopefully this will help. The key is that f is applied to the elements of the list, which are pairs.
sumAll [(1,1),(2,2),(3,3)]
-- definition of sumAll
= foldr (+) 0 (map f [(1,1),(2,2),(3,3)])
-- application of map
= foldr (+) 0 (f (1,1) : map f [(2,2),(3,3)])
-- application of foldr
= f (1,1) + foldr (+) 0 (map f [(2,2),(3,3)])
-- application of f
= 2 + foldr (+) 0 (map f [(2,2),(3,3)])
-- application of map
= 2 + foldr (+) 0 (f (2,2) : map f [(3,3)])
-- application of foldr
= 2 + (f (2,2) + foldr (+) 0 (map f [(3,3)]))
-- application of f
= 2 + (4 + foldr (+) 0 (map f [(3,3)]))
-- application of map
= 2 + (4 + foldr (+) 0 (f (3,3) : map f []))
-- application of foldr
= 2 + (4 + (f (3,3) + foldr (+) 0 (map f [])))
-- application of f
= 2 + (4 + (6 + foldr (+) 0 (map f [])))
-- application of map
= 2 + (4 + (6 + foldr (+) 0 []))
-- application of foldr
= 2 + (4 + (6 + 0))
= 2 + (4 + 6)
= 2 + 10
= 12
In Haskell, functions are first-class values.
This means you can pass functions around like other kinds of data, such as integers and strings.
In your code above you declare 'f' to be a function which takes one argument (a tuple of two values (x,y)) and returns the result of (x + y).
foldr is another function which takes 3 arguments: a binary function (in this case +), a starting value (0) and a list of values to iterate over.
In short 'where f (x,y) = x + y' is just scoped shorthand for
sumAll :: [(Int,Int)] -> Int
sumAll xs = foldr (+) 0 (map myFunctionF xs)
myFunctionF :: (Int,Int) -> Int
myFunctionF (x,y) = x + y
Edit: If you're unsure about how foldr works, check out Haskell Reference Zvon
Below is an example implementation of foldl / map.
import Prelude hiding (foldl, map) -- so these definitions can be loaded as-is
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl _ x [] = x
foldl f x (y:ys) = foldl f (f x y) ys
map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = (f x) : (map f xs)
Not an answer, but I thought I should point out that your function f:
f (x, y) = x + y
can be expressed as
f = uncurry (+)
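To tie the answers together, here is a small sketch with the three spellings side by side: the where-bound f from the question, the lambda, and the uncurry form. The three function names are made up so that all versions can live in one file.

```haskell
-- f bound in a where clause, as in the question.
sumAllWhere :: [(Int, Int)] -> Int
sumAllWhere xs = foldr (+) 0 (map f xs)
  where f (x, y) = x + y

-- The same function written as an anonymous lambda.
sumAllLambda :: [(Int, Int)] -> Int
sumAllLambda xs = foldr (+) 0 (map (\(x, y) -> x + y) xs)

-- The same function built with uncurry.
sumAllUncurry :: [(Int, Int)] -> Int
sumAllUncurry xs = foldr (+) 0 (map (uncurry (+)) xs)
```

In each version map hands every pair in the list to the function, and the (x, y) pattern simply takes that pair apart. All three return 12 for [(1,1),(2,2),(3,3)].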

Resources