Haskell, foldr, and foldl

I've been trying to wrap my head around foldr and foldl for quite some time, and I've decided the following question should settle it for me. Suppose you pass the following list [1,2,3] into the following four functions:
a = foldl (\xs y -> 10*xs -y) 0
b = foldl (\xs y -> y - 10 * xs) 0
c = foldr (\y xs -> y - 10 * xs) 0
d = foldr (\y xs -> 10 * xs -y) 0
The results will be -123, 83, 281, and -321 respectively.
Why is this the case? I know that when you pass [1,2,3,4] into a function defined as
f = foldl (\xs x -> xs ++ [x]) []
it gets expanded to ((([] ++ [1]) ++ [2]) ++ [3]) ++ [4]
In the same vein, what do the above functions a, b, c, and d get expanded to?

I think the two images on Haskell Wiki's fold page explain it quite nicely.
Since your operations are not associative, the results of foldr and foldl will not be the same, whereas with an associative operation such as multiplication they would:
Prelude> foldl1 (*) [1..3]
6
Prelude> foldr1 (*) [1..3]
6
Using scanl and scanr to get a list including the intermediate results is a good way to see what happens:
Prelude> scanl1 (*) [1..3]
[1,2,6]
Prelude> scanr1 (*) [1..3]
[6,6,3]
So in the first case we have ((1 * 2) * 3), whereas in the second case it's (1 * (2 * 3)).
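Subtraction makes the difference in grouping easy to see. Here is a minimal sketch; the names leftDiff, rightDiff, leftProd and rightProd are made up for this illustration:

```haskell
-- With subtraction the grouping chosen by the fold changes the result.
leftDiff :: [Int] -> Int
leftDiff = foldl (-) 0    -- ((0 - 1) - 2) - 3 for [1,2,3]

rightDiff :: [Int] -> Int
rightDiff = foldr (-) 0   -- 1 - (2 - (3 - 0)) for [1,2,3]

-- With multiplication both groupings give the same product.
leftProd, rightProd :: [Int] -> Int
leftProd  = foldl (*) 1   -- ((1 * 1) * 2) * 3 for [1,2,3]
rightProd = foldr (*) 1   -- 1 * (2 * (3 * 1)) for [1,2,3]
```

For [1,2,3], leftDiff gives -6 and rightDiff gives 2, while both products are 6.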

foldr is built on a simple idea: take a function that combines two arguments, a starting value, and a list, and compute the result of combining the list's elements with that function, beginning from the starting value.
Here's a nice little hint about how to imagine what happens during a foldr call:
foldr (+) 0 [1,2,3,4,5]
=> 1 + (2 + (3 + (4 + (5 + 0))))
We all know that [1,2,3,4,5] = 1:2:3:4:5:[]. All you need to do is replace [] with the starting point and : with whatever function we use. Of course, we can also reconstruct a list in the same way:
foldr (:) [] [1,2,3]
=> 1 : (2 : (3 : []))
We can get more of an understanding of what happens within the function if we look at the signature:
foldr :: (a -> b -> b) -> b -> [a] -> b
We see that the function first gets an element from the list, then the accumulator, and returns what the next accumulator will be. With this, we can write our own foldr function:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f a [] = a
foldr f a (x:xs) = f x (foldr f a xs)
And there you are; you should have a better idea as to how foldr works, so you can apply that to your problems above.
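If you want to see the shape of the expression a fold builds, one trick is to fold with a function that produces a string instead of a number. This is only a sketch; showFoldr and showFoldl are hypothetical helper names, and the "+" in the output is just text:

```haskell
-- Render the expression a right fold builds: every (:) becomes " + "
-- and the final [] becomes the starting value "0".
showFoldr :: [Int] -> String
showFoldr = foldr (\x acc -> "(" ++ show x ++ " + " ++ acc ++ ")") "0"

-- Render the expression a left fold builds, for comparison.
showFoldl :: [Int] -> String
showFoldl = foldl (\acc x -> "(" ++ acc ++ " + " ++ show x ++ ")") "0"
```

Here showFoldr [1,2,3] yields "(1 + (2 + (3 + 0)))" and showFoldl [1,2,3] yields "(((0 + 1) + 2) + 3)".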

The fold* functions can be seen as looping over the list passed to them, starting from either the end of the list (foldr) or the start of the list (foldl). For each element, the fold passes that element and the current value of the accumulator to the lambda function you wrote. Whatever this function returns is used as the value of the accumulator in the next iteration.
Slightly changing your notation (acc instead of xs) to show a clearer meaning, for the first left fold
a = foldl (\acc y -> 10*acc - y) 0 [1, 2, 3]
= foldl (\acc y -> 10*acc - y) (10*0 - 1) [2, 3]
= foldl (\acc y -> 10*acc - y) (-1) [2, 3]
= foldl (\acc y -> 10*acc - y) (10*(-1) - 2) [3]
= foldl (\acc y -> 10*acc - y) (-12) [3]
= foldl (\acc y -> 10*acc - y) (10*(-12) - 3) []
= foldl (\acc y -> 10*acc - y) (-123) []
= (-123)
And for your first right fold (note the accumulator takes a different position in the arguments to the lambda function)
c = foldr (\y acc -> y - 10*acc) 0 [1, 2, 3]
= foldr (\y acc -> y - 10*acc) (3 - 10*0) [1, 2]
= foldr (\y acc -> y - 10*acc) 3 [1, 2]
= foldr (\y acc -> y - 10*acc) (2 - 10*3) [1]
= foldr (\y acc -> y - 10*acc) (-28) [1]
= foldr (\y acc -> y - 10*acc) (1 - 10*(-28)) []
= foldr (\y acc -> y - 10*acc) 281 []
= 281
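You can check these traces yourself with scanl and scanr, which return every intermediate accumulator. A small sketch, assuming the four functions from the question (with xs renamed to acc); stepsA and stepsC are names I made up:

```haskell
-- The four folds from the question.
a, b, c, d :: [Int] -> Int
a = foldl (\acc y -> 10 * acc - y) 0
b = foldl (\acc y -> y - 10 * acc) 0
c = foldr (\y acc -> y - 10 * acc) 0
d = foldr (\y acc -> 10 * acc - y) 0

-- scanl lists every accumulator of the left fold, starting value first.
stepsA :: [Int] -> [Int]
stepsA = scanl (\acc y -> 10 * acc - y) 0

-- scanr lists every accumulator of the right fold, final result first.
stepsC :: [Int] -> [Int]
stepsC = scanr (\y acc -> y - 10 * acc) 0
```

For [1,2,3], stepsA gives [0,-1,-12,-123] and stepsC gives [281,-28,3,0], matching the accumulator values in the two traces.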

Related

Haskell: Understanding the foldl function

I am learning about folds from 'Learn You a Haskell for Great Good!' by Miran Lipovaca.
For the following example which uses foldl:
sum' :: (Num a) => [a] -> a
sum' xs = foldl (\acc x -> acc + x) 0 xs
ghci> sum' [3,5,2,1]
11
I understand that acc is the accumulator and x is the starting value (the first value from the list xs). I don't quite understand how 0 and xs are passed into the lambda function as parameters - how does the function know that the value of acc is 0 and the value of x is 3? Any insights are appreciated.
Recall the definition of foldl:
foldl f acc [] = acc
foldl f acc (x:xs) = foldl f (f acc x) xs
Now, the best way to understand folds is to walk through the evaluation. So let's start with:
sum' [3,5,2,1]
== foldl (\acc x -> acc + x) 0 [3,5,2,1]
The second line of the definition of the foldl function means this is equivalent to the following:
== foldl (\acc x -> acc + x) ((\acc x -> acc + x) 0 3) [5,2,1]
Now since the lambda expression is applied to parameters, 0 and 3 are passed in as acc and x:
== foldl (\acc x -> acc + x) (0+3) [5,2,1]
And the process repeats:
== foldl (\acc x -> acc + x) ((\acc x -> acc + x) (0+3) 5) [2,1]
== foldl (\acc x -> acc + x) ((0+3)+5) [2,1]
== foldl (\acc x -> acc + x) ((\acc x -> acc + x) ((0+3)+5) 2) [1]
== foldl (\acc x -> acc + x) (((0+3)+5)+2) [1]
== foldl (\acc x -> acc + x) ((\acc x -> acc + x) (((0+3)+5)+2) 1) []
== foldl (\acc x -> acc + x) ((((0+3)+5)+2)+1) []
At this point, evaluation continues according to the first line of the foldl definition:
== ((((0+3)+5)+2)+1)
So to answer your question directly: the function knows the values of acc and x simply because the definition of foldl passes their values to the function as parameters.
It would be helpful to look at how the foldl function is defined:
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f a [] = a
foldl f a (x:xs) = foldl f (f a x) xs
So, if the input list is empty then we just return the accumulator value a. However, if it's not empty then we loop. Within the loop, we update the accumulator value to f a x (i.e. we apply the lambda function f to the current accumulator value and the current element of the list). The result is the new accumulator value.
We also update the value of the list in the loop by removing its first element (because we just processed the first element). We keep processing the remaining elements of the list until there are no elements left, at which point we return the value of the accumulator.
The foldl function is equivalent to a for loop in imperative languages. For example, here's how we could implement foldl in JavaScript:
const result = foldl((acc, x) => acc + x, 0, [3,5,2,1]);
console.log(result);
function foldl(f, a, xs) {
    for (const x of xs) a = f(a, x);
    return a;
}
Hope that elucidates the foldl function.
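The accumulator does not have to be a number; like a loop variable, it can be any value. Two hypothetical examples (reverse' and countAndSum are illustrative names, not library functions):

```haskell
-- Accumulate a reversed list: each step conses the current element
-- onto the front of the accumulator.
reverse' :: [a] -> [a]
reverse' = foldl (\acc x -> x : acc) []

-- Accumulate a pair (count, total) in one pass, like a loop that
-- updates two variables on every iteration.
countAndSum :: [Int] -> (Int, Int)
countAndSum = foldl (\(n, s) x -> (n + 1, s + x)) (0, 0)
```

For example, reverse' [1,2,3] is [3,2,1] and countAndSum [3,5,2,1] is (4,11).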

Understanding the work of foldr and unfoldr

I have a hard time understanding how these bits of code work.
The "map" function must apply the function to every element of the given list and produce a list of the results. So we give it our function f and some list; then, in the lambda expression, our list is split into head "x" and tail "xs", and we apply "f" to x and prepend the result to "xs". But what happens next? How does foldr get its second argument (which is usually some starting value), and what exactly is it? And what is the empty list for?
And the function "rangeTo": we create a lambda expression in which we check whether we are past the end of the range. If we are, we return Nothing; if we are not, we return a pair whose first number is appended to the resulting list and whose second number is used as the next value for "from". Is that all that happens in this function, or am I missing something?
import Data.List (unfoldr)
import Prelude hiding (map)
-- custom map function through foldr
map :: (a -> b) -> [a] -> [b]
map f = foldr (\x xs -> f x : xs) []
-- function to create a list of numbers from the first argument up to (but excluding) the second, with step "step"
rangeTo :: Integer -> Integer -> Integer -> [Integer]
rangeTo from to step = unfoldr (\from -> if from >= to then Nothing else Just (from, from+step)) from
To understand how foldr operates on a list, it helps to write out the definition of foldr as
foldr step z xs
= x1 `step` foldr step z xs1 -- where xs = x1:xs1
= x1 `step` (x2 `step` foldr step z xs2) -- where xs = x1:x2:xs2
= x1 `step` (x2 `step` ... (xn `step` foldr step z [])...) -- where xs = x1:x2:...:xn:[]
and
foldr step z [] = z
For your case:
foldr (\x xs -> f x : xs) []
where
step = (\x xs -> f x : xs)
z = []
From the definition of foldr, the innermost expression
(xn `step` foldr step z [])
is evaluated first, that is
xn `step` foldr step z []
= step xn (foldr step z [])
= step xn z
= step xn [] -- z = []
= f xn : [] -- step = (\x xs -> f x : xs)
= [f xn]
What happens next? The evaluation continues as
x(n-1) `step` (xn `step` foldr step z [])
= step x(n-1) [f xn]
= f x(n-1) : [f xn]
= [f x(n-1), f xn]
until:
x1 `step` (x2 ...
= step x1 [f x2, ..., f xn]
= [f x1, f x2, ... f xn]
"So we give it our function f and some list; then, in the lambda expression, our list is split into head "x" and tail "xs", and we apply "f" to x and prepend the result to "xs"."
This is not the case. Look closely at the implementation:
map :: (a -> b) -> [a] -> [b]
map f = foldr (\x xs -> f x : xs) []
There is an implied variable here, we can add it back in:
map :: (a -> b) -> [a] -> [b]
map f ls = foldr (\x xs -> f x : xs) [] ls
map takes two arguments, a function f and a list ls. It passes ls to foldr as the list to fold over, and it passes [] as the starting accumulator value. The lambda takes a list element x and an accumulator xs (initially []), and returns a new accumulator f x : xs. It does not perform a head or tail anywhere; x and xs were never part of the same list.
Let's step through the evaluation to see how this function works:
map (1+) [2, 4, 8]
foldr (\x xs -> (1+) x : xs) [] [2, 4, 8] -- x = 8, xs = []
foldr (\x xs -> (1+) x : xs) [9] [2, 4] -- x = 4, xs = [9]
foldr (\x xs -> (1+) x : xs) [5, 9] [2] -- x = 2, xs = [5, 9]
foldr (\x xs -> (1+) x : xs) [3, 5, 9] [] -- xs = [3, 5, 9]
map (1+) [2, 4, 8] == [3, 5, 9]
The empty list accumulates values passed through f, starting from the right end of the input list.
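The same pattern works for other list functions. A short sketch, using primed names (map', filter') so they do not clash with the Prelude versions:

```haskell
-- map expressed with foldr: apply f to each element and cons the
-- result onto the list built from the elements to its right.
map' :: (a -> b) -> [a] -> [b]
map' f = foldr (\x acc -> f x : acc) []

-- filter built the same way: keep x only when the predicate holds.
filter' :: (a -> Bool) -> [a] -> [a]
filter' p = foldr (\x acc -> if p x then x : acc else acc) []
```

In both cases [] is the result for an empty input, and the lambda decides how one element is combined with the already-processed rest.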
"And the function "rangeTo": we create a lambda expression in which we check whether we are past the end of the range. If we are, we return Nothing; if we are not, we return a pair whose first number is appended to the resulting list and whose second number is used as the next value for "from". Is that all that happens in this function, or am I missing something?"
Yes, that's exactly what's going on. The lambda takes an accumulator, and returns the next value to put in the list and a new accumulator, or Nothing if the list should end. The accumulator in this case is the current value in the list. The list should end if that value is past the end of the range. Otherwise it calculates the next accumulator by adding the step.
Again, we can step through the evaluation:
rangeTo 3 11 2 -- from = 3, to = 11, step = 2
Just (3, 5) -- from = 3
Just (5, 7) -- from = 3 + step = 5
Just (7, 9) -- from = 5 + step = 7
Just (9, 11) -- from = 7 + step = 9
Nothing -- from = 9 + step = 11, 11 >= to
rangeTo 3 11 2 == [3, 5, 7, 9]
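Here is a sketch of the same function with the seed given its own name (cur) so it is not confused with the outer from, plus a second unfoldr example. rangeTo', next and digitsRev are names I made up for illustration:

```haskell
import Data.List (unfoldr)

-- Same idea as rangeTo, with the seed named separately from the start value.
-- Note: like the original, this never terminates if step <= 0 and from < to.
rangeTo' :: Integer -> Integer -> Integer -> [Integer]
rangeTo' from to step = unfoldr next from
  where
    next cur
      | cur >= to = Nothing                  -- stop: the seed has reached the end
      | otherwise = Just (cur, cur + step)   -- emit cur, continue from cur + step

-- Decimal digits of a positive number, least significant digit first.
-- Returns [] for zero and for negative input.
digitsRev :: Integer -> [Integer]
digitsRev = unfoldr (\n -> if n <= 0 then Nothing else Just (n `mod` 10, n `div` 10))
```

Each Just (value, seed) pair adds value to the output list and feeds seed into the next call; Nothing ends the list.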

How does fold distinguish x from xs in Haskell?

sum' :: (Num a) => [a] -> a
sum' xs = foldl (\acc x -> acc + x) 0 xs
There is no pattern like x:xs. xs is a list. In the lambda function, how does the expression acc + x know that x is an element of xs?
In Haskell - like in many programming languages - the name of a variable does not matter. For Haskell it does not matter if you write xs, or x, or acc, or use another identifier. What matters here is actually the position of the arguments.
The foldl :: (a -> b -> a) -> a -> [b] -> a is a function that takes as input a function with type a -> b -> a, followed by an object of type a, followed by a list of elements of type b, and returns an object of type a.
Semantically, the second parameter of the function will be bound to the elements of the list. So if you wrote \x acc -> x + acc, acc would be bound to the elements of the list, and x would be the accumulator.
The reason why this binds is because foldl is implemented like:
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
foldl is thus itself defined in Haskell: it binds the function to f and the initial value to z, and it recurses on the tail of the list with (f z x) as the new initial value until the list is exhausted.
You can write the sum more elegantly as:
sum' :: Num n => [n] -> n
sum' = foldl (+) 0
so here there are no explicit variables in use at all.
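To convince yourself that only the position of the arguments matters, here is a small sketch (sumA, sumB, countElems and lastOr are hypothetical names):

```haskell
-- Both definitions compute the same sum; only the parameter names differ.
sumA, sumB :: [Int] -> Int
sumA = foldl (\acc x -> acc + x) 0
sumB = foldl (\x acc -> x + acc) 0   -- names swapped: x is now the accumulator

-- Ignoring the second argument (the list element) counts the elements.
countElems :: [a] -> Int
countElems = foldl (\n _ -> n + 1) 0

-- Ignoring the first argument (the accumulator) keeps the last element seen;
-- the starting value is returned only for an empty list.
lastOr :: a -> [a] -> a
lastOr = foldl (\_ x -> x)
```

foldl always passes the accumulator first and the current element second, whatever you call them.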
It doesn't "know" anything like that - there's no magic going on here.
The definition of foldl is equivalent to:
foldl f acc (x:xs) = foldl f (f acc x) xs
foldl _ acc [] = acc
So going through a simple example using your sum' function:
We start with
sum' [1,2,3]
substituting the definition of sum' we get
foldl (\acc x -> acc + x) 0 [1,2,3]
substituting the definition of foldl (first case):
foldl (\acc x -> acc + x) ((\acc x -> acc + x) 0 1) [2,3]
evaluating the function application of your lambda, we get
foldl (\acc x -> acc + x) (0 + 1) [2,3]
substituting foldl again...
foldl (\acc x -> acc + x) ((\acc x -> acc + x) (0+1) 2) [3]
and evaluating the accumulator:
foldl (\acc x -> acc + x) ((0 + 1) + 2) [3]
and substituting foldl again...
foldl (\acc x -> acc + x) ((\acc x -> acc + x) ((0 + 1) + 2) 3) []
again, evaluating the accumulator:
foldl (\acc x -> acc + x) (((0 + 1) + 2) + 3) []
now we get to the second (terminating) case of foldl because we apply it to an empty list and are left with only:
(((0 + 1) + 2 ) + 3)
which we can of course evaluate to get 6.
As you can see, there's no magic involved here: x is just a name you gave to a function argument. You could've named it user8314628 instead and it would've worked the same way. What's binding the value of the head of the list to that argument isn't any pattern matching you do yourself, but what foldl actually does with the list.
Note that you can evaluate any Haskell expression using this step-by-step process. You usually won't have to, but it's useful to do it a couple of times with functions that are unfamiliar to you and do more or less complicated things.
"How does the expression acc + x know that x is an element of xs?"
It doesn't. It computes a sum of whatever is passed to it.
Note that (\acc x -> acc + x) can be written simply as (+).
A fold takes each consecutive value of the input list and hides the step of passing the remainder of the list back to the function. If you wrote your own sum' function with explicit recursion, you would have to pass the remainder of the list back to your function yourself, and you would also have to pass an accumulator along to keep a running total. A fold does not make the list processing (taking the first value and passing on the remainder) explicit. What it does make explicit is the accumulator, which in the case of a sum holds the running total. The accumulator is explicit because different recursive functions may do different things with it.
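For comparison, this is roughly what the hand-written versions could look like next to the fold. It is only a sketch; sumRec, sumAcc and sumFold are made-up names:

```haskell
-- Explicit recursion: we pattern-match the list ourselves and pass
-- the remainder (xs) back to the function.
sumRec :: Num a => [a] -> a
sumRec []     = 0
sumRec (x:xs) = x + sumRec xs

-- Explicit recursion with a running total threaded through a helper.
sumAcc :: Num a => [a] -> a
sumAcc = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs

-- With a fold, the pattern matching is hidden; only the accumulator
-- step (+) and the starting value 0 are written out.
sumFold :: Num a => [a] -> a
sumFold = foldl (+) 0
```

All three give 11 for [3,5,2,1].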

How does fold work for an empty list?

When we fold a list with one or more elements inside as done below:
foldr (+) 0 [1,2,3]
We get:
foldr (+) 0 (1 : 2 : 3 : [])
= 1 + (2 + (3 + 0)) -- 6
Now when the list is empty:
foldr (+) 0 []
Result: foldr (+) 0 ([])
Since (+) is a binary operator, it needs two arguments, but here we end up with only (+) 0. How does this result in 0 rather than an error about a partially applied function?
Short answer: you get the initial value z.
If you give foldl or foldr an empty list, then it returns the initial value. foldr :: (a -> b -> b) -> b -> t a -> b works like:
foldr f z [x1, x2, ..., xn] == x1 `f` (x2 `f` ... (xn `f` z)...)
So since there are no x1, ..., xn the function is never applied, and z is returned.
We can also inspect the source code:
foldr :: (a -> b -> b) -> b -> [a] -> b
-- foldr _ z [] = z
-- foldr f z (x:xs) = f x (foldr f z xs)
{-# INLINE [0] foldr #-}
-- Inline only in the final stage, after the foldr/cons rule has had a chance
-- Also note that we inline it when it has *two* parameters, which are the
-- ones we are keen about specialising!
foldr k z = go
  where
    go [] = z
    go (y:ys) = y `k` go ys
So if we give foldr an empty list, then go will immediately work on that empty list, and return z, the initial value.
A cleaner syntax (and a bit less efficient, as is written in the comment of the function) would thus be:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr _ z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Note that, depending on the implementation of f, it is possible to foldr over infinite lists: if at some point f returns a value without looking at its second argument (the folded rest of the list), then the recursive part is never evaluated.
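A few small examples of both points, as a sketch (emptySum, anyAbove and takeWhile' are illustrative names, not library functions):

```haskell
-- With an empty list the combining function is never called,
-- so the result is just the starting value.
emptySum :: Int
emptySum = foldr (+) 0 []

-- (||) does not inspect its second argument when the first is True,
-- so this fold can stop early, even on an infinite list.
anyAbove :: Int -> [Int] -> Bool
anyAbove limit = foldr (\x rest -> x > limit || rest) False

-- takeWhile written with foldr: once p fails, rest is never evaluated.
takeWhile' :: (a -> Bool) -> [a] -> [a]
takeWhile' p = foldr (\x rest -> if p x then x : rest else []) []
```

For instance, anyAbove 10 [1..] returns True and takeWhile' (< 4) [1..] returns [1,2,3], even though [1..] is infinite.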

Where do these values come from in this Haskell function?

Let's say I have the following function:
sumAll :: [(Int,Int)] -> Int
sumAll xs = foldr (+) 0 (map f xs)
  where f (x,y) = x+y
The result of sumAll [(1,1),(2,2),(3,3)] will be 12.
What I don't understand is where the (x,y) values are coming from. Well, I know they come from the xs variable but I don't understand how. I mean, doing the code above directly without the where keyword, it would be something like this:
sumAll xs = foldr (+) 0 (map (\(x,y) -> x+y) xs)
And I can't understand how, in the top code, f and the (x,y) variables represent the (\(x,y) -> x+y) lambda expression.
Hopefully this will help. The key is that f is applied to the elements of the list, which are pairs.
sumAll [(1,1),(2,2),(3,3)]
-- definition of sumAll
= foldr (+) 0 (map f [(1,1),(2,2),(3,3)])
-- application of map
= foldr (+) 0 (f (1,1) : map f [(2,2),(3,3)])
-- application of foldr
= f (1,1) + foldr (+) 0 (map f [(2,2),(3,3)])
-- application of f
= 2 + foldr (+) 0 (map f [(2,2),(3,3)])
-- application of map
= 2 + foldr (+) 0 (f (2,2) : map f [(3,3)])
-- application of foldr
= 2 + (f (2,2) + foldr (+) 0 (map f [(3,3)]))
-- application of f
= 2 + (4 + foldr (+) 0 (map f [(3,3)]))
-- application of map
= 2 + (4 + foldr (+) 0 (f (3,3) : map f []))
-- application of foldr
= 2 + (4 + (f (3,3) + foldr (+) 0 (map f [])))
-- application of f
= 2 + (4 + (6 + foldr (+) 0 (map f [])))
-- application of map
= 2 + (4 + (6 + foldr (+) 0 []))
-- application of foldr
= 2 + (4 + (6 + 0))
= 2 + (4 + 6)
= 2 + 10
= 12
In Haskell, functions are first-class values.
This means you can pass functions around like other kinds of data, such as integers and strings.
In your code above you declare 'f' to be a function which takes one argument (a tuple of two values (x,y)) and returns the result of (x + y).
foldr is another function, which takes three arguments: a binary function (in this case +), a starting value (0), and a list of values to iterate over.
In short, 'where f (x,y) = x + y' is just a locally scoped shorthand for
sumAll :: [(Int,Int)] -> Int
sumAll xs = foldr (+) 0 (map myFunctionF xs)
myFunctionF :: (Int,Int) -> Int
myFunctionF (x,y) = x + y
Edit: If you're unsure about how foldr works, check out Haskell Reference Zvon.
Below is an example implementation of foldl / map.
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl _ x [] = x
foldl f x (y:ys) = foldl f (f x y) ys
map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = (f x) : (map f xs)
Not an answer, but I thought I should point out that your function f:
f (x, y) = x + y
can be expressed as
f = uncurry (+)
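Putting the three spellings side by side may help; they should all behave the same. The names sumAllWhere, sumAllLambda and sumAllUncurry are made up so the versions can coexist in one file:

```haskell
-- Version from the question: f is a local function defined in a where clause.
sumAllWhere :: [(Int, Int)] -> Int
sumAllWhere xs = foldr (+) 0 (map f xs)
  where
    f (x, y) = x + y   -- pattern-matches each pair that map hands over

-- The same function with the helper inlined as a lambda.
sumAllLambda :: [(Int, Int)] -> Int
sumAllLambda xs = foldr (+) 0 (map (\(x, y) -> x + y) xs)

-- The same function using uncurry, which turns (+) into a function on pairs.
sumAllUncurry :: [(Int, Int)] -> Int
sumAllUncurry = foldr (+) 0 . map (uncurry (+))
```

Each of them returns 12 for [(1,1),(2,2),(3,3)].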
