Can anybody explain how foldr works?
Take these examples:
Prelude> foldr (-) 54 [10, 11]
53
Prelude> foldr (\x y -> (x+y)/2) 54 [12, 4, 10, 6]
12.0
I am confused about these executions. Any suggestions?
The easiest way to understand foldr is to rewrite the list you're folding over without the sugar.
[1,2,3,4,5] => 1:(2:(3:(4:(5:[]))))
Now, what foldr f x does is replace each : with f in infix form and [] with x, and then evaluate the result.
For example:
sum [1,2,3] = foldr (+) 0 [1,2,3]
[1,2,3] === 1:(2:(3:[]))
so
sum [1,2,3] === 1+(2+(3+0)) = 6
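As a rough check of the "replace : with f and [] with x" rule, here is a minimal sketch. The names sumByHand, sumViaFoldr and rebuild are made up for illustration and are not part of any library.

```haskell
-- The list 1:(2:(3:[])) with (:) replaced by (+) and [] replaced by 0,
-- written out by hand.
sumByHand :: Int
sumByHand = 1 + (2 + (3 + 0))

-- The same replacement, performed by foldr.
sumViaFoldr :: [Int] -> Int
sumViaFoldr = foldr (+) 0

-- Replacing (:) with (:) and [] with [] rebuilds the list unchanged.
rebuild :: [a] -> [a]
rebuild = foldr (:) []
```

In GHCi, sumViaFoldr [1,2,3] gives 6 (the same as sumByHand), and rebuild [1,2,3] gives [1,2,3] back.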
foldr begins at the right-hand end of the list and combines each list entry with the accumulator value using the function you give it. The result is the final value of the accumulator after "folding" in all the list elements. Its type is:
foldr :: (a -> b -> b) -> b -> [a] -> b
and from this you can see that the list element (of type a) is the first argument to the given function, and the accumulator (of type b) is the second.
For your first example:
Starting accumulator = 54
11 - 54 = -43
10 - (-43) = 53
      ^ Result from the previous line
^ Next list item
So the answer you got was 53.
The second example:
Starting accumulator = 54
(6 + 54) / 2 = 30
(10 + 30) / 2 = 20
(4 + 20) / 2 = 12
(12 + 12) / 2 = 12
So the result is 12.
Edit: I meant to add, that's for finite lists. foldr can also work on infinite lists but it's best to get your head around the finite case first, I think.
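If you want to see the intermediate accumulator values from the walkthroughs above without working them out by hand, scanr from the Prelude returns all of them. This is only a sketch; the names stepsMinus and stepsAverage are made up for illustration.

```haskell
-- scanr keeps every intermediate accumulator that foldr would produce.
-- The last entry is the starting accumulator; the head is foldr's result.
stepsMinus :: [Int]
stepsMinus = scanr (-) 54 [10, 11]

stepsAverage :: [Double]
stepsAverage = scanr (\x y -> (x + y) / 2) 54 [12, 4, 10, 6]
```

In GHCi, stepsMinus is [53,-43,54] and stepsAverage is [12.0,12.0,20.0,30.0,54.0]. Read from right to left, these are exactly the lines of the two worked examples.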
It helps to understand the distinction between foldr and foldl. Why is foldr called "fold right"?
Initially I thought it was because it consumed elements from right to left. Yet both foldr and foldl consume the list from left to right.
foldl evaluates from left to right (left-associative)
foldr evaluates from right to left (right-associative)
We can make this distinction clear with an example that uses an operator for which associativity matters. We could use a human example, such as the operator, "eats":
foodChain = (human : (shark : (fish : (algae : []))))
foldl step [] foodChain
  where step eater food = eater `eats` food -- note that "eater" is the accumulator and "food" is the element
foldl eats [] (human : (shark : (fish : (algae : []))))
  == foldl eats (human `eats` shark) (fish : (algae : [])) -- the seed [] is dropped for readability
  == foldl eats ((human `eats` shark) `eats` fish) (algae : [])
  == foldl eats (((human `eats` shark) `eats` fish) `eats` algae) []
  == (((human `eats` shark) `eats` fish) `eats` algae)
The semantics of this foldl is: A human eats some shark, and then the same human who has eaten shark then eats some fish, etc. The eater is the accumulator.
Contrast this with:
foldr step [] foodChain
  where step eater food = eater `eats` food -- note that "eater" is the element and "food" is the accumulator
foldr eats [] (human : (shark : (fish : (algae : []))))
  == human `eats` (foldr eats [] (shark : (fish : (algae : []))))
  == human `eats` (shark `eats` (foldr eats [] (fish : (algae : []))))
  == human `eats` (shark `eats` (fish `eats` (foldr eats [] (algae : []))))
  == human `eats` (shark `eats` (fish `eats` algae)) -- the seed [] is dropped for readability
The semantics of this foldr is: A human eats a shark which has already eaten a fish, which has already eaten some algae. The food is the accumulator.
Both foldl and foldr "peel off" eaters from left to right, so that's not the reason we refer to foldl as "left fold". Instead, the order of evaluation matters.
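The two nestings can be made visible with a small sketch. Here eats is a hypothetical stand-in that only renders the application as a string, and foldl1/foldr1 are used so that no seed value is needed; all names are assumptions for illustration.

```haskell
-- Hypothetical stand-in for `eats`: it does not compute anything,
-- it just renders the application as a parenthesised string.
eats :: String -> String -> String
eats eater food = "(" ++ eater ++ " eats " ++ food ++ ")"

foodChain :: [String]
foodChain = ["human", "shark", "fish", "algae"]

-- foldl1 seeds the fold with the first element, foldr1 with the last one.
leftNested :: String
leftNested = foldl1 eats foodChain

rightNested :: String
rightNested = foldr1 eats foodChain
```

In GHCi, leftNested is "(((human eats shark) eats fish) eats algae)" and rightNested is "(human eats (shark eats (fish eats algae)))".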
Think about foldr's very definition:
-- if the list is empty, the result is the initial value z
foldr f z [] = z
-- if not, apply f to the first element and the result of folding the rest
foldr f z (x:xs) = f x (foldr f z xs)
So for example foldr (-) 54 [10,11] must equal (-) 10 (foldr (-) 54 [11]), i.e. expanding again, equal (-) 10 ((-) 11 54). So the inner operation is 11 - 54, that is, -43; and the outer operation is 10 - (-43), that is, 10 + 43, therefore 53 as you observe. Go through similar steps for your second case, and again you'll see how the result forms!
foldr means fold from the right, so foldr (-) 0 [1, 2, 3] produces (1 - (2 - (3 - 0))). In comparison foldl produces (((0 - 1) - 2) - 3).
When the operator is not both associative and commutative, foldl and foldr can give different results.
In your case, the first example expands to (10 - (11 - 54)) which gives 53.
An easy way to understand foldr is this: It replaces every list constructor with an application of the function provided. Your first example would translate to:
10 - (11 - 54)
from:
10 : (11 : [])
A good piece of advice that I got from the Haskell Wikibook might be of some use here:
As a rule you should use foldr on lists that might be infinite or where the fold is building up a data structure, and foldl' if the list is known to be finite and comes down to a single value. foldl (without the tick) should rarely be used at all.
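A small sketch of that rule of thumb, assuming foldl' is imported from Data.List; the names sumStrict and doubleAll are made up for illustration.

```haskell
import Data.List (foldl')

-- A finite list reduced to a single value: foldl' forces the
-- accumulator at every step, so no chain of thunks builds up.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- Building up a data structure: foldr produces the result list
-- incrementally, so a consumer can stop whenever it likes.
doubleAll :: [Int] -> [Int]
doubleAll = foldr (\x acc -> 2 * x : acc) []
```

In GHCi, sumStrict [1..1000000] gives 500000500000, and take 3 (doubleAll [1..]) gives [2,4,6] even though the input list is infinite.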
I've always thought http://foldr.com to be a fun illustration. See the Lambda the Ultimate post.
Careful readings of -- and comparisons between -- the other answers provided here should already make this clear, but it's worth noting that the accepted answer might be a bit misleading to beginners. As other commenters have noted, the computation foldr performs in Haskell does not "begin at the right hand end of the list"; otherwise, foldr could never work on infinite lists (which it does in Haskell, under the right conditions).
The source code for Haskell's foldr function should make this clear:
foldr k z = go
  where
    go [] = z
    go (y:ys) = y `k` go ys
Each recursive computation combines the left-most atomic list item with a recursive computation over the tail of the list, viz:
a[1] `f` (a[2] `f` (a[3] `f` ... (a[n-1] `f` a[n]) ...))
where a[n] is the initial accumulator.
Because reduction in Haskell is done lazily, it actually begins at the left. This is what we mean by "lazy evaluation," and it's famously a distinguishing feature of Haskell. It's also important in understanding the operation of Haskell's foldr: because foldr builds up and reduces computations recursively from the left, binary operators that can short-circuit have an opportunity to do so, allowing infinite lists to be reduced by foldr under appropriate circumstances.
It will lead to far less confusion to beginners to say rather that the r ("right") and l ("left") in foldr and foldl refer to right associativity and left associativity and either leave it at that, or try and explain the implications of Haskell's lazy evaluation mechanism.
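To make the short-circuiting concrete, here is a minimal sketch with two folds that terminate on an infinite list. The names anyAboveTen and takeWhileR are hypothetical.

```haskell
-- (||) never looks at its second argument when the first is True,
-- so the fold stops at the first element greater than 10.
anyAboveTen :: [Int] -> Bool
anyAboveTen = foldr (\x rest -> x > 10 || rest) False

-- takeWhile written with foldr: once p fails, `rest` is never demanded.
takeWhileR :: (a -> Bool) -> [a] -> [a]
takeWhileR p = foldr (\x rest -> if p x then x : rest else []) []
```

In GHCi, anyAboveTen [1..] returns True and takeWhileR (< 4) [1..] returns [1,2,3]; neither would terminate if foldr really had to start at the right-hand end of the list.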
To work through your examples, following the foldr source code, we build up the following expression:
Prelude> foldr (-) 54 [10, 11]
->
10 - (11 - 54) = 53
And again:
foldr (\x y -> (x + y) / 2) 54 [12, 4, 10, 6]
->
(12 + (4 + (10 + (6 + 54) / 2) / 2) / 2) / 2 = 12
I think that implementing map, foldl and foldr in a simple fashion helps explain how they work. Worked examples also aid in our understanding.
myMap f [] = []
myMap f (x:xs) = f x : myMap f xs
myFoldL f i [] = i
myFoldL f i (x:xs) = myFoldL f (f i x) xs
> tail [1,2,3,4] ==> [2,3,4]
> last [1,2,3,4] ==> 4
> head [1,2,3,4] ==> 1
> init [1,2,3,4] ==> [1,2,3]
-- where f is a function,
-- acc is an accumulator which is given initially
-- l is a list.
--
myFoldR' f acc [] = acc
myFoldR' f acc l = myFoldR' f (f (last l) acc) (init l) -- the element is f's first argument, as in foldr
myFoldR f z [] = z
myFoldR f z (x:xs) = f x (myFoldR f z xs)
> map (\x -> x/2) [12,4,10,6] ==> [6.0,2.0,5.0,3.0]
> myMap (\x -> x/2) [12,4,10,6] ==> [6.0,2.0,5.0,3.0]
> foldl (\x y -> (x+y)/2) 54 [12, 4, 10, 6] ==> 10.125
> myFoldL (\x y -> (x+y)/2) 54 [12, 4, 10, 6] ==> 10.125
foldl from above: Starting accumulator = 54
(12 + 54) / 2 = 33
(4 + 33) / 2 = 18.5
(10 + 18.5) / 2 = 14.25
(6 + 14.25) / 2 = 10.125
> foldr (++) "5" ["1", "2", "3", "4"] ==> "12345"
> foldl (++) "5" ["1", "2", "3", "4"] ==> "51234"
> foldr (\x y -> (x+y)/2) 54 [12,4,10,6] ==> 12
> myFoldR' (\x y -> (x+y)/2) 54 [12,4,10,6] ==> 12
> myFoldR (\x y -> (x+y)/2) 54 [12,4,10,6] ==> 12
foldr from above: Starting accumulator = 54
(6 + 54) / 2 = 30
(10 + 30) / 2 = 20
(4 + 20) / 2 = 12
(12 + 12) / 2 = 12
OK, let's look at the arguments:
a function (that takes a list element and a value (a possible partial result) of the same kind of the value it returns);
a specification of the initial result for the empty list special case
a list;
return value:
some final result
It first applies the function to the last element in the list and the empty list result. It then reapplies the function with this result and the previous element, and so forth until it takes some current result and the first element of the list to return the final result.
Fold "folds" a list around an initial result using a function that takes an element and some previous folding result. It repeats this for each element. So, foldr does this starting at the end of the list, or the right side of it.
foldr f emptyresult [1,2,3,4] turns into
f(1, f(2, f(3, f(4, emptyresult) ) ) ) . Now just follow parenthesis in evaluation and that's it.
One important thing to notice is that the supplied function f must handle its own return value as its second argument which implies both must have the same type.
Source: my post where I look at it from an imperative uncurried javascript perspective if you think it might help.
The images in this wiki page visualize the idea of foldr (and foldl also):
https://en.wikipedia.org/wiki/Fold_%28higher-order_function%29
For example, the result of foldr (-) 0 [1,2,3] is 2. It can be visualized as:
  -
 / \
1   -
   / \
  2   -
     / \
    3   0
That is (from bottom to the top):
1 - ( -1 ) = 2
    2 - ( 3 )
        3 - 0
So foldr (\x y -> (x+y)/2) 54 [12, 4, 10, 6] is being computed through:
12 `f` (12.0) = 12.0
       4 `f` (20.0)
             10 `f` (30.0)
                    6 `f` 54
On my functional programming exam, I had the following question:
How many times is (+ 1) function computed in the following code?
(map (+ 1) [1 .. 10]) !! 5
where the index function is defined like this:
(h:_) !! 0 = h
(_:t) !! x = t !! (x-1)
I would say 6 times, but the correct answer seems to be 1, and I cannot understand why. I could not find a good enough explanation of lazy evaluation in Haskell, so I would like to know what is the correct answer and why. Thank you in advance!
How many times is (+ 1) function computed in the following code?
It is calculated only once. map does not force the calculation of f xi for the elements in the result list. These calculations are postponed (just like everything else in Haskell); only when we need the value of a specific item do we compute it.
map is specified in chapter 9 of the Haskell'10 report as:
-- Map and append
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
There are no seq, bang patterns, etc. here to force evaluation of f x, so the map function will indeed "yield" an f x, but without evaluating f x, it is postponed until it is necessary (and it might happen that we are not interested in some of these values, and thus can save some CPU cycles).
We can take a look how Haskell will evaluate this:
(!!) (map (+ 1) [1 .. 10]) 5
-> (!!) ((+1) 1 : map (+1) [2..10]) 5
-> (!!) (map (+1) [2..10]) 4
-> (!!) ((+1) 2 : map (+1) [3..10]) 4
-> (!!) (map (+1) [3..10]) 3
-> (!!) ((+1) 3 : map (+1) [4..10]) 3
-> (!!) (map (+1) [4..10]) 2
-> (!!) ((+1) 4 : map (+1) [5..10]) 2
-> (!!) (map (+1) [5..10]) 1
-> (!!) ((+1) 5 : map (+1) [6..10]) 1
-> (!!) (map (+1) [6..10]) 0
-> (!!) ((+1) 6 : map (+1) [7..10]) 0
-> (+1) 6
-> 7
This is because map f [x1, x2, ..., xn] eventually maps to a list [f x1, f x2, ..., f xn], but it does not compute f xi for the elements; that computation is postponed until we actually need the value in that list and do something with it (like printing it).
This can result in a significant performance boost, given f is an expensive function, and we only need the value of a small amount of elements in the list.
Let's test it by doing something horrible. You'll need to import the Debug.Trace module for this.
ghci> (map (\x -> trace "Performing..." (x + 1)) [1..10]) !! 5
Now, we'll get that totally safe IO action to happen every time the lambda expression is called. When we run this in GHCi, we get
Performing...
7
So only once.
As a sanity check, we could remove the !! 5 bit.
ghci> map (\x -> trace "Performing..." (x + 1)) [1..10]
[Performing...
2,Performing...
3,Performing...
4,Performing...
5,Performing...
6,Performing...
7,Performing...
8,Performing...
9,Performing...
10,Performing...
11]
So it's definitely happening 10 times when we ask for the whole list.
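Another way to convince yourself, without Debug.Trace, is to put elements in the list that would crash if (+ 1) were ever applied to them. This is only a sketch; the names thirdElement and spineLength are made up.

```haskell
-- Elements 0 and 1 would raise an error if (+ 1) were applied to them
-- and the result demanded. Indexing with !! only walks the list spine
-- and then forces the single element it returns.
thirdElement :: Int
thirdElement = map (+ 1) [error "never forced", error "never forced", 41] !! 2

-- length only needs the spine, so no element is evaluated at all.
spineLength :: Int
spineLength = length (map (+ 1) [undefined, undefined, undefined :: Int])
```

In GHCi, thirdElement evaluates to 42 and spineLength to 3, and neither raises an exception.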
First thing, I understand (almost) fold functions. Given the function I can work out easily what will happen and how to use it.
The question is about the way it is implemented, which leads to a slight difference in the function definition that took some time to understand. To make matters worse, most examples for folds have the same type for the list and the default case, which does not help in understanding, as these can be different.
Usage:
foldr f a xs
foldl f a xs
where a is the default case
definition:
foldr: (a -> b -> b) -> b -> [a] -> b
foldl: (a -> b -> a) -> a -> [b] -> a
In definition I understand a is the first variable to be passed and b second variable to be passed to function.
Eventually I understood that this is happening due to the fact that when f finally gets evaluated in foldr it is implemented as f x a (i.e. default case is passed as second parameter). But for foldl it is implemented as f a x (i.e. default case is passed as first parameter).
Would not the function definition be same if we had passed the default case as same (either 1st parameter in both or 2nd) in both cases? Was there any particular reason for this choice?
To make things a little clearer, I will rename a couple type variables in your foldl signature...
foldr: (a -> b -> b) -> b -> [a] -> b
foldl: (b -> a -> b) -> b -> [a] -> b
... so that in both cases a stands for the type of the list elements, and b for that of the fold results.
The key difference between foldr and foldl can be seen by expanding their recursive definitions. The applications of f in foldr associate to the right, and the initial value shows up to the right of the elements:
foldr f a [x,y,z] = x `f` (y `f` (z `f` a))
With foldl, it is the other way around: the association is to the left, and the initial value shows up to the left (as Silvio Mayolo emphasises in his answer, that's how it has to be so that the initial value is in the innermost sub-expression):
foldl f a [x,y,z] = ((a `f` x) `f` y) `f` z
That explains why the list element is the first argument to the function given to foldr, and the second to the one given to foldl. (One might, of course, give foldl the same signature of foldr and then use flip f instead of f when defining it, but that would achieve nothing but confusion.)
P.S.: Here is a good, simple example of folds with the types a and b different from each other:
foldr (:) [] -- id
foldl (flip (:)) [] -- reverse
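Here is one more sketch where the element type and the result type differ, this time with a = String and b = Int, written once for each fold so that the argument order of the folding function is visible. The names totalLengthR and totalLengthL are hypothetical.

```haskell
-- foldr: the list element (a String) is the first argument,
-- the running result (an Int) is the second.
totalLengthR :: [String] -> Int
totalLengthR = foldr (\s n -> length s + n) 0

-- foldl: the running result comes first, the list element second.
totalLengthL :: [String] -> Int
totalLengthL = foldl (\n s -> n + length s) 0
```

Both give 5 for ["ab", "cde", ""].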
A fold is a type of catamorphism, or a way of "tearing down" a data structure into a scalar. In our case, we "tear down" a list. Now, when working with a catamorphism, we need to have a case for each data constructor. Haskell lists have two data constructors.
[] :: [a]
(:) :: a -> [a] -> [a]
That is, [] is a constructor which takes no arguments and produces a list (the empty list). (:) is a constructor which takes two arguments and makes a list, prepending the first argument onto the second. So we need to have two cases in our fold. foldr is the direct example of a catamorphism.
foldr :: (a -> b -> b) -> b -> [a] -> b
The first function will be called if we encounter the (:) constructor. It will be passed the first element (the first argument to (:)) and the result of the recursive call (calling foldr on the second argument of (:)). The second argument, the "default case" as you call it, is for when we encounter the [] constructor, in which case we simply use the default value itself. So it ends up looking like this
foldr (+) 4 [1, 2, 3]
1 + (2 + (3 + 4))
Now, could we have designed foldl the same way? Sure. foldl isn't (exactly) a catamorphism, but it behaves like one in spirit. In foldr, the default case is the innermost value; it's only used at the "last step" of the recursion, when we've run out of list elements. In foldl, we do the same thing for consistency.
foldl (+) 4 [1, 2, 3]
((4 + 1) + 2) + 3
Let's break that down in more detail. foldl can be thought of as using an accumulator to get the answer efficiently.
foldl (+) 4 [1, 2, 3]
foldl (+) (4 + 1) [2, 3]
foldl (+) ((4 + 1) + 2) [3]
foldl (+) (((4 + 1) + 2) + 3) []
-- Here, we've run out of elements, so we use the "default" value.
((4 + 1) + 2) + 3
So I suppose the short answer to your question is that it's more consistent (and more useful), mathematically speaking, to make sure the base case is always at the innermost position in the recursive call, rather than focusing on it being on the left or the right all the time.
Consider the calls foldl (+) 0 [1,2,3,4] and foldr (+) 0 [1,2,3,4] and try to visualize what they do:
foldl (+) 0 [1,2,3,4] = ((((0 + 1) + 2) + 3) + 4)
foldr (+) 0 [1,2,3,4] = (1 + (2 + (3 + (4 + 0))))
Now, let's try to swap the arguments to the call to (+) in each step:
foldl (flip (+)) 0 [1,2,3,4] = (4 + (3 + (2 + (1 + 0))))
Note that despite the symmetry this is not the same as the previous foldr. We are still accumulating from the left of the list, I've just changed the order of operands.
In this case, because addition is commutative, we get the same result, but if you try to fold over some non-commutative function, e.g. string concatenation, the result is different. Folding over ["foo", "bar", "baz"], you would obtain "foobarbaz" or "bazbarfoo" (while a foldr would result in "foobarbaz" as well because string concatenation is associative).
In other words, the two definitions as they are make the two functions have the same result for commutative and associative binary operations (like common arithmetic addition/multiplication). Swapping the arguments to the accumulating function breaks this symmetry and forces you to use flip to recover the symmetric behavior.
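The string concatenation case can be checked directly. A minimal sketch, with made-up names:

```haskell
wordsList :: [String]
wordsList = ["foo", "bar", "baz"]

leftPlain, leftFlipped, rightPlain :: String
leftPlain   = foldl (++) "" wordsList         -- (("" ++ "foo") ++ "bar") ++ "baz"
leftFlipped = foldl (flip (++)) "" wordsList  -- "baz" ++ ("bar" ++ ("foo" ++ ""))
rightPlain  = foldr (++) "" wordsList         -- "foo" ++ ("bar" ++ ("baz" ++ ""))
```

leftPlain and rightPlain are both "foobarbaz", while leftFlipped is "bazbarfoo".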
The two folds yield different results due to their opposite associativity. The base value always shows up within the innermost parens. List traversal happens the same way for both folds.
right fold with (+) using the prefix notation
foldr (+) 10 [1,2,3]
=> + 1 (+ 2 (+ 3 10))
=> + 1 (+ 2 13)
=> + 1 15
=> 16
foldl (+) 10 [1,2,3]
=> + (+ (+ 10 1) 2) 3
=> + (+ 11 2) 3
=> + 13 3
=> 16
both folds evaluate to the same result because (+) is commutative and associative; commutativity means
+ a b == + b a
let's see what happens when the function is not commutative, e.g. division or exponentiation
foldl (/) 1 [1, 2, 3]
=> / (/ (/ 1 1) 2) 3
=> / (/ 1 2) 3
=> / 0.5 3
=> 0.16666667
foldr (/) 1 [1, 2, 3]
=> / 1 (/ 2 (/ 3 1))
=> / 1 (/ 2 3)
=> / 1 0.666666667
=> 1.5
now, let's evaluate foldr with function flip (/)
let f = flip (/)
foldr f 1 [1, 2, 3]
=> f 1 (f 2 (f 3 1))
=> f 1 (f 2 0.3333333)
=> f 1 0.16666667
=> 0.16666667
similarly, let's evaluate foldl with f
foldl f 1 [1, 2, 3]
=> f (f (f 1 1) 2) 3
=> f (f 1 2) 3
=> f 2 3
=> 1.5
So, in this case, flipping the order of the arguments of the folding function can make left fold return the same value as a right fold and vice versa. But that is not guaranteed. Example:
foldr (^) 1 [1, 2, 3] = 1
foldl (^) 1 [1, 2, 3] = 1
foldr (flip (^)) 1 [1,2,3] = 1
foldl (flip (^)) 1 [1,2,3] = 9 -- this is the odd case
foldl (flip (^)) 1 $ reverse [1,2,3] = 1
-- we again get 1 when we reverse this list
incidentally, reverse is equivalent to
foldl (flip (:)) []
but try defining reverse using foldr
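For reference, here are two possible ways to do it, offered as a sketch rather than the intended answer. The names reverseSlow and reverseFast are hypothetical.

```haskell
-- Simple but quadratic: each element is appended after the reversed tail.
reverseSlow :: [a] -> [a]
reverseSlow = foldr (\x acc -> acc ++ [x]) []

-- Linear: the fold builds a function that threads an accumulator
-- through the list from left to right, then applies it to [].
reverseFast :: [a] -> [a]
reverseFast xs = foldr (\x k acc -> k (x : acc)) id xs []
```

Both return [3,2,1] for [1,2,3]. The second version is essentially foldl expressed through foldr.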
I'm trying to define a function in Haskell using the foldr function:
fromDigits :: [Int] -> Int
This function takes a list of Ints (each on ranging from 0 to 9) and converts to a single Int. For example:
fromDigits [0,1] = 10
fromDigits [4,3,2,1] = 1234
fromDigits [2,3,9] = 932
fromDigits [2,3,9,0,1] = 10932
Anyway, I have no trouble defining this using explicit recursion or even using zipWith:
fromDigits n = sum (zipWith (*) n (map ((^)10) [0..]))
But now I have to define it using a foldr, but I don't know how to get the powers of 10. What I have is:
fromDigits xs = foldr (\x acc -> (x*10^(???)) + acc) 0 xs
How can I get them to decrease? I know I can start with (length xs - 1) but what then?
Best Regards
You were almost there:
your
fromDigits xs = foldr (\x acc -> (x*10^(???)) + acc) 0 xs
is the solution with 2 little changes:
fromDigits = foldr (\x acc -> acc*10 + x) 0
(BTW I left out the xs on both sides; it's not necessary.)
Another option would be
fromDigits = foldr (\x acc -> read $ (show acc) ++ (show x)) 0
The nice thing about foldr is that it's so extremely easy to visualise!
foldr f init [a,b, ... z]
≡ foldr f init $ a : b : ... z : []
≡ a `f` (b `f` ... (z `f` init)...)
≡ f a (f b ( ... (f z init)...))
so as you see, the j-th list element is used in j consecutive calls of f. The head element is merely passed once to the left of the function. For your application, the head element is the last digit. How should that influence the outcome? Well, it's just added to the result, isn't it?
15 = 10 + 5
623987236705839 = 623987236705830 + 9
– obvious. Then the question is, how do you take care for the other digits? Well, to employ the above trick you first need to make sure there's a 0 in the last place of the carried subresult. A 0 that does not come from the supplied digits! How do you add such a zero?
That should really be enough hint given now.
The trick is, you don't need to compute the power of 10 each time from scratch, you just need to compute it based on the previous power of ten (i.e. multiply by 10). Well, assuming you can reverse the input list.
(But the lists you give above are already in reverse order, so arguably you should be able to re-reverse them and just say that your function takes a list of digits in the correct order. If not, then just divide by 10 instead of multiplying by 10.)
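Putting the "multiply the carried result by 10" idea together, the following sketch can be checked against the examples from the question. fromDigitsMSF is a hypothetical companion for digit lists written in the usual reading order.

```haskell
-- Least-significant digit first, as in the question's examples.
-- The accumulator holds the value of the digits to the right in the
-- list; shifting it by one decimal place makes room for the digit.
fromDigits :: [Int] -> Int
fromDigits = foldr (\digit acc -> acc * 10 + digit) 0

-- Most-significant digit first: the same step function, folded from the left.
fromDigitsMSF :: [Int] -> Int
fromDigitsMSF = foldl (\acc digit -> acc * 10 + digit) 0
```

fromDigits [2,3,9,0,1] gives 10932 and fromDigits [0,1] gives 10, matching the question; fromDigitsMSF [1,2,3,4] gives 1234.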
This is my take version using foldr:
myTake n list = foldr step [] list
  where step x y | (length y) < n = x : y
                 | otherwise = y
main = do print $ myTake 2 [1,2,3,4]
The output is not what I expect:
[3,4]
I then tried to debug by inserting the length of y into itself and the result was:
[3,2,1,0]
I don't understand why the lengths are inserted in decreasing order. Perhaps something obvious I missed?
If you want to implement take using foldr you need to simulate traversing the list from left to right. The point is to make the folding function depend on an extra argument which encodes the logic you want and not only depend on the folded tail of the list.
take :: Int -> [a] -> [a]
take n xs = foldr step (const []) xs n
  where
    step x g 0 = []
    step x g n = x:g (n-1)
Here, foldr returns a function which takes a numeric argument and traverses the list from left to right taking from it the amount required. This will also work on infinite lists due to laziness. As soon as the extra argument reaches zero, foldr will short-circuit and return an empty list.
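The same idea as a self-contained sketch that can be loaded next to the Prelude. The name takeR is hypothetical (chosen to avoid clashing with take), and the guard for non-positive counts is an addition, not part of the definition above.

```haskell
-- The fold produces a function from "how many are still wanted" to a list.
takeR :: Int -> [a] -> [a]
takeR n xs = foldr step (const []) xs n
  where
    -- Nothing more is wanted: stop without touching the rest of the list.
    step _ _ k | k <= 0 = []
    -- Otherwise emit x and ask the rest of the fold for one element fewer.
    step x next k = x : next (k - 1)
```

takeR 2 [1,2,3,4] gives [1,2], takeR 3 [1..] gives [1,2,3] on an infinite list, and takeR 5 [1,2] gives [1,2].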
foldr will apply the function step starting from the last element*. That is,
foldr step [] [1,2,3,4] == 1 `step` (2 `step` (3 `step` (4 `step` [])))
                        == 1 `step` (2 `step` (3 `step` (4:[])))
                        == 1 `step` (2 `step` (3:4:[])) -- length y == 2 here
                        == 1 `step` (3:4:[])
                        == 3:4:[]
                        == [3, 4]
The lengths are "inserted" in decreasing order because : is a prepending operation. The longer lengths are added to the beginning of the list.
*: For simplicity, we assume every operation is strict, which is true in OP's step implementation.
The other answers so far are making it much too complicated, because they seem excessively wedded to the notion that foldr works "from right to left." There is a sense in which it does, but Haskell is a lazy language, so a "right to left" computation that uses a lazy fold step will actually be executed from left to right, as the result is consumed.
Study this code:
take :: Int -> [a] -> [a]
take n xs = foldr step [] (tagFrom 1 xs)
  where step (a, i) rest
          | i > n = []
          | otherwise = a:rest
tagFrom :: Enum i => i -> [a] -> [(a, i)]
tagFrom i xs = zip xs [i..]
Does there exist an equation expander for Haskell?
Something like foldr.com: 1+(1+(1+(1+(…))))=∞
I am new to Haskell and I am having trouble understanding why certain equations are preferable to others. I think it would help if I could see the equations expanded.
For example I found foldr vs foldl difficult to understand at first until I saw them expanded.
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr k z xs = go xs
  where
    go [] = z
    go (y:ys) = y `k` go ys
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z0 xs0 = lgo z0 xs0
  where
    lgo z [] = z
    lgo z (x:xs) = lgo (f z x) xs
From the definitions I can see that foldr expands like this:
foldr (+) 0 [1..1000000] -->
1 + (foldr (+) 0 [2..1000000]) -->
1 + (2 + (foldr (+) 0 [3..1000000])) -->
1 + (2 + (3 + (foldr (+) 0 [4..1000000]))) -->
1 + (2 + (3 + (4 + (foldr (+) 0 [5..1000000])))) -->
and foldl expands like this:
foldl (+) 0 [1..1000000] -->
foldl (+) (0 + 1) [2..1000000] -->
foldl (+) ((0 + 1) + 2) [3..1000000] -->
or, from the Haskell Wiki page on foldr, foldl and foldl':
let z1 = 0 + 1
in foldl (+) z1 [2..1000000] -->
let z1 = 0 + 1
    z2 = z1 + 2
in foldl (+) z2 [3..1000000] -->
let z1 = 0 + 1
    z2 = z1 + 2
    z3 = z2 + 3
in foldl (+) z3 [4..1000000] -->
let z1 = 0 + 1
    z2 = z1 + 2
    z3 = z2 + 3
    z4 = z3 + 4
in foldl (+) z4 [5..1000000] -->
However, with larger equations I have trouble understanding why things work the way they do in Haskell. For example, the first sieve function uses 1000 filters while the second sieve function takes only 24 to find the 1001st prime.
primes = sieve [2..]
  where
    sieve (p:xs) = p : sieve [x | x <- xs, rem x p /= 0]
primes = 2: 3: sieve (tail primes) [5,7..]
  where
    sieve (p:ps) xs = h ++ sieve ps [x | x <- t, rem x p /= 0]
                      -- or: filter ((/=0).(`rem`p)) t
      where (h,~(_:t)) = span (< p*p) xs
Haskell Wiki on Primes
I have spent a good while working out and expanding by hand. I have come to understand how it works. However, an automated tool to expand certain expressions would greatly improve my understanding of Haskell.
In addition I think it could also serve to help questions that seek to optimize Haskell code:
Optimizing Haskell Code
Help optimize my haskell code - Calculate the sum of all the primes below two million
Is there a tool to expand Haskell expressions?
David V. Thank you for those links. Repr is definitely worth adding to my tool box. I would like to add some additional libraries that I found useful.
HackageDB : Trace (As of December 12, 2010)
ghc-events library and program: Library and tool for parsing .eventlog files from GHC
hood library: Debugging by observing in place
hpc-strobe library: Hpc-generated strobes for a running Haskell program
hpc-tracer program: Tracer with AJAX interface
The Hood package seems to be what I am looking for. I will post more samples later today.
Hood
main = runO ex9
ex9 = print $ observe "foldl (+) 0 [1..4]" foldl (+) 0 [1..4]
outputs
10
-- foldl (+) 0 [1..4]
{ \ { \ 0 1 -> 1
    , \ 1 2 -> 3
    , \ 3 3 -> 6
    , \ 6 4 -> 10
    } 0 (1 : 2 : 3 : 4 : [])
       -> 10
}
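Hood records the calls at run time. A complementary trick, sketched below, is to fold over a number type whose arithmetic builds the text of the expression instead of evaluating it. The type Sym and the helpers render and binary are hypothetical names for this illustration and do not come from any package.

```haskell
-- A "number" that is really just the text of an expression.
newtype Sym = Sym String

render :: Sym -> String
render (Sym s) = s

-- Render a binary operator application with explicit parentheses.
binary :: String -> Sym -> Sym -> Sym
binary op (Sym a) (Sym b) = Sym ("(" ++ a ++ " " ++ op ++ " " ++ b ++ ")")

instance Num Sym where
  (+) = binary "+"
  (-) = binary "-"
  (*) = binary "*"
  negate (Sym a) = Sym ("negate " ++ a)
  abs (Sym a) = Sym ("abs " ++ a)
  signum (Sym a) = Sym ("signum " ++ a)
  fromInteger = Sym . show -- numeric literals become their own text

-- The literals in the lists are read as Sym values via fromInteger.
expandedFoldr, expandedFoldl :: String
expandedFoldr = render (foldr (+) 0 [1, 2, 3, 4])
expandedFoldl = render (foldl (+) 0 [1, 2, 3, 4])
```

expandedFoldr is "(1 + (2 + (3 + (4 + 0))))" and expandedFoldl is "((((0 + 1) + 2) + 3) + 4)". This only shows the final shape of the expression, not the order in which a lazy runtime reduces it.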
I was unaware of the Hackage library (as I am just getting into Haskell). It reminds me of Perl's CPAN. Thank you for providing those links. This is a great resource.
This is in no way a full reply to your question, but I found a conversation on Haskell-Cafe that has some replies :
http://www.haskell.org/pipermail/haskell-cafe/2010-June/078763.html
That thread links to this package :
http://hackage.haskell.org/package/repr that according to the page "allows you to render overloaded expressions to their textual representation"
The example supplied is :
*Repr> let rd = 1.5 + 2 + (3 + (-4) * (5 - pi / sqrt 6)) :: Repr Double
*Repr> show rd
"fromRational (3 % 2) + 2 + (3 + negate 4 * (5 - pi / sqrt 6))"
This is an answer to an unasked question, think of it as a long comment.
(Please downvote this below 0 only if you think that it does not fit. I'll remove it then.)
As soon as you are a bit more experienced, you might not want to see the way things expand, anymore. You'll want to understand HOW things work, which then supersedes the question WHY it works; you won't gain much just by observing how it expands, anymore.
The way to analyse the code is much simpler than you might think: Just label every parameter/variable either as "evaluated" or "unevaluated" or "to-be-evaluated", depending on the progression of their causal connections.
Two examples:
1.) fibs
The list of all Fibonacci Numbers is
fibs :: (Num a) => [a]
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
The first two elements are already evaluated; so, label the 3rd element (which has value 2) as to-be-evaluated and all remaining as unevaluated. The 3rd element will then be the (+)-combination of the first elements of fibs and tail fibs, which will be the 1st and 2nd element of fibs, which are already labelled as evaluated. This works with the n-th element to-be-evaluated and the (n-2)-nd and (n-1)-st already evaluated elements respectively.
You can visualize this in different ways, i.e.:
  fibs!!(i+0)
+ fibs!!(i+1)
= fibs!!(i+2)
            (fibs)
zipWith(+)  (tail fibs)
        =   (drop 2 fibs)
          1 : 1 : 2 : 3 ...
     (1 :)1 : 2 : 3 : 5 ...
 (1 : 1 :)2 : 3 : 5 : 8 ...
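The relations pictured above can be checked on a finite prefix. A minimal sketch, with the element type fixed to Integer and the made-up names firstTen and recurrenceHolds:

```haskell
fibs :: [Integer]
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

-- Only the demanded prefix of the infinite list is ever evaluated.
firstTen :: [Integer]
firstTen = take 10 fibs

-- fibs!!(i+0) + fibs!!(i+1) = fibs!!(i+2), checked for the first few i.
recurrenceHolds :: Bool
recurrenceHolds = and [fibs !! i + fibs !! (i + 1) == fibs !! (i + 2) | i <- [0 .. 20]]
```

firstTen is [1,1,2,3,5,8,13,21,34,55] and recurrenceHolds is True.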
2.) Your example "sieve (p:ps) xs"
primes = 2: 3: sieve (tail primes) [5,7..]
  where
    sieve (p:ps) xs = h ++ sieve ps [x | x <- t, rem x p /= 0]
                      -- or: filter ((/=0).(`rem`p)) t
      where (h,~(_:t)) = span (< p*p) xs
In "sieve (p:ps) xs",
p is evaluated,
ps is unevaluated, and
xs is an evaluated infinite partially sieved list (not containing p but containing p²), which you can guess by reading the recursion and/or recognizing that the values of h need to be prime.
Sieve should return the list of primes after p, so at least the next prime is to-be-evaluated.
The next prime will be in the list h, which is the list of all (already sieved) numbers k where p < k < p²; h contains only primes because xs contains neither p nor any number divisible by any prime below p.
t contains all numbers of xs above p². t should be evaluated lazily instead of as soon as possible, because there might not even be the need to evaluate all elements in h. (Only the first element of h is to-be-evaluated.)
The rest of the function definition is the recursion, where the next xs is t with all n*p sieved out.
In the case of foldr, an analysis will show that the "go" is only defined to speed up the runtime, not the readability. Here is an alternative definition, that is easier to analyse:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr (.:) e (x:xs) = x .: (foldr (.:) e xs)
foldr (.:) e [] = e
I've described its functionality here (without analysis).
To practise this kind of analysis, you might want to read the sources of some standard library functions, e.g. scanl, scanr and unfoldr in the source of Data.List.