I was wondering whether every function in Haskell should be tail recursive.
Here is the factorial function implemented as a non-tail-recursive function:
fact 0 = 1
fact n = n * fact (n - 1)
Every operator is a function too, so this is equivalent to
fact 0 = 1
fact n = (*) n (fact (n - 1))
But this looks like a tail call to me. I wonder why this code causes stack overflows if every call just creates a new thunk on the heap. Shouldn't I get a heap overflow instead?
In the code
fact 0 = 1
fact n = (*) n (fact (n - 1))
the last call to (*) is a tail call, as you observed. However, its second argument, fact (n - 1), builds a thunk which is immediately demanded by (*). This leads to a non-tail call to fact; recursively, this consumes the stack.
TL;DR: the posted code performs a tail call, but (*) does not.
(Also "the stack" in Haskell is a not so clear notion as in strict languages. Some implementations of Haskell use something more complex than that. You can search for "push/enter vs eval/apply" strategies if you want some gory details.)
I'm concerned about efficiency in Haskell's lazy evaluation.
Consider the following code:
main = print $ x + x
where x = head [1..]
Here, due to laziness, x first holds the expression head [1..] instead of the result 1.
But when I evaluate x + x, will the expression head [1..] be executed twice?
I found the following description on haskell.org
Lazy evaluation, on the other hand, means only evaluating an expression when its results are needed (note the shift from "reduction" to "evaluation"). So when the evaluation engine sees an expression it builds a thunk data structure containing whatever values are needed to evaluate the expression, plus a pointer to the expression itself. When the result is actually needed the evaluation engine calls the expression and then replaces the thunk with the result for future reference.
So does this mean that, in x + x, when the first x is demanded, head [1..] is executed and x is overwritten with 1, and the second x is just a reference to that result?
Did I understand this right?
This is more of a question about particular Haskell implementations than about Haskell itself, since the language makes no particular guarantees about how things are evaluated.
But in GHC (and most other implementations, as far as I'm aware): yes, when thunks are evaluated they are replaced by the result internally, so other references to the same thunk benefit from the work done evaluating it the first time.
The caveat is that there are no real guarantees about which expressions end up implemented as references to the same thunk. The compiler is in general allowed to make whatever transformations to your code it likes so long as the result is the same. Of course, the reason to implement code transformations in a compiler is usually to try to make the code faster, so it's hopefully not likely to rewrite things in such a way as to make it worse, but it can never be perfect.
In practice though, you're usually pretty safe assuming that whenever you give an expression a name (as in where x = head [1..]), then all uses of that name (within the scope of the binding) will be references to a single thunk.
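You can observe this sharing directly with Debug.Trace (a small sketch; trace prints its message only when the thunk is actually forced):
import Debug.Trace (trace)

main :: IO ()
main = print (x + x)
  where
    -- under GHC, "evaluating x" is printed only once, because
    -- both occurrences of x refer to the same thunk
    x = trace "evaluating x" (head [1..])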
At first, x is just a thunk. You can see that as follows:
λ Prelude> let x = head [1..]
λ Prelude> :sprint x
x = _
Here the _ indicates that x has not yet been evaluated. Its mere definition is recorded.
Then, you can understand how x + x is constructed by realizing that x is a pointer to this thunk: both occurrences of x point to the same thunk. Once one is evaluated, the other is too, since it's the very same thunk.
You can see that with ghc-vis:
λ Prelude> :vis
λ Prelude> :view x
λ Prelude> :view x + x
should show you a visualisation (image omitted here) in which the x + x thunk points twice to the same x thunk.
Now, if you evaluate x, by printing it for example:
λ Prelude> print x
you'll obtain a visualisation (image omitted here) showing that the x thunk is no longer a thunk: it's the value 1.
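You can confirm the same thing without ghc-vis by asking :sprint again in that session:
λ Prelude> :sprint x
x = 1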
There are two ways to evaluate an expression:
Lazy (evaluate outermost first).
Strict (evaluate innermost first).
Consider the following function:
select x y z = if x > z then x else y
Now let's call it:
select (2 + 3) (3 + 4) (1 + 2)
How will this be evaluated?
Strict evaluation: Evaluate innermost first.
select (2 + 3) (3 + 4) (1 + 2)
select 5 (3 + 4) (1 + 2)
select 5 7 (1 + 2)
select 5 7 3
if 5 > 3 then 5 else 7
if True then 5 else 7
5
Strict evaluation took 6 reductions. To evaluate select we first had to evaluate its arguments. In strict evaluation the arguments to a function are always fully evaluated. Hence functions are "call by value". Thus there's no extra bookkeeping.
Lazy evaluation: Evaluate outermost first.
select (2 + 3) (3 + 4) (1 + 2)
if (2 + 3) > (1 + 2) then (2 + 3) else (3 + 4)
if 5 > (1 + 2) then 5 else (3 + 4)
if 5 > 3 then 5 else (3 + 4)
if True then 5 else (3 + 4)
5
Lazy evaluation only took 5 reductions. We never used (3 + 4) and hence we never evaluated it. In lazy evaluation we can evaluate a function without evaluating its arguments. The arguments are only evaluated when needed. Hence functions are "call by need".
However "call by need" evaluation strategies need extra bookkeeping - you need to keep a track of whether an expression has been evaluated. In the above expression when we evaluate x = (2 + 3) we don't need to evaluate it again. However we do need to keep a track of whether it was evaluated.
Haskell supports both strict and lazy evaluation, but it uses lazy evaluation by default. To force strict evaluation you have to use functions like seq and deepseq.
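For example, seq a b evaluates a to weak head normal form before returning b. A small illustrative GHCi session:
λ Prelude> let x = undefined :: Int
λ Prelude> fst (1, x)
1
λ Prelude> x `seq` fst (1, x)
*** Exception: Prelude.undefined
Without the seq, x is never needed and therefore never evaluated; with it, x is forced first and the undefined blows up.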
Similarly, you can have lazy evaluation in strict languages like JavaScript. However, you would need to keep track of whether an expression has been evaluated or not. You can look up how thunks are implemented in JavaScript or similar languages.
I'm doing a program to sum all odd numbers up to n:
oddSum' n result | n == 0    = result
                 | otherwise = oddSum' (n-1) ((mod n 2)*n + result)
oddSum n = oddSum' n 0
I'm getting two errors for my inputs (I've put them below). I'm using tail recursion, so why is the stack overflow happening? (Note: I'm using Hugs on Ubuntu.)
oddSum 20000
ERROR - Control stack overflow
oddSum 100000
ERROR - Garbage collection fails to reclaim sufficient space
Expanding oddSum 3 by hand shows how the accumulator grows:
oddSum 3
oddSum' 3 0
oddSum' 2 ((mod 3 2)*3 + 0)
oddSum' 1 ((mod 2 2)*2 + ((mod 3 2)*3 + 0))
You are building a huge thunk in the result variable.
Once you finally evaluate this thunk, all the deferred computations have to be done at once, and the stack overflows: to perform an addition you first have to evaluate its operands, and the operands of the additions inside those operands, and so on.
If, on the other hand, the thunk gets too big before it is evaluated, you get a heap overflow.
Try using
result `seq` ((mod n 2) * n + result)
in the recursion.
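A sketch of the repaired function. Note that the placement of the seq is a choice made here for illustration: forcing the old accumulator before the recursive call keeps each new thunk at most one addition deep.
oddSum' n result
  | n == 0    = result
  | otherwise = result `seq` oddSum' (n-1) ((mod n 2)*n + result)

oddSum n = oddSum' n 0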
Firstly, don't use Hugs; it's unsupported. With an optimising GHC, chances are something like this would be compiled to a tight, efficient loop (even so, your code as written isn't fine).
Nonstrict accumulators always pose the risk of building up huge thunks. One solution would be to make it strict:
{-# LANGUAGE BangPatterns #-}
oddSum' n !acc | n == 0    = acc
               | otherwise = oddSum' (n-1) $ (n`mod`2)*n + acc
Of course, that's hardly idiomatic; explicitly writing tail-recursive functions is cumbersome and somewhat frowned upon in Haskell. Most things of this kind can nicely be done with library functions, like
oddSum n = sum [1, 3 .. n]
...which unfortunately doesn't work reliably in constant space, either. It does work with the strict version of the fold (which sum is merely a specialisation of),
import Data.List
oddSum n = foldl' (+) 0 [1, 3 .. n]
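This version does run in constant space. For instance (assuming the definition above is loaded), the sum of the odd numbers up to 100000 is 50000^2:
λ Prelude> oddSum 100000
2500000000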
I am new to Haskell and just discovering the fun of functional programming, but I have run into trouble right away with an implementation of the Fibonacci function. Please find the code below.
--fibonacci :: Num -> [Num]
fibonacci 1 = [1]
fibonacci 2 = [1,1]
--fibonacci 3 = [2]
--fibonacci n = fibonacci n-1
fibonacci n = fibonacci (n-1) ++ [last(fibonacci (n-1)) + last(fibonacci (n-2))]
Rather awkward, I know. I can't find time to look up and write a better one. Though I wonder what makes this so inefficient. I know I should look it up, just hoping someone would feel the need to be pedagogic and spare me the effort.
orangegoat's answer and Sec Oe's answer contain a link to probably the best place to learn how to properly write the fibonacci sequence in Haskell, but here are some reasons why your code is inefficient (note, your code is not that different from the classic naive definition. Elegant? Sure. Efficient? Goodness, no):
Let's consider what happens when you call
fibonacci 5
That expands into
(fibonacci 4) ++ [(last (fibonacci 4)) + (last (fibonacci 3))]
In addition to concatenating two lists together with ++, we can already see that one place we're being inefficient is that we calculate fibonacci 4 twice (in the two places we called fibonacci (n-1)). But it gets worse.
Everywhere it says fibonacci 4, that expands into
(fibonacci 3) ++ [(last (fibonacci 3)) + (last (fibonacci 2))]
And everywhere it says fibonacci 3, that expands into
(fibonacci 2) ++ [(last (fibonacci 2)) + (last (fibonacci 1))]
Clearly, this naive definition has a lot of repeated computations, and it only gets worse when n gets bigger and bigger (say, 1000). fibonacci is not a list, it just returns lists, so it isn't going to magically memoize the results of the previous computations.
Additionally, by using last, you have to walk through the whole list to get its last element, which adds to the problems with this recursive definition (remember, lists in Haskell don't support constant-time random access; they aren't dynamic arrays, they are linked lists).
One example of a recursive definition (from the links mentioned) that does keep down on the computations is this:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
Here, fibs is actually a list, and we can take advantage of Haskell's lazy evaluation to generate fibs and tail fibs as needed, while the previous computations are still stored inside of fibs. And to get the first five numbers, it's as simple as:
take 5 fibs -- [0,1,1,2,3]
(Optionally, you can replace the first 0 with a 1 if you want the sequence to start at 1).
For all the ways to implement the fibonacci sequence in Haskell, just follow this link:
http://www.haskell.org/haskellwiki/The_Fibonacci_sequence
This implementation is inefficient because it makes three recursive calls. If we were to write a recurrence relation for computing fibonacci n to a normal form (note, pedantic readers: not whnf), it would look like:
T(1) = c
T(2) = c'
T(n) = T(n-1) + T(n-1) + T(n-2) + c''
(Here c, c', and c'' are some constants that we don't know.) Here's a recurrence which is smaller:
S(1) = min(c, c')
S(n) = 2 * S(n-1)
...but this recurrence has a nice easy closed form, namely S(n) = min(c, c') * 2^(n-1): it's exponential! Bad news.
I like the general idea of your implementation (that is, track the second-to-last and last terms of the sequence together), but you fell down by recursively calling fibonacci multiple times, when that's totally unnecessary. Here's a version that fixes that mistake:
fibonacci 1 = [1]
fibonacci 2 = [1,1]
fibonacci n = case fibonacci (n-1) of
    all@(last:secondLast:_) -> (last + secondLast) : all
This version should be significantly faster. As an optimization, it produces the list in reverse order, but the most important improvement here is making only one recursive call, not the way the list is built.
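For example (note the reversed order):
λ Prelude> fibonacci 5
[5,3,2,1,1]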
So even if you wouldn't know about the more efficient ways, how could you improve your solution?
First, looking at the signature it seems you don't want an infinite list, but a list of a given length. That's fine, the infinite stuff might be too crazy for you right now.
The second observation is that you need to access the end of the list quite often in your version, which is bad. So here is a trick which is often useful when working with lists: write a version that works backwards:
fibRev 0 = []
fibRev 1 = [1]
fibRev 2 = [1,1]
fibRev n = let zs@(x:y:_) = fibRev (n-1) in (x+y) : zs
Here is how the last case works: we get the list which is one element shorter and call it zs. At the same time we match it against the pattern (x:y:_) (this use of @ is called an as-pattern). This gives us the first two elements of that list. To calculate the next value of the sequence, we just have to add those elements and put the sum (x+y) in front of the list zs we already have.
Now we have the fibonacci list, but it is backwards. No problem, just use reverse:
fibonacci :: Int -> [Int]
fibonacci n = reverse (fibRev n)
The reverse function isn't that expensive, and we call it only once here.
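For example:
λ Prelude> fibonacci 5
[1,1,2,3,5]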
I have a function
myLength = foldl (\ x _ -> x + 1) 0
which fails with a stack overflow on inputs of around 10^6 elements (myLength [1..1000000] fails). I believe that is due to thunk build-up, since when I replace foldl with foldl', it works.
So far so good.
But now I have another function to reverse a list :
myReverse = foldl (\ acc x -> x : acc) []
which uses the lazy version foldl (instead of foldl')
When I do
myLength . myReverse $ [1..1000000]
This time it works fine. I fail to understand why foldl works for the latter case but not the former.
To clarify here myLength uses foldl' while myReverse uses foldl
Here's my best guess, though I'm no expert on Haskell internals (yet).
While building the thunk, Haskell allocates all the intermediate accumulator variables on the heap.
When performing the addition as in myLength, it needs to use the stack for intermediate variables. See this page. Excerpt:
The problem starts when we finally evaluate z1000000:
Note that z1000000 = z999999 + 1000000. So 1000000 is pushed on the stack. Then z999999 is evaluated.
Note that z999999 = z999998 + 999999. So 999999 is pushed on the stack. Then z999998 is evaluated.
Note that z999998 = z999997 + 999998. So 999998 is pushed on the stack. Then z999997 is evaluated.
However, when performing list construction, here's what I think happens (this is where the guesswork begins):
When evaluating z1000000:
Note that z1000000 = 1000000 : z999999. So 1000000 is stored inside z1000000, along with a link (pointer) to z999999. Then z999999 is evaluated.
Note that z999999 = 999999 : z999998. So 999999 is stored inside z999999, along with a link to z999998. Then z999998 is evaluated.
etc.
Note that z999999, z999998 etc. changing from a not-yet-evaluated expression into a single list item is an everyday Haskell thing :)
Since z1000000, z999999, z999998, etc. are all on the heap, these operations don't use any stack space. QED.
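A small sketch of the difference (my own illustration, not from the linked page): reaching weak head normal form of the foldl-built list only requires producing the outermost (:) cell; everything behind it stays an unevaluated thunk on the heap.
main :: IO ()
main = do
  let ys = foldl (\acc x -> x : acc) [] [1 .. 1000000 :: Int]
  -- the case forces ys only to its outermost constructor; the
  -- tail of the list remains an unevaluated thunk on the heap
  case ys of
    (y:_) -> print y          -- prints 1000000
    []    -> putStrLn "impossible for this input"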
The canonical implementation of length :: [a] -> Int is:
length [] = 0
length (x:xs) = 1 + length xs
which is very beautiful but suffers from stack overflow as it uses linear space.
The tail-recursive version:
length xs = length' xs 0
  where length' []     n = n
        length' (x:xs) n = length' xs (n + 1)
doesn't suffer from this problem, but I don't understand how this can run in constant space in a lazy language.
Isn't the runtime accumulating numerous (n + 1) thunks as it moves through the list? Shouldn't this cause Haskell to consume O(n) space and lead to a stack overflow?
(if it matters, I'm using GHC)
Yes, you've run into a common pitfall with accumulating parameters. The usual cure is to force strict evaluation on the accumulating parameter; for this purpose I like the strict application operator $!. If you don't force strictness, GHC's optimizer might decide it's OK for this function to be strict, but then again it might not. It's definitely not a thing to rely on; sometimes you want an accumulating parameter to be evaluated lazily, and O(N) space is just fine, thank you.
How do I write a constant-space length function in Haskell?
As noted above, use the strict application operator to force evaluation of the accumulating parameter:
clength xs = length' xs 0
  where length' []     n = n
        length' (x:xs) n = length' xs $! (n + 1)
The type of $! is (a -> b) -> a -> b, and it forces the evaluation of the a before applying the function.
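With the strictness in place, the call that overflowed before should now succeed:
> clength [1..1000000]
1000000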
Running your second version in GHCi:
> length [1..1000000]
*** Exception: stack overflow
So to answer your question: Yes, it does suffer from that problem, just as you expect.
However, GHC is smarter than the average compiler; if you compile with optimizations turned on, it'll fix the code for you and make it work in constant space.
More generally, there are ways to force strictness at specific points in Haskell code, preventing the building of deeply nested thunks. A usual example is foldl vs. foldl':
import Data.List (foldl')

len1 = foldl  (\x _ -> x + 1) 0
len2 = foldl' (\x _ -> x + 1) 0
Both functions are left folds that do the "same" thing, except that foldl is lazy while foldl' is strict. The result is that len1 dies with a stack overflow in GHCi, while len2 works correctly.
A tail-recursive function doesn't need to maintain a stack, since the value returned by the function is simply going to be the value returned by the tail call. So instead of creating a new stack frame, the current one gets re-used, with the locals overwritten by the new values passed into the tail call. So every n+1 gets written into the same place where the old n was, and you have constant space usage.
Edit: Actually, as you've written it, you're right; it'll thunk the (n+1)s and cause an overflow. It's easy to test: just try length [1..1000000]. You can fix that by forcing evaluation first: length' xs $! (n + 1), which will then work as I said above.