Haskell - Garbage collection fails to reclaim sufficient space

I'm writing a program to sum all odd numbers up to n:
oddSum' n result | n==0 = result
                 | otherwise = oddSum' (n-1) ((mod n 2)*(n)+result)
oddSum n = oddSum' n 0
I'm getting two errors for my inputs (shown below). I'm using tail recursion, so why is the stack overflow happening? (Note: I'm using Hugs on Ubuntu.)
oddSum 20000
ERROR - Control stack overflow
oddSum 100000
ERROR - Garbage collection fails to reclaim sufficient space

oddSum 3
oddSum' 3 0
oddSum' 2 ((3 `mod` 2)*3 + 0)
oddSum' 1 ((2 `mod` 2)*2 + ((3 `mod` 2)*3 + 0))
You are building a huge thunk in the result variable.
Once you evaluate it, all the computations have to be done at once, and the stack overflows: to perform an addition, for example, you first have to evaluate its operands, then the operands of the additions inside those operands, and so on.
If, on the other hand, the thunk gets too big before it is ever evaluated, you get a heap overflow.
Try forcing the accumulator before each recursive call, i.e. use
    | otherwise = result `seq` oddSum' (n-1) ((mod n 2) * n + result)
in the recursion.
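A minimal sketch of the strict-accumulator idea from this answer, for readers who want a complete definition to try. The names oddSumStrict and go are illustrative and not from the original post, and Integer is an assumed choice of type; the point is only that seq forces the running total before each recursive call, so no chain of suspended additions builds up.

```haskell
-- Sum of all odd numbers in [1 .. n], forcing the accumulator at each step.
-- oddSumStrict and go are hypothetical names used for illustration.
oddSumStrict :: Integer -> Integer
oddSumStrict n = go n 0
  where
    go :: Integer -> Integer -> Integer
    go k acc
      | k == 0    = acc
      -- `seq` evaluates acc before the recursive call is made,
      -- so the new accumulator only ever refers to an evaluated value.
      | otherwise = acc `seq` go (k - 1) (mod k 2 * k + acc)
```

Like the original function, this assumes a non-negative argument; a negative n would never reach the base case.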

Firstly, don't use Hugs; it's unsupported. With optimisations turned on, GHC would probably compile something like this to a tight, efficient loop (your code still wouldn't be fine, though, since it would depend on the optimiser).
Nonstrict accumulators always pose the risk of building up huge thunks. One solution would be to make it strict:
{-# LANGUAGE BangPatterns #-}
oddSum' n !acc | n==0 = acc
               | otherwise = oddSum' (n-1) $ (n`mod`2)*n + acc
Of course, that's hardly idiomatic; explicitly writing tail-recursive functions is cumbersome and somewhat frowned upon in Haskell. Most things of this kind can nicely be done with library functions, like
oddSum n = sum [1, 3 .. n]
...which unfortunately doesn't work reliably in constant space, either. It does work with the strict version of the fold (which sum is merely a specialisation of),
import Data.List
oddSum n = foldl' (+) 0 [1, 3 .. n]

Related

Does every Haskell function do tail calls?

I was wondering whether every function in Haskell is effectively tail recursive.
Take the factorial function, implemented as a non-tail-recursive function:
fact 0 = 1
fact n = n * fact (n - 1)
Every operator is a function too, so this is equivalent to
fact 0 = 1
fact n = (*) n (fact (n - 1))
But this is clearly a tail call to me. I wonder why this code causes stack overflows if every call just creates a new thunk on the heap. Shouldn't I get a heap overflow instead?
In the code
fact 0 = 1
fact n = (*) n (fact (n - 1))
the last (*) ... is a tail call, as you observed. The last argument fact (n-1) however will build a thunk which is immediately demanded by (*). This leads to a non-tail call to fact. Recursively, this will consume the stack.
TL;DR: the posted code performs a tail call, but (*) does not.
(Also, "the stack" in Haskell is not as clear a notion as it is in strict languages. Some implementations of Haskell use something more complex than that. You can search for "push/enter vs eval/apply" strategies if you want some gory details.)
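For contrast, here is a hedged sketch of a factorial whose recursive call really is the last thing that happens, with the pending multiplication moved into a strict accumulator. The names factAcc and go are illustrative, not from the question, and the bang patterns are an assumed way of forcing the accumulator (seq would do as well).

```haskell
{-# LANGUAGE BangPatterns #-}

-- Accumulator-passing factorial; factAcc and go are hypothetical names.
-- Assumes a non-negative argument.
factAcc :: Integer -> Integer
factAcc n = go n 1
  where
    go :: Integer -> Integer -> Integer
    go 0 !acc = acc
    -- The product is passed along as an argument, and the bang pattern
    -- forces it on entry, so nothing is left waiting on the stack.
    go k !acc = go (k - 1) (k * acc)
```

Here no call to (*) is left waiting for the result of a recursive call, which is the difference from fact n = (*) n (fact (n - 1)).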

Optimizing longest Collatz chain in Haskell

I've been doing project Euler problems to learn Haskell.
I've hit some bumps along the way but managed to get to problem 14.
The question is: which starting number under 1 000 000 produces the longest Collatz chain? (Numbers are allowed to go above one million after the chain starts.)
I've tried a couple of solutions but none of them worked.
I wanted to work in reverse, starting from 1 and terminating when the number gets above one million, but that obviously doesn't work since the terms can go higher than one million.
I've tried memoizing the normal algorithm but, again, the numbers get too large and there is too much to memoize.
I've read that the most obvious solution should work for this, but for some reason my solution takes over 10 seconds to get the maximum up to 20 000, let alone 1 million.
This is the code I'm using at the moment:
reg_collatz 1 = 1
reg_collatz n
  | even n = 1 + reg_collatz (n `div` 2)
  | otherwise = 1 + reg_collatz (n * 3 + 1)
solution = foldl1 (\a n -> max a (reg_collatz n)) [1..20000]
Any help is very welcome.
The answer is simple: don’t memoise numbers above one million, but do that with numbers below.
module Main where
import qualified Data.Map as M
import Data.List
import Data.Ord
main = print $ fst $ maximumBy (comparing snd) $ M.toList ansMap
ansMap :: M.Map Integer Int
ansMap = M.fromAscList [(i, collatz i) | i <- [1..1000000]]
  where collatz 1 = 0
        collatz x = if x' <= 1000000 then 1 + ansMap M.! x'
                                     else 1 + collatz x'
          where x' = if even x then x `div` 2 else x*3 + 1
This is obv waaay late but I thought I'd post anyways for future readers' benefit (I imagine OP is long done with this problem).
TL;DR:
I think we probably want to use the Data.Vector package for this problem (and similar types of problems).
Longer version:
According to the Haskell docs, a Map (from Data.Map) has O(log N) access time whereas a Vector (from Data.Vector) has O(1) access; we can see the difference in the results below: the vector implementation runs ~3x faster. (Both are way better than lists which have O(N) access time.)
A couple of benchmarks are included below. The tests were intentionally not run one after another so as to prevent any cache-based optimization.
A couple of observations:
The largest absolute improvement (from the code in the original post) was due to the addition of type signatures; without being explicitly told that the data was of type Int, Haskell's type system was inferring that the data was of type Integer (which is obv bigger and slower)
A bit counterintuitive but, results are virtually indistinguishable between foldl1' and foldl1. (I double checked the code and ran these a few times just to make sure.)
Vector and Array (and, to a certain extent, Map) allow for decent improvement primarily as a result of memoization. (Note that OP's solution is likely a lot faster than a list-based solution that tried to use memoization given lists' O(N) access time.)
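As a sketch of what the first two observations describe, the original code with explicit Int signatures and a strict fold might look like the following. This is a reconstruction, not the benchmarked source: the names regCollatz and longestChainLength are illustrative, and a 64-bit Int is assumed (on a 32-bit Int the intermediate values would overflow).

```haskell
import Data.List (foldl1')

-- Number of terms in the Collatz chain starting at n, counting n and 1.
-- The explicit Int signature stops the type defaulting to Integer.
regCollatz :: Int -> Int
regCollatz 1 = 1
regCollatz n
  | even n    = 1 + regCollatz (n `div` 2)
  | otherwise = 1 + regCollatz (3 * n + 1)

-- Longest chain length over all starting numbers in [1 .. limit].
-- foldl1' evaluates the running maximum at every step.
longestChainLength :: Int -> Int
longestChainLength limit = foldl1' max (map regCollatz [1 .. limit])
```

Like the original solution, this returns the length of the longest chain rather than the starting number that produces it.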
Here are a couple of benchmarks (all compiled using -O2). The "total" figures in the last column are probably the ones to look at:
Data.Vector                      0.35s user  0.10s system  97% cpu  0.468 total
Data.Array (Haskell.org)         0.31s user  0.21s system  98% cpu  0.530 total
Data.Map (above answer)          1.31s user  0.46s system  99% cpu  1.772 total
Control.Parallel (Haskell.org)   1.75s user  0.05s system  99% cpu  1.799 total
OP (`Int` type sigs + foldl')    3.60s user  0.06s system  99% cpu  3.674 total
OP (`Int` type sigs)             3.53s user  0.16s system  99% cpu  3.709 total
OP (verbatim)                    3.36s user  4.77s system  99% cpu  8.146 total
Source of figures from Haskell.org: https://www.haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_14
The Data.Vector implementation used to generate the above results:
import Data.Vector ( Vector, fromList, maxIndex, (!) )
main :: IO ()
main = putStrLn $ show $ largestCollatz 1000000
largestCollatz :: Int -> Int
largestCollatz n = maxIndex vector
  where
    vector :: Vector Int
    vector = fromList $ 0 : 1 : [collatz x x 0 | x <- [2..n]]
    collatz m i c =
      case i < m of
        True -> c + vector ! i
        False -> let j = if even i then i `div` 2 else 3*i + 1
                 in collatz m j (c+1)

Haskell, memoization, stack overflow

I'm working on problem 14 of Project Euler (http://projecteuler.net/problem=14). I'm trying to use memoization so that I save the length of the sequence for a given number as a partial result. I'm using Data.MemoCombinators for that. The program below produces a stack overflow.
import qualified Data.MemoCombinators as Memo
sL n = seqLength n 1
seqLength = Memo.integral seqLength'
  where seqLength' n sum = if (n == 1) then sum
                           else if (odd n) then seqLength (3*n+1) (sum+1)
                           else seqLength (n `div` 2) (sum+1)
p14 = snd $ maximum $ zip (map sL numbers) numbers
  where numbers = [1..max]
        max = 999999
The stack overflow should be due to sum+1 being evaluated lazily. How can I force it to be evaluated before each call to seqLength? BTW, is memoization well implemented? I'm more interested in pointing out my Haskell mistakes than in solving the exercise.
The most common ways of forcing evaluation are to use seq, $! or a bang pattern. However sum+1 is not the culprit here. maximum is. Replacing it with the stricter foldl1' max fixes the stack overflow error.
That taken care of, it turns out that your memoization here is no good. Memo.integral only memoizes the first argument, so you're memoizing partial applications of seqLength', which doesn't really do anything useful. You should get much better results without tail recursion so that you're memoizing the actual results. Also, as luqui points out, arrayRange should be more efficient here:
seqLength = Memo.arrayRange (1, 1000000) seqLength'
  where seqLength' 1 = 1
        seqLength' n | odd n = 1 + seqLength (3*n+1)
                     | otherwise = 1 + seqLength (n `div` 2)
I'm not familiar with Data.MemoCombinators, so the generic advice is: try seqLength (3*n+1) $! (sum+1) (the same for even n, of course).
Why use MemoCombinators when we can exploit laziness? The trick is to do something like
seqLength x = lengths !! (x - 1)
-- lengths is a top-level value so that it is built once and shared
-- NB: values of 3*n+1 beyond the end of the list are not handled here
lengths = map g [1..9999999]
  where g 1 = 1
        g n | odd n = 1 + seqLength (3 * n + 1)
            | otherwise = 1 + seqLength (n `div` 2)
which should work in a memoized way. [Adapted from the non-tail-recursive solution by hammar]
Of course, seqLength is then O(n) even in the memoized case, because list indexing is linear, so performance suffers. However, this is remediable! We simply take advantage of the fact that Data.Vector is streamed and has O(1) random access. The fromList and map will be done at the same time (the map will simply produce thunks instead of the actual values, because we are using a boxed vector). We also fall back on a non-memoized version for large arguments, since we can't possibly memoize every possible value.
import qualified Data.Vector as V
limit = 10000000
seqLength x | x <= limit = lengths V.! (x - 1)
            | odd x = 1 + seqLength (3 * x + 1)
            | otherwise = 1 + seqLength (x `div` 2)
-- lengths is a top-level value so that it is built once and shared
lengths = V.map g $ V.fromList [1..limit]
  where g 1 = 1
        g n | odd n = 1 + seqLength (3 * n + 1)
            | otherwise = 1 + seqLength (n `div` 2)
This should be comparable to or better than using MemoCombinators. I don't have Haskell on this computer, but if you want to figure out which is better, there's a library called Criterion which is excellent for this sort of thing.
I think using Unboxed Vectors could actually give better performance. It would force everything at once when you evaluate one item (I think) but you need that anyway. Hence you could then just run a foldl' max to get a O(n) solution that should have less constant overhead.
If memory serves, for this problem you don't need memoization at all. Just use foldl' and bang patterns:
snd $ foldl' (\a n-> let k=go n 1 in if fst a < ....
where go n !len | n==1 = ....
Compile with -O2 -XBangPatterns. It's always better to run standalone, as running compiled code in ghci can introduce space leaks.
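The fragment above is elided in the original, so the following is only one possible way to flesh out the same idea (a strict left fold over the starting numbers, with a bang-patterned chain counter), not the answerer's actual code. The names chainLength, longestChain and step are illustrative, and a 64-bit Int is assumed.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Number of terms in the Collatz chain starting at `start`,
-- counted with a strict accumulator instead of memoization.
chainLength :: Int -> Int
chainLength start = go start 1
  where
    go :: Int -> Int -> Int
    go n !len
      | n == 1    = len
      | even n    = go (n `div` 2) (len + 1)
      | otherwise = go (3 * n + 1) (len + 1)

-- (chain length, starting number) with the longest chain for starts
-- in [1 .. limit]; ties keep the smaller starting number.
longestChain :: Int -> (Int, Int)
longestChain limit = foldl' step (1, 1) [2 .. limit]
  where
    -- The comparison forces k, so each new pair holds evaluated values.
    step best@(bestLen, _) n =
      let k = chainLength n
      in if bestLen < k then (k, n) else best
```

Taking snd of the result gives the starting number, as in the fragment above.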

Haskell Space Overflow

I've compiled this program and am trying to run it.
import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo
collatzLength :: Int -> Int
collatzLength = Memo.arrayRange (1, 1000000) collatzLength'
  where
    collatzLength' 1 = 1
    collatzLength' n | odd n = 1 + collatzLength (3 * n + 1)
                     | even n = 1 + collatzLength (n `quot` 2)
main = print $ maximumBy (comparing fst) $ [(collatzLength n, n) | n <- [1..1000000]]
I'm getting the following from GHC
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
I assume this is one of the "space overflow" things I've been hearing about. (I'm pretty new to Haskell.) How do I fix it? Do I have to rewrite collatzLength to be tail recursive?
As the author of the code in question, I am now slightly embarrassed because it has not one but two possible stack overflow bugs.
It uses Int. On a 32-bit system this will overflow, as the Collatz sequence can go quite a bit higher than the starting number. This overflow can cause infinite recursion as the function jumps back and forth between negative and positive values.
In the case of numbers between one and a million, the worst starting point is 704511, which goes as high as 56,991,483,520 before coming back down towards 1. This is well outside the 32-bit range.
It uses maximumBy. This function is not strict, so it will cause a stack overflow when used on long lists. One million elements is more than enough for this to happen with the default stack size. It still works with optimizations enabled, though, due to the strictness analysis performed by GHC.
The solution is to use a strict version. Since none is available in the standard libraries, we can use the strict left fold ourselves.
Here is an updated version which should (hopefully) be stack overflow-free, even without optimizations.
import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo
collatzLength :: Integer -> Integer
collatzLength = Memo.arrayRange (1,1000000) collatzLength'
  where
    collatzLength' 1 = 1
    collatzLength' n | odd n = 1 + collatzLength (3 * n + 1)
                     | even n = 1 + collatzLength (n `quot` 2)
main = print $ foldl1' max $ [(collatzLength n, n) | n <- [1..1000000]]
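To see why the Int overflow matters, here is a small sketch that tracks the largest value a Collatz trajectory visits and checks it against the signed 32-bit range. The names collatzPeak and exceedsInt32 are illustrative and not part of the answer; Integer is used so that the check itself cannot overflow, and the starting number is assumed to be at least 1.

```haskell
import Data.Int (Int32)

-- Largest value visited by the Collatz trajectory starting at `start`.
collatzPeak :: Integer -> Integer
collatzPeak start = go start start
  where
    go n peak
      | n == 1    = peak
      | otherwise =
          let next  = if even n then n `quot` 2 else 3 * n + 1
              peak' = max peak next
          -- Force the running maximum so no chain of max thunks builds up.
          in peak' `seq` go next peak'

-- True when a value cannot be represented as a signed 32-bit integer.
exceedsInt32 :: Integer -> Bool
exceedsInt32 v = v > toInteger (maxBound :: Int32)
```

In GHCi, evaluating exceedsInt32 (collatzPeak 704511) is one way to check the figure quoted above for yourself.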
Here's a shorter program that fails in the same way:
main = print (maximum [0..1000000])
Yep.
$ ghc --make harmless.hs && ./harmless
[1 of 1] Compiling Main ( harmless.hs, harmless.o )
Linking harmless ...
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
With -O2 it works. What do I make of it? I don't know :( These space mysteries are a serious gotcha.
Edit:
Thx to hammar for pointing out the culprit.
Changing your program to use
maximum' = foldl1' max
Makes it work without -O2. The implementation of Prelude's maximum is lazy and so doesn't quite work for long lists without compiler magic dust.
I think it's most likely that you're hitting integer overflow with some of the Collatz sequences, and then ending up in an "artificial" cycle that contains overflows but never hits 1. That would produce an infinite recursion.
Remember that some Collatz sequences get very much larger than their starting number before they finally (?) end up at 1.
Try to see if it fixes your problem to use Integer instead of Int.
Use the optimizer (via the -O2 flag) any time you are concerned about performance. GHC's optimizations are hugely important not just to run time but to stack use. I've tested this with GHC 7.2 and optimization takes care of your issue.
EDIT: In addition, if you're on a 32-bit machine be sure to use Int64 or Word64. Otherwise you'll overflow a 32-bit Int and cause non-termination (thanks to Henning for this; upvote his answer).

How do I write a constant-space length function in Haskell?

The canonical implementation of length :: [a] -> Int is:
length [] = 0
length (x:xs) = 1 + length xs
which is very beautiful but suffers from stack overflow as it uses linear space.
The tail-recursive version:
length xs = length' xs 0
  where length' [] n = n
        length' (x:xs) n = length' xs (n + 1)
doesn't suffer from this problem, but I don't understand how this can run in constant space in a lazy language.
Isn't the runtime accumulating numerous (n + 1) thunks as it moves through the list? Shouldn't this function cause Haskell to consume O(n) space and lead to a stack overflow?
(if it matters, I'm using GHC)
Yes, you've run into a common pitfall with accumulating parameters. The usual cure is to force strict evaluation on the accumulating parameter; for this purpose I like the strict application operator $!. If you don't force strictness, GHC's optimizer might decide it's OK for this function to be strict, but it might not. Definitely it's not a thing to rely on—sometimes you want an accumulating parameter to be evaluated lazily and O(N) space is just fine, thank you.
How do I write a constant-space length function in Haskell?
As noted above, use the strict application operator to force evaluation of the accumulating parameter:
clength xs = length' xs 0
  where length' [] n = n
        length' (x:xs) n = length' xs $! (n + 1)
The type of $! is (a -> b) -> a -> b, and it forces the evaluation of the a before applying the function.
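To make that behaviour concrete, strict application can be written in terms of seq. The sketch below uses the illustrative names strictApply and countElems, which are not from the answer, to avoid clashing with the Prelude's own $! and length; it is meant to show the idea, not GHC's actual definition.

```haskell
-- A hand-written equivalent of ($!): evaluate the argument to weak head
-- normal form, then apply the function to it.
strictApply :: (a -> b) -> a -> b
strictApply f x = x `seq` f x

-- Length with a strictly evaluated counter, using strictApply in place of $!.
countElems :: [a] -> Int
countElems xs = go xs 0
  where
    go []       n = n
    -- Parses as strictApply (go ys) (n + 1): the counter is forced first.
    go (_ : ys) n = go ys `strictApply` (n + 1)
```

Because the counter is an Int, forcing it to weak head normal form evaluates it completely, which is what keeps the space usage constant.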
Running your second version in GHCi:
> length [1..1000000]
*** Exception: stack overflow
So to answer your question: Yes, it does suffer from that problem, just as you expect.
However, GHC is smarter than the average compiler; if you compile with optimizations turned on, it'll fix the code for you and make it work in constant space.
More generally, there are ways to force strictness at specific points in Haskell code, preventing the building of deeply nested thunks. A usual example is foldl vs. foldl':
len1 = foldl (\x _ -> x + 1) 0
len2 = foldl' (\x _ -> x + 1) 0
Both functions are left folds that do the "same" thing, except that foldl is lazy while foldl' is strict. The result is that len1 dies with a stack overflow in GHCi, while len2 works correctly.
A tail-recursive function doesn't need to maintain a stack, since the value returned by the function is simply going to be the value returned by the tail call. So instead of creating a new stack frame, the current one gets re-used, with the locals overwritten by the new values passed into the tail call. So every n+1 gets written into the same place where the old n was, and you have constant space usage.
Edit - Actually, as you've written it, you're right, it'll thunk the (n+1)s and cause an overflow. Easy to test, just try length [1..1000000]. You can fix that by forcing it to evaluate it first: length' xs $! (n+1), which will then work as I said above.

Resources