Haskell, memoization, stack overflow - haskell

I'm working on problem 14 of Project Euler (http://projecteuler.net/problem=14). I'm trying to use memoization so that I save the length of the sequence for a given number as a partial result. I'm using Data.MemoCombinators for that. The program below produces a stack overflow.
import qualified Data.MemoCombinators as Memo
sL n = seqLength n 1
seqLength = Memo.integral seqLength'
where seqLength' n sum = if (n == 1) then sum
else if (odd n) then seqLength (3*n+1) (sum+1)
else seqLength (n `div` 2) (sum+1)
p14 = snd $ maximum $ zip (map sL numbers) numbers
where numbers = [1..max]
max = 999999
The stack overflow should be due to sum+1 being evaluated lazily. How can I force it to be evaluated before each call to seqLength? BTW, is memoization well implemented? I'm more interested in pointing out my Haskell mistakes than in solving the exercise.

The most common ways of forcing evaluation are to use seq, $! or a bang pattern. However sum+1 is not the culprit here. maximum is. Replacing it with the stricter foldl1' max fixes the stack overflow error.
That taken care of, it turns out that your memoization here is no good. Memo.integral only memoizes the first argument, so you're memoizing partial applications of seqLength', which doesn't really do anything useful. You should get much better results without tail recursion so that you're memoizing the actual results. Also, as luqui points out, arrayRange should be more efficient here:
seqLength = Memo.arrayRange (1, 1000000) seqLength'
where seqLength' 1 = 1
seqLength' n | odd n = 1 + seqLength (3*n+1)
| otherwise = 1 + seqLength (n `div` 2)

I'm not familiar with Data.MemoCombinators, so the generic advice is: try seqLength (3*n+1) $! (sum+1) (the same for even n, of course).

Why use MemoCombinators when we can exploit laziness? The trick is to do something like
seqLength x = lengths !! x - 1
where lengths = map g [1..9999999]
g n | odd n = 1 + seqLength (3 * n + 1)
| otherwise = 1 + seqLength (n `div` 2)
which should work in a memoized way. [Adapted from the non-tail-recursive solution by #hammar]
Of course, then seqLength is O(n) for the memoized case so it suffers less performance. However, this is remediable! We simply take advantage of the fact that Data.Vector is streamed and has O(1) random access. The fromList and map will be done at the same time (as the map will simply produce thunks instead of the actual values because we are using a boxed vector). We also fallback on a non-memoized version since we can't possibly memoize every possible value.
import qualified Data.Vector as V
seqLength x | x < 10000000 = lengths V.! x - 1
| odd x = 1 + seqLength (3 * n + 1)
| otherwise = 1 + seqLength (n `div` 2)
where lengths = V.map g $ V.fromList [1..99999999]
g n | odd n = 1 + seqLength (3 * n + 1)
| otherwise = 1 + seqLength (n `div` 2)
Which should be comparable or better to using MemoCombinators. Don't have haskell on this computer, but if you want to figure out which is better, there's a library called Criterion which is excellent for this sort of thing.
I think using Unboxed Vectors could actually give better performance. It would force everything at once when you evaluate one item (I think) but you need that anyway. Hence you could then just run a foldl' max to get a O(n) solution that should have less constant overhead.

If memory serves, for this problem you don't need memoization at all. Just use foldl' and bang patterns:
snd $ foldl' (\a n-> let k=go n 1 in if fst a < ....
where go n !len | n==1 = ....
Compile with -O2 -XBangPatterns . It's always better to run stadalone as running compiled code in ghci can introduce space leaks.

Related

Using non-deterministic list monad to find long Collatz sequences

I wrote the following code to solve Project Euler's No. 14:
The following iterative (Collatz) sequence is defined for the set of positive integers:
n → n/2 (n is even)
n → 3n + 1 (n is odd)
Q: Which starting number, under one million, produces the longest chain?
And my code:
collatz :: Integer -> [Integer]
collatz 1 = [1]
collatz n =
filter (< 1000000) prev >>= poss
where prev = collatz (n - 1)
poss :: Integer -> [Integer]
poss prev
| even prev && prev `mod` 3 == 1 && (prev - 1) `div` 3 > 1 = [2 * prev, (prev - 1) `div` 3]
| otherwise = [2 * prev]
Where collatz n returns a list of numbers that will generate a Collatz chain of length n. The problem is, I can only either not restrict the result or restrict the whole chain, instead of only the seed number, to be under 1000,000. Is it possible to use this model to solve the problem at all?
I think that this approach - while interesting - is fundamentally doomed. Suppose I discover that all the seeds which result in a chain of length 500 are above 2,000,000. How can I know that I won't find that in three more steps there's a seed under 1,000,000 that gets me there? I see no way to know when you're done.
The only viable approach I see to this problem is to compute the collatz length for every number from 1 to 999,999 and then do something like:
main :: IO ()
main = do
let collatzMax = maximumBy (compare `on` collatzLength) [1..999999]
print collatzMax
On the other hand, this provides a great opportunity to learn about CAFs since the function collatzLength could be naively defined as:
collatzLength 1 = 1
collatzLength n | n `mod` 2 == 0 = 1 + collatzLength (n `div` 2)
collatzLength n = 1 + collatzLength (3 * n + 1)
And that kind of recursion screams out for a CAF.
Sure, there are memoization modules that will go and build the CAF for you, but building one yourself is a useful exercise. It's a whole little mini-course in lazy infinitely-recursive data structures.
If that defeats you, you can glance at this spoiler of how to use a CAF and then rewrite it using a different data structure. (what about a 10-way tree instead of a binary tree? What about traversing the tree in a different order? Can you remove the call to showIntAtBase?)
Your idea is interesting, although not the most efficient one. It could be worth trying, although it'll be probably memory intensive. Some thoughts:
As some chains can go over 1000000, so you can't just filter out everything less in collatz. You need to keep all the numbers in each pass.
Calling collatz this way is inefficient, as it computes the sets all over again. Making it an infinite list that shares values would be more efficient:
collatz :: [[Integer]]
collatz = [1] : map (>>= poss) collatz
You need to figure out when you're done. For this you'd need to go through the number lists generated by collatz and count how many of them are below 1000000. When you have seen all the numbers below the limit, the last list will contain the numbers with the longest chain.
That said, I'm afraid this approach isn't computationally feasible. In particular, you'll generate exponentially many numbers and exponentially large ones. For example, if the longest chain would be 500, the result of collatz in that step would contain numbers up to 2^500. And as mentioned, there is no way to tell which of these huge numbers might be the one leading to the solution, so you can't just discard them.

ruby while loop translated to haskell

I've just started learning a bit of Haskell and functional programming, but I find it very difficult getting a hang of it :)
I am trying to translate a small piece of ruby code to Haskell (because I like the concept functional programming and Haskell proposes and even more because I come from a mathematics field and Haskell seems very mathematical):
class Integer
def factorial
f = 1; for i in 1..self; f *= i; end; f
end
end
boundary = 1000
m = 0
# Brown Numbers - pair of integers (m,n) where n factorial is equal with square root of m
while m <= boundary
n = 0
while n <= boundary
puts "(#{m},#{n})" if ((n.factorial + 1) == (m ** 2))
n += 1
end
m += 1
end
I could only figure out how to do factorials:
let factorial n = product [1..n]
I cannot figure out how to do the while loops or equivalent in Haskell, even though I found some examples that were far to confusing for me.
The idea is that the loops start from 0 (or 1) and continue (with an increment of 1) until it reaches a boundary (in my code is 1000). The reason there is a boundary is because I was thinking of starting parallel tasks that do the same operation but on different intervals so the results that I expect are returned faster (one operation would be done on 1 to 10000, another on 10000 to 100000, etc.).
I would really appreciate it if anyone could help out with this :)
Try this:
let results = [(x,y) | x <- [1..1000], y <- [1..1000] ,1 + fac x == y*y]
where fac n = product [1..n]
This is a list comprehension. More on that here.
To map it to your Ruby code,
The nested loops in m and n are replaced with x and y. Basically there is iteration over the values of x and y in the specified ranges (1 to 1000 inclusive in this case).
The check at the end is your filter condition for getting Brown numbers.
where allows us to create a helper function to calculate the factorial.
Note that instead of a separate function, we could have computed the factorial in place, like so:
(1 + product[1..x]) == y * y
Ultimately, the (x,y) on the left side means that it returns a list of tuples (x,y) which are your Brown numbers.
OK, this should work in your .hs file:
results :: [(Integer, Integer)] --Use instead of `Int` to fix overflow issue
results = [(x,y) | x <- [1..1000], y <- [1..1000] , fac x == y*y]
where fac n = product [1..n]
To add to shree.pat18's answer, maybe an exercise you could try is to translate the Haskell solution back into Ruby. It should be possible, because Ruby has ranges, Enumerator::Lazy and Enumerable#flat_map. The following rewritten Haskell solution should perhaps help:
import Data.List (concatMap)
results :: [(Integer, Integer)]
results = concatMap (\x -> concatMap (\y -> test x y) [1..1000]) [1..1000]
where test x y = if fac x == y*y then [(x,y)] else []
fac n = product [1..n]
Note that Haskell concatMap is more or less the same as Ruby Enumerable#flat_map.

Haskell Mini Function Implementation

I have been trying to take this function and make small implementation using iterate and takeWhile. It doesn't have to be using those functions, really I'm just trying to turn it into one line. I can see the pattern in it, but I can't seem to exploit it without basically making the same code, just with iterate instead of recursion.
fun2 :: Integer -> Integer
fun2 1 = 0
fun2 n
| even n = n + fun2 (n `div` 2)
| otherwise = fun2 (3 * n + 1)
Any help would be great. I've been struggling with this for hours.
Thanks
If you want to do this with iterate, the key is to chop it up into smaller logical parts:
generate a sequence using the rule
ak+1 = ak/2 if ak even
ak+1 = 3ak+1 if ak odd
stop the sequence at aj = 1 (which all do, if the collatz conjecture is true).
filter out the even elements on the path
sum them
So then this becomes:
f = sum . filter even . takeWhile (>1) . iterate (\n -> if even n then n `div` 2 else 3*n + 1)
However, I do think this would be clearer with a helper function
f = sum . filter even . takeWhile (>1) . iterate collatz
where collatz n | even n = n `div` 2
| otherwise = 3*n + 1
This may save you no lines, but transforms your recursion into the generation of data.
Firstly, I agree with tom's comment that there is nothing wrong with your four line version. It's perfectly readable. However, it's occasionally a fun exercise to turn Haskell functions into one liners. Who knows, you might learn something!
At the moment you have
fun 1 = 0
fun n | even n = n + fun (n `div` 2)
| otherwise = fun (3 * n + 1)
You can always convert an expression using guards into an if
fun 1 = 0
fun n = if even n then n + fun (n `div` 2) else fun (3 * n + 1)
You can always convert a series of pattern matches into a case expression:
fun n = case n of
1 -> 0
_ -> if even n then n + fun (n `div` 2) else fun (3 * n + 1)
And finally, you can convert a case expression into a chain of ifs (actually, in general this would require an Eq instance for the argument of your function, but since you're using Integers it doesn't matter).
fun n = if n == 1 then 0 else if even n then n + fun (n `div` 2) else fun (3 * n + 1)
I think you'll agree that this is far less readable than what you started out with.
One liner ;)
fun2 n = if n==1 then 0 else if even n then n + fun2 (n `div` 2) else fun2 (3 * n + 1)
My sense is that short of look-up tables, this function cannot be implemented without recursion because the argument passed in the recursion seems unpredictable (except for n as powers of 2).
On the other hand, rampion helped me learn something new.

Haskell - Garbage collection fails to reclaim sufficient space

I'm doing a program to sum all odd numbers up to n:
oddSum' n result | n==0 = result
| otherwise = oddSum' (n-1) ((mod n 2)*(n)+result)
oddSum n = oddSum' n 0
I'm getting a two erros for for my inputs (I've put them below), I'm using tail recursion so why is the stack overflow happening? (note: I'm using Hugs on Ubuntu)
oddSum 20000
ERROR - Control stack overflow
oddSum 100000
ERROR - Garbage collection fails to reclaim sufficient space
oddSum 3
oddSum 2 ((2 mod 2)*2 + 3)
oddSum 1 ((1 mod 2)*1 + ((2 mod 2)*2 + 3))
You are building a huge thunk in the result variable.
Once you evaluate this, all the computations have to be done at once, and then the stack overflows, because, to perform addition, for example, you first have to evaluate the operands, and the operands of additions in the operands.
If, otoh, the thunk gets too big, you get a heap overflow.
Try using
result `seq` ((mod n 2) * n + result)
in the recursion.
Firstly, don't use Hugs, it's unsupported. With optimising GHC chances are something like this would be compiled to a tight efficient loop (still your code wouldn't be fine).
Nonstrict accumulators always pose the risk of building up huge thunks. One solution would be to make it strict:
{-# LANGUAGE BangPatterns #-}
oddSum' n !acc | n==0 = acc
| otherwise = oddSum' (n-1) $ (n`mod`2)*n + acc
Of course, that's hardly idiomatic; explicitly writing tail-recursive functions is cumbersome and somewhat frowned upon in Haskell. Most things of this kind can nicely be done with library functions, like
oddSum n = sum [1, 3 .. n]
...which unfortunately doesn't work reliably in constant space, either. It does work with the strict version of the fold (which sum is merely a specialisation of),
import Data.List
oddSum n = foldl' (+) 0 [1, 3 .. n]

Stack space overflow error while running recursive compiled code. Fine tune algorithm, or give more resources?

I am working on project Euler #14, and have a solution to get the answer, but am getting a stack space overflow error when I try to run the code. The algorithm works OK in the interactive GHCI (on low numbers), but wont work when I throw a really big number at it and try to compile it.
Here is a rough idea of what it does in the interactive GHCI. It takes about 10 seconds to calculate "answer 50000" on my computer.
After letting GHCI run the problem for a few minutes, it spits out the correct answer.
*Euler System.IO> answer 1000000
(525,837799)
But that doesn't solve the stack overflow error when compiling the program to run natively.
*Euler System.IO> answer 10
(20,9)
*Euler System.IO> answer 100
(119,97)
*Euler System.IO> answer 1000
(179,871)
*Euler System.IO> answer 10000
(262,6171)
*Euler System.IO> answer 50000
(324,35655)
What should I do to get the answer to for "answer 1000000"? I imagine my algorithm needs to be fine tuned a bit, but I have no idea how to go about doing that.
Code:
module Main
where
import System.IO
import Control.Monad
main = print (answer 1000000)
-- Count the length of the sequences
-- count' creates a tuple with the second value
-- being the starting number of the game
-- and the first value being the total
-- length of the chain
count' n = (cSeq n, n)
cSeq n = length $ game n
-- Find the maximum chain value of the game
answer n = maximum $ map count' [1..n]
-- Working game.
-- game 13 = [13,40,20,10,5,16,8,4,2,1]
game n = n : play n
play x
| x <= 0 = [] -- is negative or 0
| x == 1 = [] -- is 1
| even x = doEven x : play ((doEven x)) -- even
| otherwise = doOdd x : play ((doOdd x)) -- odd
where doOdd x = (3 * x) + 1
doEven x = (x `div` 2)
The problem here is that maximum is too lazy. Instead of keeping track of the largest element as it goes along, it builds up a huge tree of max thunks. This is because maximum is defined in terms of foldl, so the evaluation goes as follows:
maximum [1, 2, 3, 4, 5]
foldl max 1 [2, 3, 4, 5]
foldl max (max 1 2) [3, 4, 5]
foldl max (max (max 1 2) 3) [4, 5]
foldl max (max (max (max 1 2) 3) 4) [5]
foldl max (max (max (max (max 1 2) 3) 4) 5) []
max (max (max (max 1 2) 3) 4) 5 -- this expression will be huge for large lists
Trying to evaluate too many of these nested max calls causes a stack overflow.
The solution is to force it to evaluate these as it goes along by using the strict version foldl', (or, in this case, its cousin foldl1'). This prevents the max's from building up by reducing them at each step:
foldl1' max [1, 2, 3, 4, 5]
foldl' max 1 [2, 3, 4, 5]
foldl' max 2 [3, 4, 5]
foldl' max 3 [4, 5]
foldl' max 4 [5]
foldl' max 5 []
5
GHC can often solve these kinds of problems on its own if you compile with -O2 which (among other things) runs a strictness analysis of your program. However, I think it's good practice to write programs that don't need to rely on optimizations to work.
Note: After fixing this, the resulting program is still very slow. You might want to look into using memoization for this problem.
#hammar already pointed out the problem that maximum is too lazy, and how to resolve that (using foldl1', the strict version of foldl1).
But there are further inefficiencies in the code.
cSeq n = length $ game n
cSeq lets game construct a list, only to calculate its length. Unfortunately, length is not a "good consumer", so the construction of the intermediate list is not fused away. That's quite a bit of unnecessary allocation and costs time. Eliminating these lists
cSeq n = coll (1 :: Int) n
where
coll acc 1 = acc
coll acc m
| even m = coll (acc + 1) (m `div` 2)
| otherwise = coll (acc + 1) (3*m+1)
cuts down the allocation by something like 65% and the running time by about 20% (still slow). Next point, you're using div, which performs a sign check in addition to the normal division. Since all numbers involved are positive, using quot instead does speed it up a bit more (not much here, but it will become important later).
The next big point is that, since you haven't given type signatures, the type of the numbers (except where it was determined by the use of length or by the expression type signature (1 :: Int) in my rewrite) is Integer. The operations on Integer are considerably slower than the corresponding operations on Int, so if possible, you should use Int (or Word) rather than Integer when speed matters. If you have a 64-bit GHC, Int is sufficient for these computations, that reduces the running time by about half when using div, by about 70% when using quot, when using the native code generator, and when using the LLVM backend, the running time is reduced by about 70% when using div and by about 95% when using quot.
The difference between the native code generator and the LLVM backend is mostly due to some elementary low-level optimisations.
even and odd are defined
even, odd :: (Integral a) => a -> Bool
even n = n `rem` 2 == 0
odd = not . even
in GHC.Real. When the type is Int, LLVM knows to replace the division by 2 used to determine the modulus with a bitwise and (n .&. 1 == 0). The native code generator does not (yet) do many of these low-level optimisations. If you do that by hand, the code produced by the NCG and the LLVM backend performs nearly identically.
When using div, both, the NCG and LLVM, are not able to replace the division with a short shift-and-add sequence, so you get the relatively slow machine division instruction with the sign-test. With quot, both are able to do that for Int, so you get much faster code.
The knowledge that all occurring numbers are positive allows us to replace the division by 2 with a simple right shift, without any code to correct for negative arguments, that speeds up the code produced by the LLVM backend by another ~33%, oddly it doesn't make a difference for the NCG.
So from the original that took eight second plus/minus a bit (a little less with the NCG, a little more with the LLVM backend), we've gone to
module Main (main)
where
import Data.List
import Data.Bits
main = print (answer (1000000 :: Int))
-- Count the length of the sequences
-- count' creates a tuple with the second value
-- being the starting number of the game
-- and the first value being the total
-- length of the chain
count' n = (cSeq n, n)
cSeq n = go (1 :: Int) n
where
go !acc 1 = acc
go acc m
| even' m = go (acc+1) (m `shiftR` 1)
| otherwise = go (acc+1) (3*m+1)
even' :: Int -> Bool
even' m = m .&. 1 == 0
-- Find the maximum chain value of the game
answer n = foldl1' max $ map count' [1..n]
which takes 0.37 seconds with the NCG, and 0.27 seconds with the LLVM backend on my setup.
A minute improvement in running time, but a huge reduction of allocation can be obtained by replacing the foldl1' max with a manual recursion,
answer n = go 1 1 2
where
go ml mi i
| n < i = (ml,mi)
| l > ml = go l i (i+1)
| otherwise = go ml mi (i+1)
where
l = cSeq i
that makes it 0.35 resp. 0.25 seconds (and produces a tiny 52,936 bytes allocated in the heap).
Now if that is still too slow, you can worry about a good memoisation strategy. The best I know(1) is to use an unboxed array to store the chain lengths for the numbers not exceeding the limit,
{-# LANGUAGE BangPatterns #-}
module Main (main) where
import System.Environment (getArgs)
import Data.Array.ST
import Data.Array.Base
import Control.Monad.ST
import Data.Bits
main :: IO ()
main = do
args <- getArgs
let bd = case args of
a:_ -> read a
_ -> 100000
print $ mxColl bd
mxColl :: Int -> (Int,Int)
mxColl bd = runST $ do
arr <- newArray (0,bd) 0
unsafeWrite arr 1 1
goColl arr bd 1 1 2
goColl :: STUArray s Int Int -> Int -> Int -> Int -> Int -> ST s (Int,Int)
goColl arr bd ms ml i
| bd < i = return (ms,ml)
| otherwise = do
nln <- collatzLength arr bd i
if ml < nln
then goColl arr bd i nln (i+1)
else goColl arr bd ms ml (i+1)
collatzLength :: STUArray s Int Int -> Int -> Int -> ST s Int
collatzLength arr bd n = go 1 n
where
go !l 1 = return l
go l m
| bd < m = go (l+1) $ case m .&. 1 of
0 -> m `shiftR` 1
_ -> 3*m+1
| otherwise = do
l' <- unsafeRead arr m
case l' of
0 -> do
l'' <- go 1 $ case m .&. 1 of
0 -> m `shiftR` 1
_ -> 3*m+1
unsafeWrite arr m (l''+1)
return (l + l'')
_ -> return (l+l'-1)
which does the job for a limit of 1000000 in 0.04 seconds when compiled with the NCG, 0.05 with the LLVM backend (apparently, that is not as good at optimising STUArray code as the NCG is).
If you don't have a 64-bit GHC, you can't simply use Int, since that would overflow then for some inputs.
But the overwhelming part of the computation is still performed in Int range, so you should use that where possible and only move to Integer where required.
switch :: Int
switch = (maxBound - 1) `quot` 3
back :: Integer
back = 2 * fromIntegral (maxBound :: Int)
cSeq :: Int -> Int
cSeq n = goInt 1 n
where
goInt acc 1 = acc
goInt acc m
| m .&. 1 == 0 = goInt (acc+1) (m `shiftR` 1)
| m > switch = goInteger (acc+1) (3*toInteger m + 1)
| otherwise = goInt (acc+1) (3*m+1)
goInteger acc m
| fromInteger m .&. (1 :: Int) == 1 = goInteger (acc+1) (3*m+1)
| m > back = goInteger (acc+1) (m `quot` 2) -- yup, quot is faster than shift for Integer here
| otherwise = goInt (acc + 1) (fromInteger $ m `quot` 2)
makes it harder to optimise the loop(s), so it is slower than the single loop using Int, but still decent. Here (where the Integer loop is never run), it takes 0.42 seconds with the NCG and 0.37 with the LLVM backend (which is pretty much the same as using quot in the pure Int version).
Using a similar trick for the memoised version has similar consequences, it's considerably slower than the pure Int version, but still blazingly fast compared to unmemoised versions.
(1) For this special (type of) problem, where you need to memoise the results for a contiguous range of arguments. For other problems, a Map or some other data structure will be the better choice.
It seems that the maximum function is the culprit as already pointed out, but you shouldn't have to worry about it if you compile your program with the -O2 flag.
The program is still quite slow, this is because the problem is supposed to teach you about memoization. One good way of doing this is haskell is by using Data.Memocombinators:
import Data.MemoCombinators
import Control.Arrow
import Data.List
import Data.Ord
import System.Environment
play m = maximumBy (comparing snd) . map (second threeNPuzzle) $ zip [1..] [1..m]
where
threeNPuzzle = arrayRange (1,m) memoized
memoized n
| n == 1 = 1
| odd n = 1 + threeNPuzzle (3*n + 1)
| even n = 1 + threeNPuzzle (n `div` 2)
main = getArgs >>= print . play . read . head
The above program runs in under a second when compiled with -O2 on my machine.
Note that in this case it is not a good idea to memoize all values found by threeNPuzzle, the program above memoizes the ones up until the limit (1000000 in the problem).

Resources