Haskell: repeat a function a large number of times without stack overflow

As a newbie to Haskell I am trying to iterate a function (e.g., the logistic map) a large number of times. In an imperative language this would be a simple loop; in Haskell, however, I end up with a stack overflow. Take for example this code:
main = print $ iter 1000000
f x = 4.0*x*(1.0-x)
iter :: Int -> Double
iter 0 = 0.3
iter n = f $ iter (n-1)
For a small number of iterations the code works, but for a million iterations I get a stack space overflow:
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
I cannot understand why this happens. The tail recursion should be fine here.
Maybe the problem is lazy evaluation. I experimented with several ways to force strict evaluation, by inserting $! or seq at various positions, but with no success.
What would be the Haskell way to iterate a function a huge number of times?
I have tried suggestions from related posts: here or here, but I always ended up with a stack overflow for a large number of iterations, e.g., main = print $ iterate f 0.3 !! 1000000.

The problem is that your definition
iter :: Int -> Double
iter 0 = 0.3
iter n = f $ iter (n-1)
tries to evaluate in the wrong direction. Unfolding it for a few steps, we obtain
iter n = f (iter (n-1))
       = f (f (iter (n-2)))
       = f (f (f (iter (n-3))))
       ...
and the entire call stack from iter 1000000 to iter 0 has to be built before anything can be evaluated. It would be the same in a strict language. You have to organise it so that part of the evaluation can take place before recurring. The usual way is to have an accumulation parameter, like
iter n = go n 0.3
  where
    go 0 x = x
    go k x = go (k-1) (f x)
Then adding strictness annotations - in case the compiler doesn't already add them - will make it run smoothly without consuming stack.
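For example, a sketch with bang patterns (from the BangPatterns extension; x `seq` go (k-1) (f x) works equally well without it):
{-# LANGUAGE BangPatterns #-}

iter :: Int -> Double
iter n = go n 0.3
  where
    -- Forcing the accumulator at every step keeps it a plain Double
    -- instead of a growing chain of thunks.
    go :: Int -> Double -> Double
    go 0 !x = x
    go k !x = go (k - 1) (f x)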
The iterate variant has the same problem as your iter, except that there the chain of thunks is built inside-out rather than outside-in. Because it is built inside-out, a stricter version of iterate (or a consumption pattern where earlier iterations are forced first) solves the problem:
iterate' :: (a -> a) -> a -> [a]
iterate' f x = x `seq` (x : iterate' f (f x))
calculates iterate' f 0.3 !! 1000000 without problem.
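(As an aside, recent versions of base, 4.11 and later, export a strict Data.List.iterate' along these lines, so on a modern GHC you may be able to use it directly instead of defining your own.)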

Related

Process large lists in Haskell into a single value

I have two lists of Int of equal size (roughly 10,000 elements), say x and y. I need to compute the product of the expression x_i/(x_i+y_i) over each corresponding pair of elements, where x_i and y_i are the i-th elements of x and y respectively.
My approaches work fine on small test cases, but ghci hangs for the larger lists. Any insight as to the cause and solution would be appreciated.
I tried to do this with fold, zipping the lists first:
getP :: [(Int, Int)] -> Double
getP zippedCounts = foldr (\(x,y) acc -> let intX = fromIntegral x
                                             intY = fromIntegral y
                                             intSum = intX + intY
                                         in acc * (intX / intSum))
                          1.0
                          zippedCounts
I also tried recursion:
getP lst [] = 1.0
getP [] lst = 1.0
getP (h1:t1) (h2:t2) = ((fromIntegral h1) / ((fromIntegral h1) + (fromIntegral h2))) * (getP t1 t2)
As well as a list comprehension:
getP lst1 lst2 = (product [((fromIntegral x) / ((fromIntegral x) + (fromIntegral y)))|x <- lst1, y <- lst2])
All three solutions have space leaks; maybe that's what causes the unresponsiveness.
In Haskell, when reducing a big list to a single summary value, it is very easy to inadvertently cause space leaks if we never "look into" the intermediate values of the computation. We can end up with a gigantic tree of unevaluated thunks hiding behind a seemingly inoffensive single Double value.
The foldr example leaks because foldr never forces its accumulator to weak head normal form. Use the strict left fold foldl' instead (you will need to reorder some function arguments). foldl' should ensure that the intermediate Double values remain "small" and thunks don't accumulate.
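A minimal sketch of that fix, keeping the zipped-list shape from the question (the step function's argument order is flipped for foldl'):
import Data.List (foldl')

getP :: [(Int, Int)] -> Double
getP = foldl' step 1.0
  where
    -- foldl' forces the accumulator at each step; for a Double, weak
    -- head normal form is full evaluation, so no thunks build up.
    step acc (x, y) = let intX = fromIntegral x
                          intY = fromIntegral y
                      in acc * (intX / (intX + intY))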
The explicit recursion example is dangerous because it is not tail-recursive and for large lists can cause a stack overflow (we repeatedly put values in the stack, waiting for the next recursive call to complete). A solution would involve making the function tail-recursive by passing the intermediate result as an extra parameter, and putting a bang pattern on that parameter to ensure thunks don't accumulate.
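A sketch of that repair (the bang pattern comes from the BangPatterns extension; the helper go is introduced here for illustration):
{-# LANGUAGE BangPatterns #-}

getP :: [Int] -> [Int] -> Double
getP = go 1.0
  where
    -- Tail-recursive, with the running product forced at every step.
    go !acc (h1:t1) (h2:t2) =
      let d1 = fromIntegral h1
          d2 = fromIntegral h2
      in go (acc * (d1 / (d1 + d2))) t1 t2
    go !acc _ _ = acc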
The product example leaks because, unfortunately, neither the sum nor the product function is strict. For large lists, it's better to use foldl' instead. (There's also a bug, as has been mentioned in the comments.)
You could try zipWith followed by product:
getP :: [Int] -> [Int] -> Double
getP xs ys = product $ zipWith func xs ys
where
func x y = let dx = fromIntegral x
dy = fromIntegral y
in dx / (dx + dy)
I would avoid explicit recursion and stick to library functions for speed. You could also use GHC optimization flags such as -O2 to speed up the compiled code.
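For example, getP [1,2,3] [1,2,3] evaluates to 0.125: each factor x/(x+y) is 0.5, and 0.5^3 = 0.125.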

Can "applying a function for n times" be done using "exponentiating by squaring"?

Given a function of type f :: a -> a, we can produce a function that applies f n times:
nTimes :: Int -> (a -> a) -> (a -> a)
nTimes 0 _ = id
nTimes 1 f = f
nTimes n f = f . nTimes (n-1) f
I can use the exponentiation-by-squaring method here to implement another nTimes function:
nTimes' :: Int -> (a -> a) -> (a -> a)
nTimes' = nTimes'' id
  where
    nTimes'' acc n f
      | n == 0    = acc
      | even n    = nTimes'' acc (n `div` 2) (f . f)
      | otherwise = nTimes'' (acc . f) (n-1) f
My question is:
Do nTimes and nTimes' always produce the same result?
Will nTimes' be faster?
Although they are equivalent, I would be extremely surprised if nTimes' were actually faster or memory-saving in any real situation. The problem is that, unlike with the x * x doubling in ordinary exponentiation by squaring, f . f does not actually share any of the real work done when applying f. In the end it is still going to turn into applying the outermost single f to an argument constructed by all the rest somehow. And nTimes (n-1) f x is about the most compact representation you can have of that remainder until it actually needs to be evaluated, which will require applying its leftmost f to a representation of nTimes (n-2) f x, and so on.
EDIT: Let me add that this could change significantly if you were doing memoization, i.e. replacing f . f by memo (f . f) for some memo-combinator that modifies a function to remember its results. In that case actual work could be shared, and this version of nTimes' might sometimes be an improvement. Other times it could waste an awful lot of memory, though.
It will produce the same result in both cases, because both * and . are associative operators.
However, the "speedup" is not the speedup you might be thinking of. Exponentiation by squaring is good because it decreases the number of times the * operator is applied, from linear to logarithmic number of times. In this case, you are decreasing the number of times the . operator is applied, from linear to logarithmic number of times.
However, like Ørjan Johansen said, unlike *, the . operator doesn't really do much -- it just takes two function values, and outputs a new function value, which essentially wraps the given two functions plus some code.
The resulting function that you get from nTimes', when applied to a value, must still run f n times. Therefore, there is no improvement in actually running the resulting function, only an improvement in the process of constructing the resulting function using ..
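A quick sanity check of the "same result" claim (assuming the definitions above are in scope; the values are illustrative):
main :: IO ()
main = do
  print (nTimes  1000 (+1) (0 :: Int))   -- 1000
  print (nTimes' 1000 (+1) (0 :: Int))   -- 1000: same value, fewer (.) nodes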
Do nTimes and nTimes' always produce the same result?
Yes (unless there are bugs; I didn't check that).
Will nTimes' be faster?
Probably not significantly. f itself is shared in both cases, so there is no recomputation there.
If f is not sufficiently lazy, you are building a list of references to f versus a tree with sharing of references to f, so you may save a bit of memory there.

Why is this tail-recursive Haskell function slower?

I was trying to implement a Haskell function that takes as input an array of integers A and produces another array B = [A[0], A[0]+A[1], A[0]+A[1]+A[2], ...]. I know that scanl from Data.List can be used for this with the function (+). I wrote the second implementation (which performs faster) after seeing the source code of scanl. I want to know why the first implementation is slower than the second one, despite being tail-recursive.
-- This function is slow.
ps s x [] = x
ps s x y = ps s' x' y'
  where
    s' = s + head y
    x' = x ++ [s']
    y' = tail y
-- This function is fast.
ps' s [] = []
ps' s y = [s'] ++ (ps' s' y')
  where
    s' = s + head y
    y' = tail y
Some details about the above code:
Implementation 1: it should be called as
ps 0 [] a
where 'a' is your array.
Implementation 2: it should be called as
ps' 0 a
where 'a' is your array.
You are changing the way that ++ associates. In your first function you are computing ((([a0] ++ [a1]) ++ [a2]) ++ ...) whereas in the second function you are computing [a0] ++ ([a1] ++ ([a2] ++ ...)). Prepending a few elements to the start of a list is O(1), whereas appending a few elements to the end of a list is O(n) in the length of the list. This leads to a linear versus quadratic algorithm overall.
You can fix the first example by building the list up in reverse order, and then reversing again at the end, or by using something like dlist. However the second will still be better for most purposes. While tail calls do exist and can be important in Haskell, if you are familiar with a strict functional language like Scheme or ML your intuition about how and when to use them is completely wrong.
The second example is better, in large part, because it's incremental; it immediately starts returning data that the consumer might be interested in. If you just fixed the first example using the double-reverse or dlist tricks, your function will traverse the entire list before it returns anything at all.
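For illustration, a minimal sketch of the reverse-accumulator fix (the name psRev is made up here):
-- Prepend each partial sum in O(1), then reverse once at the end.
-- Still tail-recursive with no quadratic (++), but note it traverses
-- the whole input before returning anything, as discussed above.
psRev :: Num a => a -> [a] -> [a]
psRev = go []
  where
    go acc _ []     = reverse acc
    go acc s (y:ys) = let s' = s + y in s' `seq` go (s' : acc) s' ys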
I would like to mention that your function can be more easily expressed as
drop 1 . scanl (+) 0
Usually, it is a good idea to use predefined combinators like scanl in favour of writing your own recursion schemes; it improves readability and makes it less likely that you needlessly squander performance.
However, in this case, both my scanl version and your original ps and ps' can sometimes lead to stack overflows due to lazy evaluation: Haskell does not necessarily immediately evaluate the additions (depends on strictness analysis).
One case where you can see this is if you do last (ps' 0 [1..100000000]). That leads to a stack overflow. You can solve that problem by forcing Haskell to evaluate the additions immediately, for instance by defining your own, strict scanl:
myscanl :: (b -> a -> b) -> b -> [a] -> [b]
myscanl f q [] = []
myscanl f q (x:xs) = q `seq` let q' = f q x in q' : myscanl f q' xs
ps' = myscanl (+) 0
Then, calling last (ps' [1..100000000]) works.
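(Recent versions of base also export Data.List.scanl', a strict scanl; drop 1 . scanl' (+) 0 should behave like the hand-written myscanl here.)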

Project Euler 23: insight on this stackoverflow-ing program needed

Hi Haskell fellows. I'm currently working on the 23rd problem of Project Euler. Where I am at the moment is that my code seems right to me - not in the "good algorithm" sense, but in the "should work" sense - yet it produces a stack memory overflow.
I do know that my algorithm isn't perfect (in particular I could certainly avoid computing such a big intermediate result at each recursion step in my worker function).
Though, being in the process of learning Haskell, I'd like to understand why this code fails so miserably, in order to avoid this kind of mistake next time.
Any insight on why this program is wrong will be appreciated.
import qualified Data.List as Set ((\\))
main = print $ sum $ worker abundants [1..28123]
-- Limited list of abundant numbers
abundants :: [Int]
abundants = filter (\x -> (sum (divisors x)) - x > x) [1..28123]
-- Given a positive number, returns its divisors unordered.
divisors :: Int -> [Int]
divisors x | x > 0 = [1..squareRoot x] >>=
                     (\y -> if mod x y == 0
                            then let d = div x y in
                                 if y == d
                                 then [y]
                                 else [y, d]
                            else [])
           | otherwise = []
worker :: [Int] -> [Int] -> [Int]
worker (a:[]) prev = prev Set.\\ [a + a]
worker (a:as) prev = worker as (prev Set.\\ (map ((+) a) (a:as)))
-- http://www.haskell.org/haskellwiki/Generic_number_type#squareRoot
(^!) :: Num a => a -> Int -> a
(^!) x n = x^n

squareRoot :: Int -> Int
squareRoot 0 = 0
squareRoot 1 = 1
squareRoot n =
  let twopows = iterate (^!2) 2
      (lowerRoot, lowerN) =
        last $ takeWhile ((n>=) . snd) $ zip (1:twopows) twopows
      newtonStep x = div (x + div n x) 2
      iters = iterate newtonStep (squareRoot (div n lowerN) * lowerRoot)
      isRoot r = r^!2 <= n && n < (r+1)^!2
  in head $ dropWhile (not . isRoot) iters
Edit: the exact error is Stack space overflow: current size 8388608 bytes. Increasing the stack memory limit through +RTS -K... doesn't solve the problem.
Edit 2: about the sqrt thing, I just copy-pasted it from the link in the comments, to avoid having to convert Integer to Double and deal with rounding problems, etc.
In the future, it's polite to attempt a bit of minimalization on your own. For example, with a bit of playing, I was able to discover that the following program also stack-overflows (with an 8M stack):
main = print (worker [1..1000] [1..1000])
...which really nails down just what function is screwing you over. Let's take a look at worker:
worker (a:[]) prev = prev Set.\\ [a + a]
worker (a:as) prev = worker as (prev Set.\\ (map ((+) a) (a:as)))
Even on my first read, this function was red-flagged in my mind, because it's tail-recursive. Tail recursion in Haskell is generally not such a great idea as it is in other languages; guarded recursion (where you produce at least one constructor before recursing, or recurse some small number of times before producing a constructor) is generally better for lazy evaluation. And in fact, here, what's happening is that each recursive call to worker builds a more and more deeply nested thunk in the prev argument. When the time comes to finally return prev, we have to go very deep into a long chain of Set.\\ calls to work out just what it is we finally have.
This problem is obfuscated slightly by the fact that the obvious strictness annotation doesn't help. Let's massage worker until it works. The first observation is that the first clause is completely subsumed by the second one. This is stylistic; it shouldn't affect the behavior (except on empty lists).
worker [] prev = prev
worker (a:as) prev = worker as (prev Set.\\ map (a+) (a:as))
Now, the obvious strictness annotation:
worker [] prev = prev
worker (a:as) prev = prev `seq` worker as (prev Set.\\ map (a+) (a:as))
I was surprised to discover that this still stack overflows! The sneaky thing is that seq on lists only evaluates far enough to learn whether the list matches either [] or _:_. The following does not stack overflow:
import Control.DeepSeq
worker [] prev = prev
worker (a:as) prev = prev `deepseq` worker as (prev Set.\\ map (a+) (a:as))
I didn't plug this final version back into the original code, but it at least works with the minimized main above. By the way, you might like the following implementation idea, which also stack overflows:
import Control.Monad
worker as bs = bs Set.\\ liftM2 (+) as as
but which can be fixed by using Data.Set instead of Data.List, and no strictness annotations:
import Control.Monad
import Data.Set as Set
worker as bs = toList (fromList bs Set.\\ fromList (liftM2 (+) as as))
As Daniel Wagner correctly said, the problem is that
worker (a:as) prev = worker as (prev Set.\\ (map ((+) a) (a:as)))
builds a badly nested thunk. You can avoid that and get somewhat better performance than with deepseq by exploiting the fact that both arguments to worker are sorted in this application. Thus you can get incremental output by noting that at any step everything in prev smaller than 2*a cannot be the sum of two abundant numbers, so
worker (a:as) prev = small ++ worker as (large Set.\\ map (+ a) (a:as))
  where
    (small, large) = span (< a+a) prev
does better. However, it's still bad because (\\) cannot use the sortedness of the two lists. If you replace it with
minus xxs@(x:xs) yys@(y:ys)
  = case compare x y of
      LT -> x : minus xs yys
      EQ -> minus xs ys
      GT -> minus xxs ys
minus xs _ = xs -- originally forgot the case for one empty list
(or use the data-ordlist package's version), calculating the set-difference is O(length) instead of O(length^2).
Ok, I loaded it up and gave it a shot. Daniel Wagner's advice is pretty good, probably better than mine. The problem is indeed with the worker function, but I was going to suggest using Data.MemoCombinators to memoize your function instead.
Also, your divisors algorithm is kind of silly. There's a much better way to do that. It's kind of mathy and would require a lot of TeX, so here's a link to a math.stackexchange page about how to do it. The one I was talking about is the accepted answer, though someone else gives a recursive solution that I think would run faster. (It doesn't require prime factorization.)
https://math.stackexchange.com/questions/22721/is-there-a-formula-to-calculate-the-sum-of-all-proper-divisors-of-a-number
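For reference, a rough sketch of the formula from that link (for n = p1^a1 * p2^a2 * ..., sigma(n) is the product of (p^(a+1) - 1) / (p - 1)), using naive trial division for factorization; illustrative only, and the names sigma and properDivisorSum are made up here:
sigma :: Int -> Int
sigma = go 2
  where
    go _ 1 = 1
    go p m
      | p * p > m      = m + 1  -- remaining m is prime: contributes 1 + m
      | m `mod` p == 0 = let (k, m') = extract p m
                         in ((p ^ (k + 1) - 1) `div` (p - 1)) * go (p + 1) m'
      | otherwise      = go (p + 1) m

    -- Pull out the full power of p dividing m; return exponent and rest.
    extract p m
      | m `mod` p == 0 = let (k, m') = extract p (m `div` p) in (k + 1, m')
      | otherwise      = (0, m)

-- Sum of proper divisors, matching the abundance test in the question:
-- abundants = filter (\x -> properDivisorSum x > x) [1..28123]
properDivisorSum :: Int -> Int
properDivisorSum n = sigma n - n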

If I have an expression which is partially evaluable, is it a good idea to avoid tail recursion?

Consider a Haskell expression like the following (a trivial example, don't tell me what the obvious way is! ;)):
toBits :: Integral a => a -> [Bool]
toBits 0 = []
toBits n = x : toBits m where
  (m,y) = n `divMod` 2
  x = y /= 0
Because this function is not tail-recursive, one could also write:
toBits :: Integral a => a -> [Bool]
toBits = toBits' [] where
  toBits' l 0 = l
  toBits' l n = toBits (x : l) m where
    (m,y) = n `divMod` 2
    x = y /= 0
(I hope there is nothing wrong within this expression.)
What I want to ask is which of these solutions is better. The advantage of the first one is that it can be partially evaluated very easily (because Haskell stops at the first : that is not needed), while the second solution is (obviously) tail-recursive, but in my opinion it needs to be evaluated completely before you can get anything out.
The background for this is my Brainfuck parser (see my optimization question), which is implemented in a very ugly way (various reverse calls... ooh), but can be implemented easily in the first - let's call it "semi-tail-recursion" - way.
I think you've got it all just right. The first form is in general better because useful output can be obtained from it before it has completed computation. That means that if toBits is used in another computation, the compiler can likely combine them, and the list that is the output of toBits may never exist at all, or perhaps just one cons cell at a time. It's nice that the first version is also clearer to read!
In Haskell, your first choice would typically be preferred (I would say "always," but you're always wrong when you use that word). The accumulator pattern is appropriate when the output cannot be consumed incrementally (e.g. incrementing a counter).
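A minimal example of that counter case (the bang patterns come from the BangPatterns extension; seq works too):
{-# LANGUAGE BangPatterns #-}

-- The result is a single number, so there is nothing to consume
-- incrementally; a strict accumulator is the natural shape.
count :: [a] -> Int
count = go 0
  where
    go !n []     = n
    go !n (_:xs) = go (n + 1) xs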
Let me rename the second version and fix a few typos so that you can test the functions.
toBits :: Integral a => a -> [Bool]
toBits 0 = []
toBits n = x : toBits m where
  (m,y) = n `divMod` 2
  x = y /= 0

toBits2 :: Integral a => a -> [Bool]
toBits2 = toBits' [] where
  toBits' l 0 = l
  toBits' l n = toBits' (x : l) m where
    (m,y) = n `divMod` 2
    x = y /= 0
These functions don't actually compute the same thing; the first one produces a list starting with the least significant bit, while the second one starts with the most significant bit. In other words toBits2 = reverse . toBits, and in fact reverse can be implemented with exactly the same kind of accumulator that you use in toBits2.
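For reference, that accumulator version of reverse looks like this (essentially how reverse is defined in the standard library):
reverse' :: [a] -> [a]
reverse' = go []
  where
    go acc []     = acc
    go acc (x:xs) = go (x : acc) xs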
If you want a list starting from the least significant bit, toBits is good Haskell style. It won't produce a stack overflow because the recursive call is contained inside the (:) constructor which is lazy. (Also, you can't cause a thunk buildup in the argument of toBits by forcing the value of a late element of the result list early, because the argument needs to be evaluated in the first case toBits 0 = [] to determine whether the list is empty.)
If you want a list starting from the most significant bit, either writing toBits2 directly or defining toBits and using reverse . toBits is acceptable. I would probably prefer the latter since it's easier to understand in my opinion and you don't have to worry about whether your reimplementation of reverse will cause a stack overflow.
