Tail-recursive function consuming memory - haskell

I have a clearly tail-recursive function for finding (choose n k) mod 10007 (with k nonnegative).
Why does this function consume so much memory for large inputs (e.g. 100000000 choose 50000000)? I can understand that it might be slow, but it shouldn't use more than constant memory, should it (assuming GHC performs tail-call optimization)?
GHC version 7.8.3
modulus :: Int
modulus = 10007
choose :: Int -> Int -> Int
choose n1 k1
  | s1 > 0 = 0
  | otherwise = q1
  where
    (q1, s1) = doChoose n1 k1 (1, 0)
doChoose :: Int -> Int -> (Int, Int) -> (Int, Int)
doChoose _ 0 (qr, sr) = (qr, sr)
doChoose n k (qr, sr) =
    doChoose (n `seq` (n-1)) (k-1) (qr `seq` (qn * qr `rem` modulus * inv qk `rem` modulus), sr `seq` (sn + sr - sk))
  where
    (qn, sn) = removePs n
    (qk, sk) = removePs k
removePs :: Int -> (Int, Int)
removePs n =
    case r of
      0 -> (q0, s0 + 1)
      _ -> (n, 0)
  where
    (q, r) = n `quotRem` modulus
    (q0, s0) = removePs q
inv :: Int -> Int
inv = doInv 0 1 modulus . (`mod` modulus)
  where
    doInv x _ 1 0
      | x < 0 = x + modulus
      | otherwise = x
    doInv _ _ _ 0 = error "Not relatively prime"
    doInv x y a b = doInv y (x - q * y) b r
      where
        (q, r) = a `quotRem` b

I was putting the seq in the wrong place.
It needs to be:
n `seq` qr `seq` sr `seq` doChoose (n-1) (k-1) (qn * qr `rem` modulus * inv qk `rem` modulus, sn + sr - sk)
Otherwise the call to seq isn't evaluated until reaching the base-case and a chain of thunks is still built up.
This isn't strictly tail-recursive, but rather it's "mutually" tail-recursive since seq ultimately returns its second argument without modifying it.
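The same idea can be expressed with bang patterns, which force each accumulator before the recursive call is made. What follows is a minimal sketch under stated assumptions, not the asker's code: sumProdMod is a hypothetical function that carries two accumulators through a tail-recursive loop, much like the (qr, sr) pair above, and it relies on GHC's BangPatterns extension.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical example (not from the question): returns the sum and the
-- product of 1..k, both reduced modulo m, using two strict accumulators.
sumProdMod :: Int -> Int -> (Int, Int)
sumProdMod m k = go k 0 1
  where
    go :: Int -> Int -> Int -> (Int, Int)
    -- The bangs force s and p on every call, so no chain of thunks builds up.
    go 0 !s !p = (s, p)
    go i !s !p = go (i - 1) ((s + i) `rem` m) (p * i `rem` m)
```

Because the accumulators are evaluated at each step, the loop should run in constant space even in GHCi, where no strictness analysis is applied.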

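By the way, it is tempting to simplify the expressions with a helper function such as
force x = x `seq` x
but this does nothing: x `seq` x is equivalent to plain x, because the seq only runs once the result itself is demanded. The same caveat applies to force from the deepseq package, which evaluates its argument fully, but only when the result of force is demanded. An expression such as
doChoose (force n - 1) (k - 1) (qn * force qr * etc.)
therefore still builds up thunks; the seq has to wrap the recursive call, as in the answer above.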

Related

Fermat Primality Test Haskell

I have implemented the following two functions to establish whether n is a Fermat prime number (returning n if it is, -1 if not), but it always returns -1 and I can't figure out why (gc is a function that calculates the gcd).
fermatPT :: Int -> Int
fermatPT n = fermatPT' n list
  where
    list = [a | a <- [1..n-1]]
-- | helper function
fermatPT' :: Int -> [Int] -> Int
fermatPT' n l | gc (n, head l) == 1 && fermatTest n (head l) = fermatPT' n (tail l)
              | null l = n
              | otherwise = -1
  where
    fermatTest n a = mod (a^(n-1)) n == 1
Your function should return a boolean indicating if the given number is a prime. If you do that, you can use the all function to define this simply as
fermatPT :: Integer -> Bool
fermatPT n = all (fermatTest n) (filter (\a -> gcd n a == 1) [1..n-1])
  where fermatTest n a = mod (a^(n-1)) n == 1
gcd is defined in the Prelude.
all avoids the explicit recursion that requires you to apply the test to one element of [1..n-1] at a time; its definition is effectively
all _ [] = True
all p (x:xs) = p x && all p xs
Note that mod (a ^ (n - 1)) n is inefficient, since it may require computing an absurdly large number before ultimately reducing it to the range [0..n-1]. Instead, take advantage of the fact that (a * b) mod n == ((a mod n) * (b mod n)) mod n, and reduce the value after each multiplication. One way to implement this (not the fastest, but it's simple):
modN :: Integer -> Integer -> Integer -> Integer
modN a 0 _ = 1
modN a b n = ((a `mod` n) * (modN a (b - 1) n)) `mod` n
Then use
fermatTest n a = modN a (n-1) n == 1
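Note that you could use this (with Int instead of Integer) to correctly implement fermatPT :: Int -> Bool; although the input would still be restricted to smaller integers, it won't suffer from overflow as long as n * n fits in an Int, because every intermediate product is formed from two values below n.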
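For larger inputs, the linear recursion in modN can be replaced by square-and-multiply, which needs only about log2 b multiplications. This is a minimal sketch, not part of the answer above: powMod and fermatCheck are hypothetical names, and fermatCheck simply repeats the all-based test with the faster exponentiation.

```haskell
-- Hypothetical helper: a^b mod n by repeated squaring (assumes b >= 0, n >= 1).
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 n = 1 `mod` n
powMod a b n
  | even b    = half * half `mod` n
  | otherwise = (a `mod` n) * powMod a (b - 1) n `mod` n
  where
    -- Only evaluated in the even case, thanks to laziness.
    half = powMod a (b `div` 2) n

-- Hypothetical variant of fermatPT: tests every base coprime to n.
fermatCheck :: Integer -> Bool
fermatCheck n =
  n > 1 && all (\a -> powMod a (n - 1) n == 1)
               (filter (\a -> gcd n a == 1) [1 .. n - 1])
```

Keep in mind that a Fermat test restricted to coprime bases accepts Carmichael numbers such as 561 even though they are composite, so a True result is not a proof of primality.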

Memoization with Monad.Memo for mutual recursion in Haskell

I'm doing some dynamic programming in Haskell with mutual recursion implementation.
I decided to speed things up using memoization.
Monad.Memo offers the MemoT transformer for exactly that case. But it uses Map as the internal representation for stored values. And while this gave me an order-of-magnitude speed boost, it is still not enough.
While the library supports Array-based and Vector-based implementations as internal storage, these only work for simple recursion, and I did not find any transformers like MemoT to use them for mutual recursion.
What is the best way to do mutual recursion memoization with an efficient vector-based internal representation (if any)?
My next question is about the memoization effect. I expected my function to take more time during the first run and much less during consecutive runs. But running it in ghci, I found that it takes the same time on every run, so there is no difference between the first and second run. I measured the time as follows:
timeit $ print $ dynamic (5,5)
With dynamic being my function.
The full implementation is as follows:
import Control.Monad.Memo
import Control.Monad.Identity
type Pos = (Int, Int)
type MemoQ = MemoT (Int, Int, Int) [Int]
type MemoV = MemoT (Int, Int, Int) Int
type MemoQV = MemoQ (MemoV Identity)
-- we are moving to (0,0) as we can always shift the world by substituting variables
-- due to symmetry of cost function it is enough to solve for only positive x and y
dynamic :: Pos -> [Int]
dynamic (x, y) = lastUnique $ map (evalQ x y) [1 ..]
  where lastUnique (x0:x1:xs) | x0 == x1 = x0
                              | otherwise = lastUnique (x1:xs)
evalQ :: Int -> Int -> Int -> [Int]
evalQ x y n = startEvalMemo . startEvalMemoT $ fqmon x y n
fqmon :: Int -> Int -> Int -> MemoQV [Int]
fqmon _ _ 0 = return [0,0,0,0]
fqmon x y n = do
  let pts = neighbours (x, y)
  let v = for3 memol1 fvmon n
  let c = cost (x, y)
  let q = fmap (c +) . uncurry v
  traverse q pts
fvmon :: Int -> Int -> Int -> MemoQV Int
fvmon _ 0 0 = return 0
fvmon 0 x y = return $ cost (x, y)
fvmon n x y | limit = return 1000000
            | otherwise = liftM minimum $ for3 memol0 fqmon x' y' (n - 1)
  where x' = abs x
        y' = abs y
        limit = x' > 25 || y' > 25
cost :: Pos -> Int
cost (x, y) = abs x + abs y
neighbours :: Pos -> [Pos]
neighbours (x, y) = [(x-1, y), (x+1, y), (x, y-1), (x, y+1)]
Added:
Following #liqui's comment, I tried memocombinators.
First, here is the non-memoized initial implementation:
type Pos = (Int, Int)
dynamic :: Int -> Int -> [Int]
dynamic x y = lastUnique $ map (fq x y) [1 ..]
  where lastUnique (x0:x1:xs) | x0 == x1 = x0
                              | otherwise = lastUnique (x1:xs)
fq :: Int -> Int -> Int -> [Int]
fq _ _ 0 = [0, 0, 0, 0] -- Q at step 0 is 0 in all directions
fq x y n = (cost (x, y) +) . (uncurry $ fv n) <$> neighbours (x, y)
fv :: Int -> Int -> Int -> Int
fv _ 0 0 = 0 -- V at (0, 0) is 0 at any step
fv 0 x y = cost (x, y) -- V at step 0 is the cost
fv n x y = minimum $ fq x y (n - 1)
cost :: Pos -> Int
cost (x, y) = abs x + abs y
neighbours :: Pos -> [Pos]
neighbours (x, y) = [(x-1, y), (x+1, y), (x, y-1), (x, y+1)]
Then my attempt at memoization (only the changed part):
dynamic :: Int -> Int -> [Int]
dynamic x y = lastUnique $ map (fqmem x y) [1 ..]
  where lastUnique (x0:x1:xs) | x0 == x1 = x0
                              | otherwise = lastUnique (x1:xs)
-- memoizing version of fq
fqmem :: Int -> Int -> Int -> [Int]
fqmem x y n = fqmem' x y n
  where fqmem' = memo3 integral integral integral fq
-- memoizing version of fv
fvmem :: Int -> Int -> Int -> Int
fvmem n x y = fvmem' n x y
  where fvmem' = memo3 integral integral integral fv
fq :: Int -> Int -> Int -> [Int]
fq _ _ 0 = [0, 0, 0, 0] -- Q at step 0 is 0 in all directions
fq x y n = (cost (x, y) +) . (uncurry $ fvmem n) <$> neighbours (x, y)
fv :: Int -> Int -> Int -> Int
fv _ 0 0 = 0 -- V at (0, 0) is 0 at any step
fv 0 x y = cost (x, y) -- V at step 0 is the cost
fv n x y = minimum $ fqmem x y (n - 1)
The result is a bit of a paradox: it is 3 times slower than the non-memoized recursive implementation. Memoizing only one function (namely fq) and not touching fv gives results 2 times slower. The more I memoize with memocombinators, the slower the computation. And again there is no difference between the first and second invocation.
Also, one last question: what is the rationale for choosing between Monad.Memo, memocombinators and MemoTrie? The comments make a case for using the last two. In what situations is Monad.Memo a better choice?
Finally MemoTrie did the job.
At the first invocation it works as fast as (possibly much faster than) Monad.Memo, and at consecutive invocations it takes virtually no time!
And the change in code is trivial compared to the monadic approach:
import Data.MemoTrie
type Pos = (Int, Int)
-- we are moving to (0,0) as we can always shift the world by substituting variables
-- due to symmetry it is enough to solve for only positive x and y
dynamic :: Int -> Int -> [Int]
dynamic x y = lastUnique $ map (fqmem x y) [1 ..]
  where lastUnique (x0:x1:xs) | x0 == x1 = x0
                              | otherwise = lastUnique (x1:xs)
fqmem = memo3 fq
fvmem = memo3 fv
fq :: Int -> Int -> Int -> [Int]
fq _ _ 0 = [0, 0, 0, 0] -- Q at step 0 is 0 in all directions
fq x y n = (cost (x, y) +) . (uncurry $ fvmem n) <$> neighbours (x, y)
fv :: Int -> Int -> Int -> Int
fv _ 0 0 = 0 -- V at (0, 0) is 0 at any step
fv 0 x y = cost (x, y) -- V at step 0 is the cost
fv n x y = minimum $ fqmem x y (n - 1)
cost :: Pos -> Int
cost (x, y) = abs x + abs y
neighbours :: Pos -> [Pos]
neighbours (x, y) = [(x-1, y), (x+1, y), (x, y-1), (x, y+1)]
Still, I would like to know what the benefits of using Monad.Memo are and what its use cases are. Or has it become obsolete with MemoTrie?
Why did memocombinators not work for me?
What is the rule of thumb for choosing between Monad.Memo, memocombinators and MemoTrie?
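One plausible explanation for the memocombinators slowdown, offered as an assumption rather than something confirmed in this thread: in the memocombinators attempt the memoized function is bound in a where clause under a definition that takes arguments (fqmem x y n = fqmem' x y n), so a fresh memo table may be built on every call, at least in GHCi or without optimization. The MemoTrie version defines fqmem = memo3 fq with no arguments, so a single table is shared. The sketch below illustrates the difference with a hypothetical list-backed memoizer, memoList, standing in for a library combinator.

```haskell
-- Hypothetical stand-in for a library memoizer: caches f on the
-- non-negative Ints in a lazily built list.
memoList :: (Int -> a) -> (Int -> a)
memoList f = (table !!)
  where
    table = map f [0 ..]

-- Shared table: fibShared takes no arguments on the left-hand side, so
-- memoList is applied once and every recursive call hits the same table.
fibShared :: Int -> Integer
fibShared = memoList fib
  where
    fib 0 = 0
    fib 1 = 1
    fib n = fibShared (n - 1) + fibShared (n - 2)

-- Table rebuilt per call: the memoized function lives in the where clause
-- of a definition with an argument, so (unless the optimizer floats it out)
-- each call builds a new, empty table and nothing is reused.
fibRebuilt :: Int -> Integer
fibRebuilt n = go n
  where
    go = memoList fib
    fib 0 = 0
    fib 1 = 1
    fib k = fibRebuilt (k - 1) + fibRebuilt (k - 2)
```

Both functions return the same values; only the sharing differs, which would be consistent with the memoized version being slower than the plain one and with repeated invocations taking the same time.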

Haskell Memoization Codewars Number of trailing zeros of factorial n

I am trying to solve the Codewars problem called: Number of trailing zeros of N! with Haskell.
I know that I don't need to calculate the factorial to know the trailing zeros; in fact I am just counting how many numbers are divisible by 5, and how many times each one is.
I have written 2 versions: one that uses memoization when factoring a number to find how many times it is divisible by 5, and another that does not use memoization.
What surprises me is that the supposed DP approach takes longer than the trivial recursive one. I am probably doing something very stupid in my code.
These are the functions:
import Debug.Trace (trace)
zeros x = helperZeros [1..x]
helperZeros :: [Integer] -> Integer
helperZeros = sumArrayTuple . filter (\x -> x `mod` 5 == 0)
sumArrayTuple = foldl (\acc x -> acc + (fastDef x)) 0
data Tree a = Tree (Tree a) a (Tree a)
instance Functor Tree where
  fmap f (Tree l m r) = Tree (fmap f l) (f m) (fmap f r)
index :: Tree Integer -> Integer -> Integer
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n-1) `divMod` 2 of
  (q,0) -> index l q
  (q,1) -> index r q
nats = go 0 1
  where
    go n s = Tree (go l s') n (go r s')
      where
        l = n + s
        r = l + s
        s' = s * 2
fastDef :: Integer -> Integer
fastDef x = trace (show x) index memTreetDef x
memTreetDef = fmap (defact fastDef) nats
defact f n
  | n `mod` 5 /= 0 = 0
  | otherwise = 1 + f (n `div` 5)
zeros' x = helperZeros' [1..x]
helperZeros' :: [Integer] -> Integer
helperZeros' = sumArrayTuple' . filter (\x -> x `mod` 5 == 0)
sumArrayTuple' = foldl (\acc x -> acc + (def x)) 0
def n
  | n `mod` 5 /= 0 = 0
  | otherwise = 1 + def (n `div` 5)
What I am trying to memoize is the result of the defact function; for example, if I have already calculated defact 200, then it would reuse this result to calculate defact 1000.
I am fairly new to DP in Haskell.
If you tested your code's performance with trace and show in place, that is the issue: they are very slow compared to the main code. Otherwise, the performance of the two variants should be about the same.
The def function is a poor candidate for memoization. The average depth of recursion is not very different from 1. The rest of the cost comes down to the mod operation, that is, a division, which is hardly more expensive than a table lookup (and division by a constant can be optimized into a multiplication).
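Since only the number of factors of 5 matters, the count can also be computed directly with Legendre's formula, summing n div 5, n div 25, n div 125 and so on, without visiting every number up to n. This is a minimal sketch rather than either poster's code; trailingZeros is a hypothetical name.

```haskell
-- Hypothetical helper: number of trailing zeros of n! via Legendre's formula,
-- i.e. the sum of n `div` 5^i over all i >= 1 for which the quotient is positive.
trailingZeros :: Integer -> Integer
trailingZeros n = sum (takeWhile (> 0) [n `div` p | p <- iterate (* 5) 5])
```

For example, trailingZeros 100 adds 20 (multiples of 5) and 4 (multiples of 25) to give 24, using a number of divisions logarithmic in n.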

Least common multiple without using gcd

With gcd it's fairly easy, but I do not understand how to tie all the functions together to make it happen without it.
kgv :: Int -> Int -> Int
kgv x y = abs ((x `quot` (gcd x y)) * y)
I have this function to find the prime factors, which works (prime_factors), and I am working on a function that takes the maximum number from one list and checks whether it is in the other list (comp):
prime_factors :: Int -> [Int]
prime_factors 1 = []
prime_factors n
  | factors == [] = [n]
  | otherwise = factors ++ prime_factors (n `div` (head factors))
  where factors = take 1 $ filter (\x -> (n `mod` x) == 0) [2 .. n-1]
comp :: [Int]->Int
comp (ys)(x:xs)
  |maximum prime_factors xs elem prime_factors ys == x
  |otherwise tail x
kgv :: Int -> Int -> Int
kgv x y = abs ((x `quot` (comp x y)) * y)
Here's an absurdly simple and obscenely inefficient solution:
lcm m n = head [x | x <- [1..], x `rem` m == 0, x `rem` n == 0]
Of course, this relies on two different notions of "least" coinciding under the circumstances, which they do. A fully naive solution doesn't seem possible.
Here is the (very) naive algorithm I was talking about:
kgv :: (Ord a, Num a) => a -> a -> a
kgv x y = find x y
  where find i j
          | i == j = i
          | i < j = find (i+x) j
          | i > j = find i (j+y)
It's basically what a schoolchild would do ;)
Caution: I ignored negative numbers and 0; you'll probably have to handle those.
Perhaps another easy way is
import Data.List(intersect)
lcm m n = head $ intersect (series m n) (series n m)
  where series a b = take a $ map (*b) [1..]
I figured it out myself mostly. Thanks for the ideas and pointers.
ggt n m | n > m = maximum [t | t <- [1 .. m], gt n m t]
        | otherwise = maximum [t | t <- [1 .. n], gt n m t]
gt n m c = t n c && t m c
t n c | n >= c = (mod n c == 0)
      | otherwise = False
kgv :: Int -> Int -> Int
kgv x y
  | x == 0 = 0
  | y == 0 = 0
  | otherwise = abs ((x `quot` (ggt x y)) * y)
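The prime-factor route that the question was aiming for can also be made to work: the least common multiple is the product of every prime raised to the larger of its two multiplicities. The sketch below is one way to do that under stated assumptions, not code from the thread; primeFactors, mergeMax and kgvFactors are hypothetical names, and primeFactors is a trial-division variant that returns the factors in ascending order.

```haskell
-- Hypothetical helper: prime factors in ascending order, by trial division.
primeFactors :: Int -> [Int]
primeFactors = go 2
  where
    go p n
      | n < 2          = []
      | p * p > n      = [n]
      | n `mod` p == 0 = p : go p (n `div` p)
      | otherwise      = go (p + 1) n

-- Merge two ascending factor lists, keeping each prime as many times as it
-- occurs in whichever list has more copies of it.
mergeMax :: [Int] -> [Int] -> [Int]
mergeMax [] ys = ys
mergeMax xs [] = xs
mergeMax (x:xs) (y:ys)
  | x == y    = x : mergeMax xs ys
  | x < y     = x : mergeMax xs (y:ys)
  | otherwise = y : mergeMax (x:xs) ys

-- Hypothetical gcd-free lcm built from the two factorizations.
kgvFactors :: Int -> Int -> Int
kgvFactors 0 _ = 0
kgvFactors _ 0 = 0
kgvFactors x y = product (mergeMax (primeFactors (abs x)) (primeFactors (abs y)))
```

For 12 = [2,2,3] and 18 = [2,3,3] the merged list is [2,2,3,3], whose product is 36.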

Two parameter memoization in Haskell

I'm trying to memoize the following function:
gridwalk x y
  | x == 0 = 1
  | y == 0 = 1
  | otherwise = (gridwalk (x - 1) y) + (gridwalk x (y - 1))
Looking at this I came up with the following solution:
gw :: (Int -> Int -> Int) -> Int -> Int -> Int
gw f x y
  | x == 0 = 1
  | y == 0 = 1
  | otherwise = (f (x - 1) y) + (f x (y - 1))
gwlist :: [Int]
gwlist = map (\i -> gw fastgw (i `mod` 20) (i `div` 20)) [0..]
fastgw :: Int -> Int -> Int
fastgw x y = gwlist !! (x + y * 20)
Which I then can call like this:
gw fastgw 20 20
Is there an easier, more concise and general way (notice how I had to hardcode the max grid dimensions in the gwlist function in order to convert from 2D to 1D space so I can access the memoizing list) to memoize functions with multiple parameters in Haskell?
You can use a list of lists to memoize the function result for both parameters:
memo :: (Int -> Int -> a) -> [[a]]
memo f = map (\x -> map (f x) [0..]) [0..]
gw :: Int -> Int -> Int
gw 0 _ = 1
gw _ 0 = 1
gw x y = (fastgw (x - 1) y) + (fastgw x (y - 1))
gwstore :: [[Int]]
gwstore = memo gw
fastgw :: Int -> Int -> Int
fastgw x y = gwstore !! x !! y
Use the data-memocombinators package from Hackage. It provides easy-to-use memoization techniques and a brief way to apply them:
import Data.MemoCombinators (memo2,integral)
gridwalk = memo2 integral integral gridwalk' where
  gridwalk' x y
    | x == 0 = 1
    | y == 0 = 1
    | otherwise = (gridwalk (x - 1) y) + (gridwalk x (y - 1))
Here is a version using Data.MemoTrie from the MemoTrie package to memoize the function:
import Data.MemoTrie(memo2)
gridwalk :: Int -> Int -> Int
gridwalk = memo2 gw
  where
    gw 0 _ = 1
    gw _ 0 = 1
    gw x y = gridwalk (x - 1) y + gridwalk x (y - 1)
If you want maximum generality, you can memoize a memoizing function.
memo :: (Num a, Enum a) => (a -> b) -> [b]
memo f = map f (enumFrom 0)
gwvals = fmap memo (memo gw)
fastgw :: Int -> Int -> Int
fastgw x y = gwvals !! x !! y
This technique will work with functions that have any number of arguments.
Edit: thanks to Philip K. for pointing out a bug in the original code. Originally memo had a "Bounded" constraint instead of "Num" and began the enumeration at minBound, which would only be valid for natural numbers.
Lists aren't a good data structure for memoizing, though, because they have linear lookup complexity. You might be better off with a Map or IntMap. Or look on Hackage.
Note that this particular code does rely on laziness, so if you wanted to switch to using a Map you would need to take a bounded number of elements from the list, as in:
import qualified Data.Map as M
import Data.Maybe (fromMaybe)
gwByMap :: Int -> Int -> Int -> Int -> Int
gwByMap maxX maxY x y = fromMaybe (gw x y) $ M.lookup (x,y) memomap
  where
    memomap = M.fromList $ concat [[((x',y'),z) | (y',z) <- zip [0..maxY] ys]
                                  | (x',ys) <- zip [0..maxX] gwvals]
fastgw2 :: Int -> Int -> Int
fastgw2 = gwByMap 20 20
I think GHC may be stupid about sharing in this case; you may need to lift out the x and y parameters, like this:
gwByMap maxX maxY = \x y -> fromMaybe (gw x y) $ M.lookup (x,y) memomap
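The Map-based idea can also be written so that the table refers to itself, which avoids the intermediate lists entirely. This is a minimal sketch, assuming the lazy Data.Map from the containers package and a grid size that is known up front; gridwalkUpTo is a hypothetical name, not something defined in the answers above.

```haskell
import qualified Data.Map as M

-- Hypothetical helper: builds one lazily evaluated table for all cells in
-- [0..maxX] x [0..maxY] and returns a lookup function over it. The values
-- in the Map are thunks that refer back to the table, so each cell is
-- computed at most once.
gridwalkUpTo :: Int -> Int -> (Int -> Int -> Integer)
gridwalkUpTo maxX maxY = look
  where
    table = M.fromList [((x, y), gw x y) | x <- [0 .. maxX], y <- [0 .. maxY]]
    -- Lookups outside the grid raise an error via (M.!).
    look x y = table M.! (x, y)
    gw 0 _ = 1
    gw _ 0 = 1
    gw x y = look (x - 1) y + look x (y - 1)
```

Binding the partial application to a name, for instance fastgw = gridwalkUpTo 20 20, keeps one table alive across calls, and each lookup costs logarithmic rather than linear time.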
