As a short exercise in using Haskell arrays I wanted to implement a function giving the first n (odd) prime numbers. The code below (compiled with GHC 7.10.3) produces a loop error at runtime. "A Gentle Introduction to Haskell" uses recursive calls in array creation to compute Fibonacci numbers (https://www.haskell.org/tutorial/arrays.html, 13.2, code below for reference), which works just fine. My question is:
Where is the difference between the two ways of recursive creation? Which recursive calls are generally allowed when creating arrays?
My code:
import Data.Array.Unboxed
main = putStrLn $ show $ (primes 500)!500 --arbitrary example
primes :: Int -> UArray Int Int
primes n = a
where
a = array (1,n) $ primelist 1 [3,5..]
primelist i (m:ms) =
if all (not . divides m) [ a!j | j <- [1..(i-1)]]
then (i ,m) : primelist (succ i) ms
else primelist i ms
divides m k = m `mod` k == 0
Code from "A Gentle Introduction to Haskell":
fibs :: Int -> Array Int Int
fibs n = a where a = array (0,n) ([(0, 1), (1, 1)] ++
[(i, a!(i-2) + a!(i-1)) | i <- [2..n]])
Thanks in advance for any answers!
Update: I think I finally understood what's going on. array is lazy on the list elements, but is unnecessarily strict on its spine!
This causes a <<loop>> exception, for instance
test :: Array Int Int
test = array (1,2) ((1,1) : if test!1 == 1 then [(2,2)] else [(2,100)])
unlike
test :: Array Int Int
test = array (1,2) ((1,1) : [(2, if test!1 == 1 then 2 else 100)])
So, recursion works as long as it only affects the values.
A working version:
main :: IO ()
main = do
putStrLn $ show $ (primes 500)!500 --arbitrary example
-- A spine-lazy version of array
-- Assumes the list carries indices lo..hi
arraySpineLazy :: (Int, Int) -> [(Int, a)] -> Array Int a
arraySpineLazy (lo,hi) xs = array (lo,hi) $ go lo xs
where
go i _ | i > hi = []
go i ~((_,e):ys) = (i, e) : go (succ i) ys
primes :: Int -> Array Int Int
primes n = a
where
a :: Array Int Int
a = arraySpineLazy (1,n) $ primelist 1 (2: [3,5..])
primelist :: Int -> [Int] -> [(Int, Int)]
primelist i _ | i > n = []
primelist _ [] = [] -- remove warnings
primelist i (m:ms) =
if all (not . divides m) [ a!j | j <- [1..(i-1)]]
then (i ,m) : primelist (succ i) ms
else primelist i ms
divides m k = m `mod` k == 0
Arguably, we should instead write a lazier variant of listArray instead, since our array variant discard the first components of the pair.
This is a strictness issue: you can't generate unboxed arrays recursively, only boxed (regular) ones, since only boxed ones have a lazy semantics.
Forget arrays, and consider the following recursive pair definition
let (x,y) = (0,x)
This defines x=0 ; y=0, recursively. However, for the recursion to work, it is necessary that the pair is lazy. Otherwise, it generates an infinite recursion, much as the following would do:
let p = case p of (x,y) -> (0,x)
Above, p evaluates itself before it can expose the (,) pair constructor, so an infinite loop arises. By comparison,
let p = (0, case p of (x,y) -> x)
would work, since p produces the (,) before calling itself. Note however that this relies on the constructor (,) not evaluating the components before returning -- it has to be lazy, and return immediately leaving the components to be evaluated later.
Operationally, a pair is constructed having inside tho thunks: two pointers to code, which will evaluate the result later on. Hence the pair is not really a pair of integers, but a pair of indirections-to-integer. This is called "boxing", and is needed to achieve laziness, even if it carries a little computational cost.
By definition, unboxed data structures, like unboxed arrays, avoid boxing, so they are strict, not lazy, and they can not support the same recursion approaches.
Related
The list monad provides an excellent abstraction for backtracking in
search problems. However, the problem I am facing now is one which
involves state in addition to backtracking. (It also involves
constraints related to previous choices made in the search path, but I
will attack that issue later.)
The following simplified example illustrates the problematics. The
function sumTo is given a nonnegative integer and a list with pairs
of integers. The first element in each pair is a positive integer, the
second element is the number of such integers available. The search
problem is to express the first argument using the integers in the
list, with count constraints. For example, here the integer 8 is
represented in different ways as sums of five 1s, three 2s and two
4s with the contraint that all numbers making up the sum have to be
even (so the 1s can not be used).
λ> sumTo 8 [(1,5), (4,2), (2,3)]
[[4,4],[4,2,2],[2,2,4],[2,4,2]]
The following is my current recursive solution to the problem.
sumTo :: Int -> [(Int, Int)] -> [[Int]]
sumTo = go []
where
go :: [(Int, Int)] -> Int -> [(Int, Int)] -> [[Int]]
go _ 0 _ = [[]] -- base case: success
go _ _ [] = [] -- base case: out of options, failure
-- recursion step: use the first option if it has counts left and
-- is suitable; append to this cases where current option is not
-- used at this point
go prevOpts n (opt#(val,cnt):opts) =
(if cnt > 0 && val <= n && even val
then map (val:) $ go [] (n - val) $ (val,cnt-1):(prevOpts ++ opts)
else [])
++ go (opt:prevOpts) n opts
While the function seems to work ok, it is much more complicated
than one without state, employing the list monad.
sumToN :: Int -> [Int] -> [[Int]]
sumToN 0 _ = [[]]
sumToN n opts = do
val <- opts
guard $ val <= n
guard $ even val
map (val:) $ sumToN (n - val) opts
Not having constraints, this one gives one additional solution.
λ> sumToN 8 [1, 4, 2]
[[4,4],[4,2,2],[2,4,2],[2,2,4],[2,2,2,2]]
Now I am wondering if some higher order abstraction, such as
StateT or something similar, could be utilized to simplify the case
of backtracking with this kind of state constraints.
There are two versions below, the first that just uses lists, and the second with StateT.
import Control.Applicative
import Control.Monad.State
The list type is the type of nondeterministic computations.
Given a set of elements (given in compact form as a list of (element, nb_copies)), we can pick any one, and return it together with the updated set. The result is a pair (Int, [(Int, Int)]). As a regular function, pick returns all possible results of that action.
Internally, we can also follow the definition with an "imperative" point of view. If the list is empty, there is nothing to pick (the empty list is the failing computation). Otherwise, there is at least one element x (implicitly, i > 0). Then we either pick one x (pickx), or we pick one element from the rest (pickxs), being careful to put x back at the end.
pick :: [(Int, Int)] -> [(Int, [(Int, Int)])]
pick [] = []
pick ((x, i) : xs) = pickx ++ pickxs
where
pickx = if i == 1 then [ (x, xs) ] else [ (x, (x, i-1) : xs) ]
pickxs = do
(x', xs') <- pick xs
return (x', (x, i) : xs')
Then sumTo is defined as follows: if n = 0 then the only solution is the empty sum ([]) and we return it. Otherwise, we pick one element i from the set, check its validity, and recursively look for a solution for n-i with the updated set.
sumTo :: Int -> [(Int, Int)] -> [[Int]]
sumTo = go
where
go 0 _ = return []
go n xs = do
(i, xs') <- pick xs
guard $ i <= n
guard $ even i
s' <- go (n-i) xs'
return (i : s')
Now threading the set around can be tedious. StateT transforms a type of computation to be stateful. [] is nondeterministic computation. StateT s [] is stateful nondeterministic computation, with state type s. Here the state will be the set of remaining elements.
Interestingly, pick can directly be interpreted as such a stateful computation. The intuition is that executing pickState removes an element from the state, which updates the state, and returns that element. pickState automatically fails if there are no more elements.
pickState :: StateT [(Int, Int)] [] Int
pickState = StateT pick
Then we repeatedly pick elements until we reach 0.
sumToState :: Int -> StateT [(Int, Int)] [] [Int]
sumToState = go
where
go 0 = return []
go n = do
i <- pickState
guard $ i <= n
guard $ even i
s' <- go (n-i)
return (i : s')
main = do
let n = 8
xs = [(1, 5), (4, 2), (2, 3)]
print $ sumTo n xs
print $ evalStateT (sumToState n) xs
(full source)
It's not much work to add the StateT monad transformer to your clean solution. You just need to add a layer of wrapping and unwrapping to lift the values into the StateT type, and then take them back out using evalStateT.
Your code would also benefit from internally using a more specialized type for the opts than [(Int, Int)]. MultiSet would be a good choice since it automatically manages occurrences.
Here's a tested example of what it could look like:
import Control.Monad.State (StateT, evalStateT, get, modify, lift, guard)
import Data.MultiSet (MultiSet, fromOccurList, distinctElems, delete)
sumToN :: Int -> [(Int, Int)] -> [[Int]]
sumToN nStart optsStart =
evalStateT (go nStart) (fromOccurList optsStart)
where
go :: Int -> StateT (MultiSet Int) [] [Int]
go 0 = return []
go n = do
val <- lift . distinctElems =<< get
guard (val <= n && even val)
modify (delete val)
(val:) <$> go (n - val)
λ> sumToN 8 [(1,5), (4,2), (2,3)]
[[2,2,4],[2,4,2],[4,2,2],[4,4]]
And actually, the StateT isn't benefiting us very much here. You could refactor it to take the MultiSet Int as a parameter and it would work just as well.
import Control.Monad (guard)
import Data.MultiSet (fromOccurList, distinctElems, delete)
sumToN :: Int -> [(Int, Int)] -> [[Int]]
sumToN nStart optsStart =
go nStart (fromOccurList optsStart)
where
go 0 _ = return []
go n opts = do
val <- distinctElems opts
guard (val <= n && even val)
(val:) <$> go (n - val) (delete val opts)
What I am wanting to do is create a list of random integers, with no duplicates. As a first step, I have a function which makes a list of n random samples. How does one write this in a more Haskell idiomatic way, where an empty list does not need to be passed in to start the list off? I am sure I am missing something basic and fundamental.
-- make a list of random integers.
-- takes a size, and an empty list.
-- returns a list of that length of random numbers.
f :: Int -> [Int] -> IO [Int]
f l xs | length xs >= l = return (xs)
f l xs = do
r <- randomRIO (1, 40) :: IO Int
f l $ r : x
Usage:
*Main> f 6 []
[10,27,33,35,31,28]
Ultimately this function will have filtering to check for duplicate insertions, but that is a separate question. Although this may look like homework, it is not, but part of my own attempt to come to grips with the State monad as used for random number generation, and finding I am stuck at a much earlier spot.
Well, you can operate on the output of the recursive call:
f :: Int -> IO [Int]
f 0 = return []
f n = do
r <- randomRIO (1, 40)
xs <- f (n-1)
return $ r : xs
Note however that it's important the the operation you perform on the result is fast. In this case r : xs is constant time. However if you replace the last line with (say):
return $ xs ++ [r]
this would change the complexity of the function from linear to quadratic because every ++ call will have to scan all the sequence of previously generated numbers before appending the new one.
However you could simply do:
f n = sequence $ replicate n (randomRIO (1, 40))
replicate creates a [IO Int] list of length n made of randomRIO actions and sequence takes an [IO a] and turns it into an IO [a] by executing all the actions in order and collecting the results.
Even simpler, you could use replicateM which is already the function you want:
import Control.Monad(replicateM)
f n = replicateM n (randomRIO (1, 40))
or in point-free style:
f :: Int -> IO [Int]
f = flip replicateM $ randomRIO (1, 40)
This uses a Set to keep track of numbers already generated:
import System.Random
import qualified Data.Set as Set
generateUniqueRandoms :: (Int, Int) -> Int -> IO [Int]
generateUniqueRandoms range#(low, high) n =
let maxN = min (high - low) n
in
go maxN Set.empty
where
go 0 _ = return []
go n s = do
r <- getUniqueRandom s
xs <- go (n-1) (Set.insert r s)
return $ r : xs
getUniqueRandom s = do
r <- randomRIO range
if (Set.member r s) then getUniqueRandom s
else return r
Here is some sample output:
Main> generateUniqueRandoms (1, 40) 23
[29,22,2,17,5,8,24,27,10,16,6,3,14,37,25,34,30,28,7,31,15,20,36]
Main> generateUniqueRandoms (1, 40) 1000
[33,35,24,16,13,1,26,7,14,11,15,2,4,30,28,6,32,25,38,22,17,12,20,5,18,40,36,39,27,9,37,31,21,29,8,34,10,23,3]
Main> generateUniqueRandoms (1, 40) 0
[]
However, it is worth noting that if n is close to the width of the range, it'd be much more efficient to shuffle a list of all numbers in the range and take the first n of that.
I recently learned about Data.Function.fix, and now I want to apply it everywhere. For example, whenever I see a recursive function I want to "fix" it. So basically my question is where and when should I use it.
To make it more specific:
1) Suppose I have the following code for factorization of n:
f n = f' n primes
where
f' n (p:ps) = ...
-- if p^2<=n: returns (p,k):f' (n `div` p^k) ps for k = maximum power of p in n
-- if n<=1: returns []
-- otherwise: returns [(n,1)]
If I rewrite it in terms of fix, will I gain something? Lose something? Is it possible, that by rewriting an explicit recursion into fix-version I will resolve or vice versa create a stack overflow?
2) When dealing with lists, there are several solutions: recursion/fix, foldr/foldl/foldl', and probably something else. Is there any general guide/advice on when to use each? For example, would you rewrite the above code using foldr over the infinite list of primes?
There are, probably, other important questions not covered here. Any additional comments related to the usage of fix are welcome as well.
One thing that can be gained by writing in an explicitly fixed form is that the recursion is left "open".
factOpen :: (Integer -> Integer) -> Integer -> Integer
factOpen recur 0 = 1
factOpen recur n = n * recur (pred n)
We can use fix to get regular fact back
fact :: Integer -> Integer
fact = fix factOpen
This works because fix effectively passes a function itself as its first argument. By leaving the recursion open, however, we can modify which function gets "passed back". The best example of using this property is to use something like memoFix from the memoize package.
factM :: Integer -> Integer
factM = memoFix factOpen
And now factM has built-in memoization.
Effectively, we have that open-style recursion requires us impute the recursive bit as a first-order thing. Recursive bindings are one way that Haskell allows for recursion at the language level, but we can build other, more specialized forms.
I'd like to mention another usage of fix; suppose you have a simple language consisting of addition, negative, and integer literals. Perhaps you have written a parser which takes a String and outputs a Tree:
data Tree = Leaf String | Node String [Tree]
parse :: String -> Tree
-- parse "-(1+2)" == Node "Neg" [Node "Add" [Node "Lit" [Leaf "1"], Node "Lit" [Leaf "2"]]]
Now you would like to evaluate your tree to a single integer:
fromTree (Node "Lit" [Leaf n]) = case reads n of {[(x,"")] -> Just x; _ -> Nothing}
fromTree (Node "Neg" [e]) = liftM negate (fromTree e)
fromTree (Node "Add" [e1,e2]) = liftM2 (+) (fromTree e1) (fromTree e2)
Suppose someone else decides to extend the language; they want to add multiplication. They will have to have access to the original source code. They could try the following:
fromTree' (Node "Mul" [e1, e2]) = ...
fromTree' e = fromTree e
But then Mul can only appear once, at the top level of the expression, since the call to fromTree will not be aware of the Node "Mul" case. Tree "Neg" [Tree "Mul" a b] will not work, since the original fromTree has no pattern for "Mul". However, if the same function is written using fix:
fromTreeExt :: (Tree -> Maybe Int) -> (Tree -> Maybe Int)
fromTreeExt self (Node "Neg" [e]) = liftM negate (self e)
fromTreeExt .... -- other cases
fromTree = fix fromTreeExt
Then extending the language is possible:
fromTreeExt' self (Node "Mul" [e1, e2]) = ...
fromTreeExt' self e = fromTreeExt self e
fromTree' = fix fromTreeExt'
Now, the extended fromTree' will evaluate the tree properly, since self in fromTreeExt' refers to the entire function, including the "Mul" case.
This approach is used here (the above example is a closely adapted version of the usage in the paper).
Beware the difference between _Y f = f (_Y f) (recursion, value--copying) and fix f = x where x = f x (corecursion, reference--sharing).
Haskell's let and where bindings are recursive: same name on the LHS and RHS refer to the same entity. The reference is shared.
In the definition of _Y there's no sharing (unless a compiler performs an aggressive optimization of common subexpressions elimination). This means it describes recursion, where repetition is achieved by application of a copy of an original, like in a classic metaphor of a recursive function creating its own copies. Corecursion, on the other hand, relies on sharing, on referring to same entity.
An example, primes calculated by
2 : _Y ((3:) . gaps 5 . _U . map (\p-> [p*p, p*p+2*p..]))
-- gaps 5 == ([5,7..] \\)
-- _U == sort . concat
either reusing its own output (with fix, let g = ((3:)...) ; ps = g ps in 2 : ps) or creating separate primes supply for itself (with _Y, let g () = ((3:)...) (g ()) in 2 : g ()).
See also:
double stream feed to prevent unneeded memoization?
How to implement an efficient infinite generator of prime numbers in Python?
Or, with the usual example of factorial function,
gen rec n = n<2 -> 1 ; n * rec (n-1) -- "if" notation
facrec = _Y gen
facrec 4 = gen (_Y gen) 4
= let {rec=_Y gen} in (\n-> ...) 4
= let {rec=_Y gen} in (4<2 -> 1 ; 4*rec 3)
= 4*_Y gen 3
= 4*gen (_Y gen) 3
= 4*let {rec2=_Y gen} in (3<2 -> 1 ; 3*rec2 2)
= 4*3*_Y gen 2 -- (_Y gen) recalculated
.....
fac = fix gen
fac 4 = (let f = gen f in f) 4
= (let f = (let {rec=f} in (\n-> ...)) in f) 4
= let {rec=f} in (4<2 -> 1 ; 4*rec 3) -- f binding is created
= 4*f 3
= 4*let {rec=f} in (3<2 -> 1 ; 3*rec 2)
= 4*3*f 2 -- f binding is reused
.....
1) fix is just a function, it improves your code when you use some recursion. It makes your code prettier.For example usage visit: Haskell Wikibook - Fix and recursion.
2) You know what does foldr? Seems like foldr isn't useful in factorization (or i didn't understand what are you mean in that).
Here is a prime factorization without fix:
fact xs = map (\x->takeWhile (\y->y/=[]) x) . map (\x->factIt x) $ xs
where factIt n = map (\x->getFact x n []) [2..n]
getFact i n xs
| n `mod` i == 0 = getFact i (div n i) xs++[i]
| otherwise = xs
and with fix(this exactly works like the previous):
fact xs = map (\x->takeWhile (\y->y/=[]) x) . map (\x->getfact x) $ xs
where getfact n = map (\x->defact x n) [2..n]
defact i n =
fix (\rec j k xs->if(mod k j == 0)then (rec j (div k j) xs++[j]) else xs ) i n []
This isn't pretty because in this case fix isn't a good choice(but there is always somebody who can write it better).
I'm totally new to Haskell so apologies if the question is silly.
What I want to do is recursively build a list while at the same time building up an accumulated value based on the recursive calls. This is for a problem I'm doing for a Coursera course, so I won't post the exact problem but something analogous.
Say for example I wanted to take a list of ints and double each one (ignoring for the purpose of the example that I could just use map), but I also wanted to count up how many times the number '5' appears in the list.
So to do the doubling I could do this:
foo [] = []
foo (x:xs) = x * 2 : foo xs
So far so easy. But how can I also maintain a count of how many times x is a five? The best solution I've got is to use an explicit accumulator like this, which I don't like as it reverses the list, so you need to do a reverse at the end:
foo total acc [] = (total, reverse acc)
foo total acc (x:xs) = foo (if x == 5 then total + 1 else total) (x*2 : acc) xs
But I feel like this should be able to be handled nicer by the State monad, which I haven't used before, but when I try to construct a function that will fit the pattern I've seen I get stuck because of the recursive call to foo. Is there a nicer way to do this?
EDIT: I need this to work for very long lists, so any recursive calls need to be tail-recursive too. (The example I have here manages to be tail-recursive thanks to Haskell's 'tail recursion modulo cons').
Using State monad it can be something like:
foo :: [Int] -> State Int [Int]
foo [] = return []
foo (x:xs) = do
i <- get
put $ if x==5 then (i+1) else i
r <- foo xs
return $ (x*2):r
main = do
let (lst,count) = runState (foo [1,2,5,6,5,5]) 0 in
putStr $ show count
This is a simple fold
foo :: [Integer] -> ([Integer], Int)
foo [] = ([], 0)
foo (x : xs) = let (rs, n) = foo xs
in (2 * x : rs, if x == 5 then n + 1 else n)
or expressed using foldr
foo' :: [Integer] -> ([Integer], Int)
foo' = foldr f ([], 0)
where
f x (rs, n) = (2 * x : rs, if x == 5 then n + 1 else n)
The accumulated value is a pair of both the operations.
Notes:
Have a look at Beautiful folding. It shows a nice way how to make such computations composable.
You can use State for the same thing as well, by viewing each element as a stateful computation. This is a bit overkill, but certainly possible. In fact, any fold can be expressed as a sequence of State computations:
import Control.Monad
import Control.Monad.State
-- I used a slightly non-standard signature for a left fold
-- for simplicity.
foldl' :: (b -> a -> a) -> a -> [b] -> a
foldl' f z xs = execState (mapM_ (modify . f) xs) z
Function mapM_ first maps each element of xs to a stateful computation by modify . f :: b -> State a (). Then it combines a list of such computations into one of type State a () (it discards the results of the monadic computations, just keeps the effects). Finally we run this stateful computation on z.
I've coded up the 0-1 Knapsack problem in Haskell. I'm fairly proud about the laziness and level of generality achieved so far.
I start by providing functions for creating and dealing with a lazy 2d matrix.
mkList f = map f [0..]
mkTable f = mkList (\i -> mkList (\j -> f i j))
tableIndex table i j = table !! i !! j
I then make a specific table for a given knapsack problem
knapsackTable = mkTable f
where f 0 _ = 0
f _ 0 = 0
f i j | ws!!i > j = leaveI
| otherwise = max takeI leaveI
where takeI = tableIndex knapsackTable (i-1) (j-(ws!!i)) + vs!!i
leaveI = tableIndex knapsackTable (i-1) j
-- weight value pairs; item i has weight ws!!i and value vs!!i
ws = [0,1,2, 5, 6, 7] -- weights
vs = [0,1,7,11,21,31] -- values
And finish off with a couple helper functions for looking at the table
viewTable table maxI maxJ = take (maxI+1) . map (take (maxJ+1)) $ table
printTable table maxI maxJ = mapM_ print $ viewTable table maxI maxJ
This much was pretty easy. But I want to take it a step further.
I want a better data structure for the table. Ideally, it should be
Unboxed (immutable) [edit] never mind this
Lazy
Unbounded
O(1) time to construct
O(1) time complexity for looking up a given entry,
(more realistically, at worst O(log n), where n is i*j for looking up the entry at row i, column j)
Bonus points if you can explain why/how your solution satisfies these ideals.
Also bonus points if you can further generalize knapsackTable, and prove that it is efficient.
In improving the data structure you should try to satisfy the following goals:
If I ask for the solution where the maximum weight is 10 (in my current code, that would be indexTable knapsackTable 5 10, the 5 means include items 1-5) only the minimal amount of work necessary should be performed. Ideally this means no O(i*j) work for forcing the spine of each row of the table to necessary column length. You could say this isn't "true" DP, if you believe DP means evaluating the entirety of the table.
If I ask for the entire table to be printed (something like printTable knapsackTable 5 10), the values of each entry should be computed once and only once. The values of a given cell should depend on the values of other cells (DP style: the idea being, never recompute the same subproblem twice)
Ideas:
Data.Array bounded :(
UArray strict :(
Memoization techniques (SO question about DP in Haskell) this might work
Answers that make some compromises to my stated ideals will be upvoted (by me, anyways) as long as they are informative. The answer with the least compromises will probably be the "accepted" one.
First, your criterion for an unboxed data structure is probably a bit mislead. Unboxed values must be strict, and they have nothing to do with immutability. The solution I'm going to propose is immutable, lazy, and boxed. Also, I'm not sure in what way you are wanting construction and querying to be O(1). The structure I'm proposing is lazily constructed, but because it's potentially unbounded, its full construction would take infinite time. Querying the structure will take O(k) time for any particular key of size k, but of course the value you're looking up may take further time to compute.
The data structure is a lazy trie. I'm using Conal Elliott's MemoTrie library in my code. For genericity, it takes functions instead of lists for the weights and values.
knapsack :: (Enum a, Num w, Num v, Num a, Ord w, Ord v, HasTrie a, HasTrie w) =>
(a -> w) -> (a -> v) -> a -> w -> v
knapsack weight value = knapsackMem
where knapsackMem = memo2 knapsack'
knapsack' 0 w = 0
knapsack' i 0 = 0
knapsack' i w
| weight i > w = knapsackMem (pred i) w
| otherwise = max (knapsackMem (pred i) w)
(knapsackMem (pred i) (w - weight i)) + value i
Basically, it's implemented as a trie with a lazy spine and lazy values. It's bounded only by the key type. Because the entire thing is lazy, its construction before forcing it with queries is O(1). Each query forces a single path down the trie and its value, so it's O(1) for a bounded key size O(log n). As I already said, it's immutable, but not unboxed.
It will share all work in the recursive calls. It doesn't actually allow you to print the trie directly, but something like this should not do any redundant work:
mapM_ (print . uncurry (knapsack ws vs)) $ range ((0,0), (i,w))
Unboxed implies strict and bounded. Anything 100% Unboxed cannot be Lazy or Unbounded. The usual compromise is embodied in converting [Word8] to Data.ByteString.Lazy where there are unboxed chunks (strict ByteString) which are linked lazily together in an unbounded way.
A much more efficient table generator (enhanced to track individual items) could be made using "scanl", "zipWith", and my "takeOnto". This effectively avoid using (!!) while creating the table:
import Data.List(sort,genericTake)
type Table = [ [ Entry ] ]
data Entry = Entry { bestValue :: !Integer, pieces :: [[WV]] }
deriving (Read,Show)
data WV = WV { weight, value :: !Integer }
deriving (Read,Show,Eq,Ord)
instance Eq Entry where
(==) a b = (==) (bestValue a) (bestValue b)
instance Ord Entry where
compare a b = compare (bestValue a) (bestValue b)
solutions :: Entry -> Int
solutions = length . filter (not . null) . pieces
addItem :: Entry -> WV -> Entry
addItem e wv = Entry { bestValue = bestValue e + value wv, pieces = map (wv:) (pieces e) }
-- Utility function for improve
takeOnto :: ([a] -> [a]) -> Integer -> [a] -> [a]
takeOnto endF = go where
go n rest | n <=0 = endF rest
| otherwise = case rest of
(x:xs) -> x : go (pred n) xs
[] -> error "takeOnto: unexpected []"
improve oldList wv#(WV {weight=wi,value = vi}) = newList where
newList | vi <=0 = oldList
| otherwise = takeOnto (zipWith maxAB oldList) wi oldList
-- Dual traversal of index (w-wi) and index w makes this a zipWith
maxAB e2 e1 = let e2v = addItem e2 wv
in case compare e1 e2v of
LT -> e2v
EQ -> Entry { bestValue = bestValue e1
, pieces = pieces e1 ++ pieces e2v }
GT -> e1
-- Note that the returned table is finite
-- The dependence on only the previous row makes this a "scanl" operation
makeTable :: [Int] -> [Int] -> Table
makeTable ws vs =
let wvs = zipWith WV (map toInteger ws) (map toInteger vs)
nil = repeat (Entry { bestValue = 0, pieces = [[]] })
totW = sum (map weight wvs)
in map (genericTake (succ totW)) $ scanl improve nil wvs
-- Create specific table, note that weights (1+7) equal weight 8
ws, vs :: [Int]
ws = [2,3, 5, 5, 6, 7] -- weights
vs = [1,7,8,11,21,31] -- values
t = makeTable ws vs
-- Investigate table
seeTable = mapM_ seeBestValue t
where seeBestValue row = mapM_ (\v -> putStr (' ':(show (bestValue v)))) row >> putChar '\n'
ways = mapM_ seeWays t
where seeWays row = mapM_ (\v -> putStr (' ':(show (solutions v)))) row >> putChar '\n'
-- This has two ways of satisfying a bestValue of 8 for 3 items up to total weight 5
interesting = print (t !! 3 !! 5)
Lazy storable vectors: http://hackage.haskell.org/package/storablevector
Unbounded, lazy, O(chunksize) time to construct, O(n/chunksize) indexing, where chunksize can be sufficiently large for any given purpose. Basically a lazy list with some significant constant factor benifits.
To memoize functions, I recommend a library like Luke Palmer's memo combinators. The library uses tries, which are unbounded and have O(key size) lookup. (In general, you can't do better than O(key size) lookup because you always have to touch every bit of the key.)
knapsack :: (Int,Int) -> Solution
knapsack = memo f
where
memo = pair integral integral
f (i,j) = ... knapsack (i-b,j) ...
Internally, the integral combinator probably builds an infinite data structure
data IntTrie a = Branch IntTrie a IntTrie
integral f = \n -> lookup n table
where
table = Branch (\n -> f (2*n)) (f 0) (\n -> f (2*n+1))
Lookup works like this:
lookup 0 (Branch l a r) = a
lookup n (Branch l a r) = if even n then lookup n2 l else lookup n2 r
where n2 = n `div` 2
There are other ways to build infinite tries, but this one is popular.
Why won't you use Data.Map putting the other Data.Map into it? As far as I know it's quite fast.
It wouldn't be lazy though.
More than that, you can implement Ord typeclass for you data
data Index = Index Int Int
and put a two dimensional index directly as a key. You can achieve laziness by generating this map as a list and then just use
fromList [(Index 0 0, value11), (Index 0 1, value12), ...]