Lagrange Interpolation for a schema based on Shamir's Secret Sharing - haskell

I'm trying to debug an issue with an implementation of a threshold encryption scheme. I've posted this question on crypto to get some help with the actual scheme but was hoping to get a sanity check on the simplified code I am using.
Essentially, the cryptosystem uses Shamir's Secret Sharing to combine the shares of a key. The polynomial is the sum of each member of the list 'as' multiplied by an increasing power of the polynomial's parameter. I've left out the mod by prime to simplify the code as the actual implementation uses PBC via a Haskell wrapper.
I have for the polynomial
poly :: [Integer] -> Integer -> Integer
poly as xi = (f 1 as)
  where
    f _ [] = 0
    f 0 _ = 0
    f s (a:as) = (a * s) + f (s * xi) as
The Lagrange interpolation is:
interp0 :: [(Integer, Integer)] -> Integer
interp0 xys = round (sum $ zipWith (*) ys $ fmap (f xs) xs)
  where
    xs = map (fromIntegral . fst) xys
    ys = map (fromIntegral . snd) xys
    f :: (Eq a, Fractional a) => [a] -> a -> a
    f xs xj = product $ map (p xj) xs
    p :: (Eq a, Fractional a) => a -> a -> a
    p xj xm = if xj == xm then 1 else negate (xm / (xj - xm))
and the split and combination code is
execPoly as@(a0:_) = do
  let xs = zipWith (,) [0..] (fmap (poly as) [0..100])
  let t = length as + 1
  let offset = 1
  let shares = take t (drop offset xs)
  let sm2 = interp0 shares
  putText ("poly and interp over " <> show as <> " = " <> show sm2 <> ". Should be " <> show a0)
main :: IO ()
main = do
  execPoly [10,20,30,40,50,60,70,80,90,100,110,120,130,140,150] -- 1
  execPoly [10,20,30,40,50,60,70,80] -- 2
execPoly(1) fails to combine to 10 but execPoly(2) combines correctly. The magic threshold seems to be 8.
Is my code correct? Am I missing something in the implementation that limits the threshold size to 8?

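As MathematicalOrchid said, it was a precision problem: the interpolation was carried out in Double, which cannot represent the large intermediate values exactly once the polynomial has enough terms.
I updated the code to use exact rational arithmetic: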
f :: (Eq a, Integral a) => [a] -> a -> Ratio a
f xs xj = product $ map (p xj) xs
p :: (Eq a, Integral a)=> a -> a -> Ratio a
p xj xm = if xj == xm then (1 % 1) else (negate xm) % (xj - xm)
And it works as expected.
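For comparison, here is a minimal self-contained sketch of the same idea using Rational throughout. The names evalPoly, interpAtZero and recoverSecret are made up for this illustration and are not part of the original code; like the question, it leaves out the reduction modulo a prime.

```haskell
import Data.Ratio (denominator, numerator)

-- Evaluate a0 + a1*x + a2*x^2 + ... with Horner's rule.
evalPoly :: [Integer] -> Integer -> Integer
evalPoly coeffs x = foldr (\a acc -> a + x * acc) 0 coeffs

-- Lagrange interpolation evaluated at x = 0, in exact Rational arithmetic.
-- Each basis term is the product over m /= j of (0 - xm) / (xj - xm).
interpAtZero :: [(Integer, Integer)] -> Rational
interpAtZero pts = sum [fromInteger yj * basis xj | (xj, yj) <- pts]
  where
    basis xj = product [ fromInteger (negate xm) / fromInteger (xj - xm)
                       | (xm, _) <- pts, xm /= xj ]

-- Take one share per coefficient (at x = 1..n) and recombine them.
-- Returns Nothing if the interpolated value is not an integer.
recoverSecret :: [Integer] -> Maybe Integer
recoverSecret coeffs
  | denominator r == 1 = Just (numerator r)
  | otherwise          = Nothing
  where
    n = toInteger (length coeffs)
    r = interpAtZero [(x, evalPoly coeffs x) | x <- [1 .. n]]
```

With exact arithmetic the 15-coefficient case from the question, recoverSecret [10,20..150], recombines to the constant term just like the 8-coefficient one.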

Related

Fold that's both constant-space and short-circuiting

I'm trying to build a Haskell function that does basically the same thing as Prelude's product. Unlike that function, however, it should have these two properties:
It should operate in constant space (ignoring the fact that some numeric types like Integer aren't). For example, I want myProduct (replicate 100000000 1) to eventually return 1, unlike Prelude's product which uses up all of my RAM and then gives *** Exception: stack overflow.
It should short-circuit when it encounters a 0. For example, I want myProduct (0:undefined) to return 0, unlike Prelude's product which gives *** Exception: Prelude.undefined.
Here's what I've come up with so far:
myProduct :: (Eq n, Num n) => [n] -> n
myProduct = go 1
  where go acc (x:xs) = if x == 0 then 0 else acc `seq` go (acc * x) xs
        go acc [] = acc
That works exactly how I want it to for lists, but I'd like to generalize it to have type (Foldable t, Eq n, Num n) => t n -> n. Is it possible to do this with any of the folds? If I just use foldr, then it will short-circuit but won't be constant-space, and if I just use foldl', then it will be constant-space but won't short-circuit.
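To make the trade-off concrete, the two naive attempts could look roughly like this. The names productR and productL are invented for this sketch; each version has exactly one of the two desired properties.

```haskell
import Data.Foldable (foldl')

-- Lazy right fold: stops at the first 0 because the rest of the fold
-- is never demanded, but builds a chain of pending multiplications
-- on long inputs.
productR :: (Foldable t, Eq n, Num n) => t n -> n
productR = foldr (\x rest -> if x == 0 then 0 else x * rest) 1

-- Strict left fold: runs in constant space on lists, but always
-- walks the entire structure, even after seeing a 0.
productL :: (Foldable t, Num n) => t n -> n
productL = foldl' (*) 1
```

Here productR (0 : undefined) returns 0, while productL on the same input would hit the undefined tail.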
If you spell your function slightly differently, it's more obvious how to turn it into a foldr. Namely:
myProduct :: (Eq n, Num n) => [n] -> n
myProduct = flip go 1 where
  go (x:xs) = if x == 0 then \acc -> 0 else \acc -> acc `seq` go xs (acc * x)
  go [] = \acc -> acc
Now go has got that foldr flavor, and we can just fill in the holes.
myProduct :: (Foldable t, Eq n, Num n) => t n -> n
myProduct = flip go 1 where
  go = foldr
    (\x f -> if x == 0 then \acc -> 0 else \acc -> acc `seq` f (acc * x))
    (\acc -> acc)
Hopefully you can see where each of those pieces came from in the previous explicit-recursion style and how mechanical the transformation is. Then I'd make a few aesthetic tweaks:
myProduct :: (Foldable t, Eq n, Num n) => t n -> n
myProduct xs = foldr step id xs 1 where
  step 0 f acc = 0
  step x f acc = f $! acc * x
And we're all done! A bit of quick testing in ghci reveals that it still short-circuits on 0 as required and uses constant space when specialized to lists.
You might be looking for foldM. Instantiate it with m = Either b and you get short-circuiting behavior (or with m = Maybe; which one fits depends on whether you have many possible early-exit values or a single one known in advance).
foldM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
I recall discussions whether there should be foldM', but IIRC GHC does the right thing most of the time.
import Control.Monad
import Data.Maybe
myProduct :: (Foldable t, Eq n, Num n) => t n -> n
myProduct = fromMaybe 0 . foldM go 1
  where go acc x = if x == 0 then Nothing else Just $! acc * x
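The Either instantiation mentioned above is useful when the early exit should carry a value. As a hypothetical illustration (sumUntilNegative is not from the original answer), this fold sums a list but aborts with the offending element as soon as it meets a negative number:

```haskell
import Control.Monad (foldM)

-- Left carries the early-exit value, Right carries the running total.
-- Once step returns a Left, the remaining elements are never inspected.
sumUntilNegative :: [Int] -> Either Int Int
sumUntilNegative = foldM step 0
  where
    step acc x
      | x < 0     = Left x
      | otherwise = Right $! acc + x
```

For example, sumUntilNegative [1, 2, 3] gives Right 6, and sumUntilNegative (1 : (-5) : undefined) gives Left (-5) without touching the undefined tail.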

Function Type Restrictions

Is it generally preferable to have the strictest or loosest type definition for a function? What are the pros and cons of each approach? I found that when I rewrote my Pearson correlation code using only Double, it was easier for me to write, follow, and reason about (this could just be inexperience). But I can also see how a broader type definition would make the functions more generally applicable. Would stricter type definitions be characterized as a form of tech debt?
With Typeclasses:
import Data.List
mean :: Fractional a => [a] -> a
mean xs = s / n
  where
    (s , n) = foldl' k (0,0) xs
    k (s, n) x = s `seq` n `seq` (s + x, n + 1)
covariance :: Fractional a => [a] -> [a] -> a
covariance xs ys = mean productXY
  where
    productXY = zipWith (*) [x - mx | x <- xs] [y - my | y <- ys]
    mx = mean xs
    my = mean ys
stddev :: Floating a => [a] -> a
stddev xs = sqrt (covariance xs xs)
pearson :: RealFloat a => [a] -> [a] -> a
pearson x y = fifthRound $ covariance x y / (stddev x * stddev y)
pearsonMatrix :: RealFloat a => [[a]] -> [[a]]
pearsonMatrix (x:xs) = [pearson x y | y <- x:xs]:(pearsonMatrix xs)
pearsonMatrix [] = []
fifthRound :: RealFrac a => a -> a
fifthRound x = (/100000) $ fromIntegral $ round (x * 100000)
With Doubles:
import Data.List
mean :: [Double] -> Double
mean xs = s / n
  where
    (s , n) = foldl' k (0,0) xs
    k (s, n) x = s `seq` n `seq` (s + x, n + 1)
covariance :: [Double] -> [Double] -> Double
covariance xs ys = mean productXY
  where
    productXY = zipWith (*) [x - mx | x <- xs] [y - my | y <- ys]
    mx = mean xs
    my = mean ys
stddev :: [Double] -> Double
stddev xs = sqrt (covariance xs xs)
pearson :: [Double] -> [Double] -> Double
pearson x y = fifthRound (covariance x y / (stddev x * stddev y))
pearsonMatrix :: [[Double]] -> [[Double]]
pearsonMatrix (x:xs) = [pearson x y | y <- x:xs]:(pearsonMatrix xs)
pearsonMatrix [] = []
fifthRound :: Double -> Double
fifthRound x = (/100000) $ fromIntegral $ round (x * 100000)
Readability is a matter of opinion. In general, I find that more general type signatures are more readable because there are fewer possible definitions (sometimes there is even only one non-diverging definition). For example, seeing that mean only has a Fractional constraint immediately limits the operations being performed in that function (compared to the Double version which could be performing sqrt operations for all I know). Of course, generalizing types is not always more readable.
The main disadvantage of having more general versions of functions is that they may remain unoptimized at runtime, so that the Fractional dictionary for Double has to be passed to mean every time it is called.
You can have the best of all worlds by adding a SPECIALIZE pragma. This tells the compiler to basically duplicate your function code with some of the type variables instantiated. If you know you are going to be calling your mean function pretty much only with Double, then this is what I would do
{-# SPECIALIZE mean :: [Double] -> Double #-}
mean :: Fractional a => [a] -> a
mean xs = s / n
  where
    (s , n) = foldl' k (0,0) xs
    k (s, n) x = s `seq` n `seq` (s + x, n + 1)
And you get to see the specialized version of the signature in your code too! Yay!
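As a small illustration of what the general signature buys, the sketch below reuses the same mean at two different types. It only assumes the definition shown above; the Rational call is an example added here, not something from the question.

```haskell
import Data.List (foldl')
import Data.Ratio ((%))

{-# SPECIALIZE mean :: [Double] -> Double #-}
mean :: Fractional a => [a] -> a
mean xs = s / n
  where
    (s, n) = foldl' step (0, 0) xs
    -- force both accumulators before building the next pair
    step (acc, len) x = acc `seq` len `seq` (acc + x, len + 1)

-- The Double call can use the specialized copy ...
meanOfDoubles :: Double
meanOfDoubles = mean [1, 2, 3, 4]

-- ... while the same source still works for exact fractions.
meanOfRationals :: Rational
meanOfRationals = mean [1 % 2, 3 % 2]
```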

Undefined error on defined function

I'm trying to implement a function that multiplies polynomials (represented using lists -- 3x^2 + 5x + 2 = P [2,5,3]):
newtype Poly a = P [a]
plus :: Num a => Poly a -> Poly a -> Poly a
plus (P a) (P b) = P (map (\(y,z) -> z + y) (zipWithPadding 0 a b))
  where
    zipWithPadding :: (Num a) => a -> [a] -> [a] -> [(a, a)]
    zipWithPadding e (aa: as) (bb: bs) = ((aa, bb): zipWithPadding e as bs)
    zipWithPadding e [] bs = zip (repeat e) bs
    zipWithPadding e as [] = zip as (repeat e)
times :: Num a => Poly a -> Poly a -> Poly a
times (P a) (P b) = sum $ multList 0 [] a b
  where
    multList :: Num a => Int -> [Poly a] -> [a] -> [a] -> [Poly a]
    multList _ s [] _ = s
    multList e s (aa:as) bs = multList (e + 1) (s ++ (multElement e aa bs)) as bs
    multElement :: Num a => Int -> a -> [a] -> [Poly a]
    multElement e aa bs = [P $ replicate e 0 ++ (map (*aa) bs)]
instance Num a => Num (Poly a) where
  (+) = plus
  (*) = times
  negate = undefined
  fromInteger = undefined
  -- No meaningful definitions exist
  abs = undefined
  signum = undefined
When I tried to run however, I got an undefined error:
*HW04> times (P [1,2,2]) (P [1,2])
*** Exception: Prelude.undefined
I'm confused.
Clearly you are calling one of the undefined methods in the Num instance for Poly.
You can determine which one is being called by using these definitions:
negate = error "Poly negate undefined"
fromInteger = error "Poly fromInteger undefined"
abs = error "Poly abs undefined"
signum = error "Poly signum undefined"
Running your test expression yields:
Poly *** Exception: Poly fromInteger undefined
The problem is in your use of sum which is essentially defined as:
sum xs = foldl (+) 0 xs
It is therefore calling fromInteger 0. You can fix this with:
fromInteger x = P [ fromInteger x ]
Update
The reason fromInteger for Poly a needs to be defined this way is that we need to construct a list of Num a values, and fromInteger x is the way to create a Num a from the Integer value x.
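Putting the pieces together, a compact self-contained version with a working fromInteger might look like the sketch below. The helpers addCoeffs and mulCoeffs are names chosen for this illustration and replace the asker's zipWithPadding and multList; the representation (lowest-degree coefficient first) is the one from the question.

```haskell
newtype Poly a = P [a] deriving (Show, Eq)

-- Add coefficient lists, keeping the tail of the longer one.
addCoeffs :: Num a => [a] -> [a] -> [a]
addCoeffs (x:xs) (y:ys) = x + y : addCoeffs xs ys
addCoeffs xs     []     = xs
addCoeffs []     ys     = ys

-- (x0 + X * rest) * q = x0 * q + X * (rest * q)
mulCoeffs :: Num a => [a] -> [a] -> [a]
mulCoeffs []     _  = []
mulCoeffs (x:xs) ys = addCoeffs (map (x *) ys) (0 : mulCoeffs xs ys)

instance Num a => Num (Poly a) where
  P a + P b     = P (addCoeffs a b)
  P a * P b     = P (mulCoeffs a b)
  negate (P a)  = P (map negate a)
  fromInteger n = P [fromInteger n]   -- constant polynomial; this is what sum needs
  abs           = error "Poly: abs has no meaningful definition"
  signum        = error "Poly: signum has no meaningful definition"
```

With this instance, P [1,2,2] * P [1,2] evaluates to P [1,4,6,4], and sum over a list of polynomials no longer hits undefined.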
A polynomial is not really a Num, although there is a ring monomorphism Num a => a -> Poly a.
Discard that Num instance and use foldl plus instead of sum.
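A sketch of that suggestion, assuming the same lowest-degree-first representation: times folds with plus starting from the empty polynomial, so no Num (Poly a) instance is needed at all. The partial products are built with a list comprehension here rather than the asker's multList, purely to keep the example short.

```haskell
newtype Poly a = P [a] deriving (Show, Eq)

plus :: Num a => Poly a -> Poly a -> Poly a
plus (P a) (P b) = P (go a b)
  where
    go (x:xs) (y:ys) = x + y : go xs ys
    go xs     []     = xs
    go []     ys     = ys

-- One shifted, scaled copy of b per coefficient of a, added up with plus.
times :: Num a => Poly a -> Poly a -> Poly a
times (P a) (P b) =
  foldl plus (P []) [ P (replicate e 0 ++ map (* c) b) | (e, c) <- zip [0 ..] a ]
```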
I'm going to take the position that you should not define an instance of a class simply to hijack the class's functions. The minimal definition of a Num instance expects certain functions to be defined; explicitly assigning undefined to those names does not qualify as a definition. Consider that Haskell provides a specific operator (++) for list concatenation instead of simply overloading (+) with an instance like
instance Num [a] where
a + [] = a
[] + b = b
(a:as) + b = a:(as + b)
(*) = undefined
negate = undefined
-- etc
Instead, define a class that does provide the operations you want. In this case, you want a Ring, which is a type along with two operations, addition and multiplication, that obey certain laws. (Put briefly, the operations act as you would expect given the integers as an example, except multiplication is not required to be commutative.)
In Haskell, we would define the class as
class Ring a where
  rplus :: a -> a -> a -- addition
  rmult :: a -> a -> a -- multiplication
  rnegate :: a -> a -- negation
  runit :: a -- multiplicative identity
  rzero :: a -- additive identity, multiplicative zero
Any value with a valid Num instance forms a ring, although you need to define the instances separately.
instance Ring Integer where
  rplus = (+)
  rmult = (*)
  rnegate = negate
  rzero = 0
  runit = 1
instance Ring Float where
  rplus = (+)
  rmult = (*)
  rnegate = negate
  rzero = 0
  runit = 1
-- etc
You can define an instance of Ring for polynomials, as long as the coefficients form a ring as well.
newtype Poly a = P [a]
instance Ring a => Ring (Poly a) where
  -- Take care to handle polynomials with different degree.
  -- Coefficients are stored lowest degree first, so the shorter
  -- list is padded with rzero at the end.
  -- Note the use of rplus and rzero instead of (+) and 0
  -- when dealing with coefficients
  rplus (P a) (P b) = case (compare (length a) (length b)) of
    LT -> rplus (P (a ++ [rzero])) (P b)
    EQ -> P $ zipWith rplus a b
    GT -> rplus (P a) (P (b ++ [rzero]))
  -- I leave a correct implementation of rmult as an exercise
  -- for the reader.
  rmult = ...
  rnegate (P coeffs) = P $ map rnegate coeffs
  rzero = P [rzero]
  runit = P [runit]
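For readers who want to check their own attempt at the exercise, here is one possible complete sketch. It is not the answerer's solution: the rmult below uses a Horner-style fold, rplus uses direct recursion instead of length comparison, and the zero polynomial is represented as P [], all of which are choices made for this example.

```haskell
class Ring a where
  rplus, rmult :: a -> a -> a
  rnegate      :: a -> a
  rzero, runit :: a

instance Ring Integer where
  rplus   = (+)
  rmult   = (*)
  rnegate = negate
  rzero   = 0
  runit   = 1

-- Coefficients are stored lowest degree first.
newtype Poly a = P [a] deriving (Show, Eq)

instance Ring a => Ring (Poly a) where
  rplus (P a) (P b) = P (go a b)
    where
      go (x:xs) (y:ys) = rplus x y : go xs ys
      go xs     []     = xs
      go []     ys     = ys
  -- (a0 + X * rest) * q = a0 * q + X * (rest * q)
  rmult (P a) (P b) = foldr step rzero a
    where
      step x acc     = rplus (P (map (rmult x) b)) (shift acc)
      shift (P [])   = P []
      shift (P cs)   = P (rzero : cs)   -- multiply by X
  rnegate (P cs) = P (map rnegate cs)
  rzero = P []
  runit = P [runit]
```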

Haskell - MinMax using foldr

I am looking for a Haskell function that takes a list as an argument and returns a tuple (min, max), where min is the minimal value of the list and max is the maximal value.
I already have this:
maxMinFold :: Ord a => [a] -> (a, a)
maxMinFold list = foldr (\x (tailMin, tailMax) -> (min x tailMin) (max x tailMax)) -- missing part
Could you help me what to add to the missing part? (or tell me what I am doing wrong)
Thanks a lot
You take the head and use that as the first min and max and then fold over the tail.
maxMinFold :: Ord a => [a] -> (a, a)
maxMinFold (x:xs) = foldr (\x (tailMin, tailMax) -> (min x tailMin, max x tailMax)) (x,x) xs
As regards your answer, your fold function is not returning the right type.
Note that
foldr :: (a -> b -> b) -> b -> [a] -> b
In particular you need to be returning a b, which is a tuple in your case
Since you always have to traverse the whole list to find the minimum and the maximum here is the solution with foldl:
maxMinList :: Ord a => [a] -> (a,a)
maxMinList (x:xs) = foldl (\(l,h) y -> (min l y, max h y)) (x,x) xs
To do this efficiently with foldr,
data NEList a = NEList a [a]
-- deriving (Eq, Ord, Show, Read, Functor, Foldable, Traversable)
minMax :: Ord a => NEList a -> (a, a)
minMax (NEList x0 xs) = foldr go (,) xs x0 x0 where
  go x r mn mx
    | x < mn = r x mx
    | mx < x = r mn x
    | otherwise = r mn mx
Another, similar, approach:
minMaxM :: Ord a => [a] -> Maybe (a, a)
minMaxM xs = foldr go id xs Nothing where
  go x r Nothing = r (Just (x, x))
  go x r mnmx@(Just (mn, mx))
    | x < mn = r (Just (x, mx))
    | mx < x = r (Just (mn, x))
    | otherwise = r mnmx
It would be nice if the minMax function returned Nothing in the case of an empty list. Here is a version which does that.
import Control.Arrow
import Data.Maybe
import Data.Foldable
minMax :: (Ord a) => [a] -> Maybe (a,a)
minMax = foldl' (flip $ \ x -> Just . maybe (x,x) (min x *** max x)) Nothing
This uses foldl' instead of foldr.

Memoizing a function of type [Integer] -> a

How can I efficiently memoize an expensive function f :: [Integer] -> a that is defined for all finite lists of integers and has the property f . sort = f?
My typical use case is that given a list as of integers I need to obtain the values f (a:as) for various Integer a, so I'd like to build up simultaneously a directed labelled graph whose vertices are pairs of an Integer list and its function value. An edge labelled by a from (as, f as) to (bs, f bs) exists if and only if a:as = bs.
Stealing from a brilliant answer by Edward Kmett I simply copied
{-# LANGUAGE BangPatterns #-}
data Tree a = Tree (Tree a) a (Tree a)
instance Functor Tree where
  fmap f (Tree l m r) = Tree (fmap f l) (f m) (fmap f r)
index :: Tree a -> Integer -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q,0) -> index l q
  (q,1) -> index r q
nats :: Tree Integer
nats = go 0 1
  where go !n !s = Tree (go l s') n (go r s')
          where l = n + s
                r = l + s
                s' = s * 2
and adapted his idea to my problem as
-- directed graph labelled by Integers
data Graph a = Graph a (Tree (Graph a))
instance Functor Graph where
  fmap f (Graph a t) = Graph (f a) (fmap (fmap f) t)
-- walk the graph following the given labels
walk :: Graph a -> [Integer] -> a
walk (Graph a _) [] = a
walk (Graph _ t) (x:xs) = walk (index t x) xs
-- graph of all finite integer sequences
intSeq :: Graph [Integer]
intSeq = Graph [] (fmap (\n -> fmap (n:) intSeq) nats)
-- could be replaced by Data.Strict.Pair
data StrictPair a b = StrictPair !a !b
  deriving Show
-- f = sum modified according to Edward's idea (the real function is more complicated)
g :: ([Integer] -> StrictPair Integer [Integer]) -> [Integer] -> StrictPair Integer [Integer]
g mf [] = StrictPair 0 []
g mf (a:as) = StrictPair (a+x) (a:as)
  where StrictPair x y = mf as
g_graph :: Graph (StrictPair Integer [Integer])
g_graph = fmap (g g_m) intSeq
g_m :: [Integer] -> StrictPair Integer [Integer]
g_m = walk g_graph
This works OK, but as the function f is independent of the order of the occurring integers (but not of their counts) there should be only one vertex in the graph for all integer lists equal up to ordering.
How do I achieve this?
How about just defining g_m' = g_m . sort, i.e. you simply sort the input list first before calling your memoized function?
I have a feeling this is the best you can do since if you want your memoized graph to consist of only sorted paths someone is going to have to look at all of the elements of the list before constructing the path.
Depending on what your input lists look like it might be helpful to transform them in a way which makes the trees branch less. For instance, you might try sorting and taking differences:
original input list: [8,3,14,8,3]
sorted: [3,3,8,8,14]
diffed: [3,0,5,0,6] -- use this as the key
The transformation is a bijection, and the trees branch less because there are smaller numbers involved.
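The sort-and-diff key can be sketched in a few lines. toKey and fromKey are names invented for this example; the sketch reproduces the sorted and diffed lists shown above and assumes the first difference is taken against 0.

```haskell
import Data.List (sort)

-- Sort, then replace each element by its distance to the previous one
-- (the first element is measured from 0).
toKey :: [Integer] -> [Integer]
toKey xs = zipWith (-) sorted (0 : sorted)
  where sorted = sort xs

-- Running sums undo the differencing and give back the sorted list.
fromKey :: [Integer] -> [Integer]
fromKey = scanl1 (+)
```

For the sorted list [3,3,8,8,14], toKey yields [3,0,5,0,6] and fromKey maps that key back to [3,3,8,8,14].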
You can use a slightly different approach.
There is a trick from the proof that a finite product of countable sets is countable:
We can map the sequence [a1, ..., an] to Nat by product . zipWith (^) primes: 2 ^ a1 * 3 ^ a2 * 5 ^ a3 * ... * primen ^ an.
To avoid problems with sequences that end in zero, we can increment the last element.
As the sequence is ordered, we can exploit that property as user5402 mentioned.
The benefit of using the tree is that you can increase branching to speed up traversal. On the other hand, the prime trick can make the indexes quite big, but hopefully some tree paths will just stay unexplored (remain as thunks).
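A tiny worked example may help show why the last element has to be bumped. The sketch below uses a deliberately naive prime sieve and the hypothetical names encode, bumpLast and encodeFixed; it is only meant to show the trailing-zero collision, not to be efficient.

```haskell
-- Naive trial-division sieve; fine for a handful of primes.
primes :: [Integer]
primes = sieve [2 ..]
  where
    sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
    sieve []       = []

-- 2 ^ a1 * 3 ^ a2 * 5 ^ a3 * ...
encode :: [Integer] -> Integer
encode = product . zipWith (^) primes

-- Increment the last element so that a trailing 0 still contributes a factor.
bumpLast :: [Integer] -> [Integer]
bumpLast []       = []
bumpLast [x]      = [x + 1]
bumpLast (x : xs) = x : bumpLast xs

encodeFixed :: [Integer] -> Integer
encodeFixed = encode . bumpLast
```

Without the fix, encode [1,0,2] and encode [1,0,2,0] are both 50; with it, encodeFixed gives 250 and 350 respectively, so the two sequences get distinct indexes.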
{-# LANGUAGE BangPatterns #-}
import Data.Function (fix)
import Data.List (sort, tails)
import Test.QuickCheck
-- Modified from Kmett's answer:
data Tree a = Tree a (Tree a) (Tree a) (Tree a) (Tree a)
instance Functor Tree where
  fmap f (Tree x a b c d) = Tree (f x) (fmap f a) (fmap f b) (fmap f c) (fmap f d)
index :: Tree a -> Integer -> a
index (Tree x _ _ _ _) 0 = x
index (Tree _ a b c d) n = case (n - 1) `divMod` 4 of
  (q,0) -> index a q
  (q,1) -> index b q
  (q,2) -> index c q
  (q,3) -> index d q
nats :: Tree Integer
nats = go 0 1
  where
    go !n !s = Tree n (go a s') (go b s') (go c s') (go d s')
      where
        a = n + s
        b = a + s
        c = b + s
        d = c + s
        s' = s * 4
toList :: Tree a -> [a]
toList as = map (index as) [0..]
-- Primes -- https://www.haskell.org/haskellwiki/Prime_numbers
-- Generation and factorisation could be done much better
minus (x:xs) (y:ys) = case (compare x y) of
  LT -> x : minus xs (y:ys)
  EQ -> minus xs ys
  GT -> minus (x:xs) ys
minus xs _ = xs
primes = 2 : sieve [3..] primes
  where
    sieve xs (p:ps) | q <- p*p , (h,t) <- span (< q) xs =
      h ++ sieve (t `minus` [q, q+p..]) ps
addToLast :: [Integer] -> [Integer]
addToLast [] = []
addToLast [x] = [x + 1]
addToLast (x:xs) = x : addToLast xs
subFromLast :: [Integer] -> [Integer]
subFromLast [] = []
subFromLast [x] = [x - 1]
subFromLast (x:xs) = x : subFromLast xs
addSubProp :: [NonNegative Integer] -> Property
addSubProp xs = xs' === subFromLast (addToLast xs')
  where xs' = map getNonNegative xs
-- Trick from user5402 answer
toDiffList :: [Integer] -> [Integer]
toDiffList = toDiffList' 0
  where toDiffList' _ [] = []
        toDiffList' p (x:xs) = x - p : toDiffList' x xs
fromDiffList :: [Integer] -> [Integer]
fromDiffList = fromDiffList' 0
  where fromDiffList' _ [] = []
        fromDiffList' p (x:xs) = p + x : fromDiffList' (x + p) xs
diffProp :: [Integer] -> Property
diffProp xs = xs === fromDiffList (toDiffList xs)
listToInteger :: [Integer] -> Integer
listToInteger = product . zipWith (^) primes . addToLast
integerToList :: Integer -> [Integer]
integerToList = subFromLast . impl primes 0
  where impl _ _ 0 = []
        impl _ 0 1 = []
        impl _ k 1 = [k]
        impl (p:ps) k n = case n `divMod` p of
          (n', 0) -> impl (p:ps) (k + 1) n'
          (_, _) -> k : impl ps 0 n
listProp :: [NonNegative Integer] -> Property
listProp xs = xs' === integerToList (listToInteger xs')
  where xs' = map getNonNegative xs
toIndex :: [Integer] -> Integer
toIndex = listToInteger . toDiffList
fromIndex :: Integer -> [Integer]
fromIndex = fromDiffList . integerToList
-- [1,0] /= [0]
-- Decreasing sequence!
doesntHold :: [NonNegative Integer] -> Property
doesntHold xs = xs' === fromIndex (toIndex xs')
  where xs' = map getNonNegative xs
holds :: [NonNegative Integer] -> Property
holds xs = xs' === fromIndex (toIndex xs')
  where xs' = sort $ map getNonNegative xs
g :: ([Integer] -> Integer) -> [Integer] -> Integer
g mg = g' . sort
  where g' [] = 0
        g' (x:xs) = x + sum (map mg $ tails xs)
g_tree :: Tree Integer
g_tree = fmap (g faster_g' . fromIndex) nats
faster_g' :: [Integer] -> Integer
faster_g' = index g_tree . toIndex
faster_g = faster_g' . sort
On my machine fix g [1..22] feels slow, while faster_g [1..40] is still blazing fast.
Addition: if we have a bounded set (with indexes 0..n-1), we can encode a sequence over it as a0 * n^0 + a1 * n^1 + ....
We can encode any non-negative Integer as a binary list, e.g. 11 is [1, 1, 0, 1] (least significant bit first).
Then, if we separate the integers in the list with 2, we get a sequence of bounded values (the digits 0, 1 and 2).
As a bonus, we can take the sequence of 0, 1, 2 digits and compress it to binary using e.g. Huffman encoding, as 2 is much rarer than 0 or 1. But this might be overkill.
With this trick, the indexes stay much smaller and the space is probably better packed.
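Before the full implementation, the core of the encoding can be sketched without the newtype wrappers and QuickCheck properties. The names toBits, toDigits and toTernaryIndex are chosen for this illustration only; the digits are read as a base-3 number with the least significant digit first, as described above.

```haskell
-- Binary digits of a non-negative integer, least significant bit first.
toBits :: Integer -> [Integer]
toBits 0 = []
toBits n = r : toBits q
  where (q, r) = n `divMod` 2

-- Write every number in binary and terminate each one with the digit 2.
toDigits :: [Integer] -> [Integer]
toDigits = concatMap (\n -> toBits n ++ [2])

-- Interpret the digit sequence as d0 * 3^0 + d1 * 3^1 + ...
toTernaryIndex :: [Integer] -> Integer
toTernaryIndex = foldr (\d acc -> d + 3 * acc) 0 . toDigits
```

For instance, toBits 11 is [1,1,0,1], toDigits [3,1] is [1,1,2,1,2], and toTernaryIndex [3,1] is 211.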
{-# LANGUAGE BangPatterns #-}
-- From Kmett's answer:
import Data.Function (fix)
import Data.List (sort, tails)
import Data.List.Split (splitOn)
import Test.QuickCheck
{-- Tree definition as before --}
-- 0, 1, 2
newtype N3 = N3 { unN3 :: Integer }
  deriving (Eq, Show)
instance Arbitrary N3 where
  arbitrary = elements $ map N3 [ 0, 1, 2 ]
-- Integer <-> N3
coeffs3 :: [Integer]
coeffs3 = coeffs' 1
  where coeffs' n = n : coeffs' (n * 3)
listToInteger :: [N3] -> Integer
listToInteger = sum . zipWith f coeffs3
  where f n (N3 m) = n * m
listFromInteger :: Integer -> [N3]
listFromInteger 0 = []
listFromInteger n = case n `divMod` 3 of
  (q, m) -> N3 m : listFromInteger q
listProp :: [N3] -> Property
listProp xs = (null xs || last xs /= N3 0) ==> xs === listFromInteger (listToInteger xs)
-- Integer <-> N2
-- 0, 1
newtype N2 = N2 { unN2 :: Integer }
  deriving (Eq, Show)
coeffs2 :: [Integer]
coeffs2 = coeffs' 1
  where coeffs' n = n : coeffs' (n * 2)
integerToBin :: Integer -> [N2]
integerToBin 0 = []
integerToBin n = case n `divMod` 2 of
  (q, m) -> N2 m : integerToBin q
integerFromBin :: [N2] -> Integer
integerFromBin = sum . zipWith f coeffs2
  where f n (N2 m) = n * m
binProp :: NonNegative Integer -> Property
binProp (NonNegative n) = n === integerFromBin (integerToBin n)
-- unsafe!
n3ton2 :: N3 -> N2
n3ton2 = N2 . unN3
n2ton3 :: N2 -> N3
n2ton3 = N3 . unN2
-- [Integer] <-> [N3]
integerListToN3List :: [Integer] -> [N3]
integerListToN3List = concatMap (++ [N3 2]) . map (map n2ton3 . integerToBin)
integerListFromN3List :: [N3] -> [Integer]
integerListFromN3List = init . map (integerFromBin . map n3ton2) . splitOn [N3 2]
n3ListProp :: [NonNegative Integer] -> Property
n3ListProp xs = xs' === integerListFromN3List (integerListToN3List xs')
  where xs' = map getNonNegative xs
-- Trick from user5402 answer
-- Integer <-> Sorted Integer
toDiffList :: [Integer] -> [Integer]
toDiffList = toDiffList' 0
  where toDiffList' _ [] = []
        toDiffList' p (x:xs) = x - p : toDiffList' x xs
fromDiffList :: [Integer] -> [Integer]
fromDiffList = fromDiffList' 0
  where fromDiffList' _ [] = []
        fromDiffList' p (x:xs) = p + x : fromDiffList' (x + p) xs
diffProp :: [Integer] -> Property
diffProp xs = xs === fromDiffList (toDiffList xs)
---
toIndex :: [Integer] -> Integer
toIndex = listToInteger . integerListToN3List . toDiffList
fromIndex :: Integer -> [Integer]
fromIndex = fromDiffList . integerListFromN3List . listFromInteger
-- [1,0] /= [0]
-- Decreasing sequence! doesn't terminate in this case
doesntHold :: [NonNegative Integer] -> Property
doesntHold xs = xs' === fromIndex (toIndex xs')
  where xs' = map getNonNegative xs
holds :: [NonNegative Integer] -> Property
holds xs = xs' === fromIndex (toIndex xs')
  where xs' = sort $ map getNonNegative xs
g :: ([Integer] -> Integer) -> [Integer] -> Integer
g mg = g' . sort
  where g' [] = 0
        g' (x:xs) = x + sum (map mg $ tails xs)
g_tree :: Tree Integer
g_tree = fmap (g faster_g' . fromIndex) nats
faster_g' :: [Integer] -> Integer
faster_g' = index g_tree . toIndex
faster_g = faster_g' . sort
Second addition:
I quickly benchmarked graph and binary sequence approach for my g with:
-- requires: import System.Environment (getArgs)
main :: IO ()
main = do
  n <- read . head <$> getArgs
  print $ faster_g [100, 110..n]
And the results are:
% time ./IntegerMemo 1000
1225560638892526472150132981770
./IntegerMemo 1000 0.19s user 0.01s system 98% cpu 0.200 total
% time ./IntegerMemo 2000
3122858113354873680008305238045814042010921833620857170165770
./IntegerMemo 2000 1.83s user 0.05s system 99% cpu 1.888 total
% time ./IntegerMemo 2500
4399449191298176980662410776849867104410434903220291205722799441218623242250
./IntegerMemo 2500 3.74s user 0.09s system 99% cpu 3.852 total
% time ./IntegerMemo 3000
5947985907461048240178371687835977247601455563536278700587949163642187584269899171375349770
./IntegerMemo 3000 6.66s user 0.13s system 99% cpu 6.830 total
% time ./IntegerMemoGrap 1000
1225560638892526472150132981770
./IntegerMemoGrap 1000 0.10s user 0.01s system 97% cpu 0.113 total
% time ./IntegerMemoGrap 2000
3122858113354873680008305238045814042010921833620857170165770
./IntegerMemoGrap 2000 0.97s user 0.04s system 98% cpu 1.028 total
% time ./IntegerMemoGrap 2500
4399449191298176980662410776849867104410434903220291205722799441218623242250
./IntegerMemoGrap 2500 2.11s user 0.08s system 99% cpu 2.202 total
% time ./IntegerMemoGrap 3000
5947985907461048240178371687835977247601455563536278700587949163642187584269899171375349770
./IntegerMemoGrap 3000 3.33s user 0.09s system 99% cpu 3.452 total
It looks like the graph version is faster by a constant factor of about 2, but both seem to have the same time complexity :)
Looks like my problem is solved by simply replacing intSeq in the definition of g_graph by a monotone version:
-- replace vertexes for non-monotone integer lists by the according monotone one
monoIntSeq :: Graph [Integer]
monoIntSeq = f intSeq
  where f (Graph as t) | as == sort as = Graph as $ fmap f t
                       | otherwise = fetch monoIntSeq $ sort as
-- extract the subgraph after following the given labels
fetch :: Graph a -> [Integer] -> Graph a
fetch g [] = g
fetch (Graph _ t) (x:xs) = fetch (index t x) xs
g_graph :: Graph (StrictPair Integer [Integer])
g_graph = fmap (g g_m) monoIntSeq
Many thanks to all (especially user5402 and Oleg) for the help!
Edit: I still have the problem that the memory consumption is too high for my typical use case, which can be described by following a path like this:
p :: [Integer]
p = map f [1..]
  where f n | n `mod` 6 == 0 = n `div` 6
            | n `mod` 3 == 0 = n `div` 3
            | n `mod` 2 == 0 = n `div` 2
            | otherwise = n
A slight improvement is to define the monotone integer sequences directly like this:
-- requires: import Data.List (foldl', insertBy) and import Data.Ord (comparing, Down)
-- extract the subgraph after following the given labels (right to left)
fetch :: Graph a -> [Integer] -> Graph a
fetch = foldl' step
  where step (Graph _ t) n = index t n
-- walk the graph following the given labels (right to left)
walk :: Graph a -> [Integer] -> a
walk g ns = a
  where Graph a _ = fetch g ns
-- all monotone falling integer sequences
monoIntSeqs :: Graph [Integer]
monoIntSeqs = Graph [] $ fmap (flip f monoIntSeqs) nats
  where f n (Graph ns t) | null ns = Graph (n:ns) $ fmap (f n) t
                         | n >= head ns = Graph (n:ns) $ fmap (f n) t
                         | otherwise = fetch monoIntSeqs (insert' n ns)
        insert' = insertBy (comparing Down)
But at the end I might just use the original integer sequences without identification, identify nodes now and then explicitly and avoid keeping a reference to g_graph etc to let the garbage collection clean up as the program proceeds.
After reading the functional pearl Trouble Shared is Trouble Halved by Richard Bird and Ralf Hinze, I understood how to implement what I was looking for two years ago (again based on Edward Kmett's trick):
{-# LANGUAGE BangPatterns #-}
import Data.Function (fix)
data Tree a = Tree (Tree a) a (Tree a)
  deriving Show
instance Functor Tree where
  fmap f (Tree l m r) = Tree (fmap f l) (f m) (fmap f r)
index :: Tree a -> Integer -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
  (q,0) -> index l q
  (q,1) -> index r q
nats :: Tree Integer
nats = go 0 1
  where go !n !s = Tree (go l s') n (go r s')
          where l = n + s
                r = l + s
                s' = s * 2
data IntSeqTree a = IntSeqTree a (Tree (IntSeqTree a))
val :: IntSeqTree a -> a
val (IntSeqTree a _) = a
step :: Integer -> IntSeqTree t -> IntSeqTree t
step n (IntSeqTree _ ts) = index ts n
intSeqTree :: IntSeqTree [Integer]
intSeqTree = fix $ create []
  where create p x = IntSeqTree p $ fmap (extend x) nats
        extend x n = case span (>n) (val x) of
          ([], p) -> fix $ create (n:p)
          (m, p) -> foldr step intSeqTree (m ++ n:p)
instance Functor IntSeqTree where
  fmap f (IntSeqTree a t) = IntSeqTree (f a) (fmap (fmap f) t)
In my use case I have hundreds or thousands of similar integer sequences (each a few hundred entries long) that are generated incrementally. So for me this way is cheaper than sorting the sequences before looking up the function value (which I will access by using fmap on intSeqTree).
