Stack overflow when constructing/evaluating a red black tree in Haskell

I have the following Red Black tree:
data Tree a
  = E
  | S a
  | C !Color !(Tree a) !(Tree a)
data Color = R | B
In this tree, all the data is stored in the leaves (the S constructor). I have written an insert function along the lines of Okasaki's standard red-black trees [1], modifying the parts that assume values are stored in the internal nodes.
In this case I populate the tree with 10 million elements:
l = go 10000000 E
  where
    go 0 t = insert 0 t
    go n t = insert t $ go (n - 1) t
When I try to evaluate the leftmost element (leaf) of the tree like this:
left :: Tree a -> Maybe a
left E = Nothing
left (S x) = Just x
left (C _ l _) = left l
I encounter the following:
left l
*** Exception: stack overflow
Is this due to the way I am constructing the tree (not tail recursive), or is there a space leak that I cannot see?
Please note that the function works fine for a million elements. I also tried a tail-recursive way of constructing the tree:
l = go 10000000 E
  where
    go 0 t = insert 0 t
    go n t = go (n - 1) (insert n t)
but encountered the same stack overflow exception.
[1] https://www.cs.tufts.edu/~nr/cs257/archive/chris-okasaki/redblack99.pdf
EDIT
The insert and balance function for completeness:
insert :: Ord a => a -> Tree a -> Tree a
insert x xs = makeBlack $ ins xs
  where
    ins E = S x
    ins (S a) = C R (S x) (S a)
    ins (C c l r) = balance c (ins l) r -- always traverse left and trust the balancing
    makeBlack (C _ l r) = C B l r
    makeBlack a = a
balance :: Color -> Tree a -> Tree a -> Tree a
balance B (C R (C R a b) c) d = C R (C B a b) (C B c d)
balance B (C R a (C R b c)) d = C R (C B a b) (C B c d)
balance B a (C R (C R b c) d) = C R (C B a b) (C B c d)
balance B a (C R b (C R c d)) = C R (C B a b) (C B c d)
balance color a b = C color a b
I made a typo while typing in the insertion code above: it should be insert n $ go (n - 1) t, not insert t $ go (n - 1) t. The code that actually produced the stack overflow was correct, and the overflow happened in ghci.

The first example of insertion code has a bug: it tries to insert the tree itself as an element.
The second version
l = go 10000000 L.empty where
  go 0 t = L.cons 0 t
  go n t = go (n - 1) (L.cons n t)
is indeed tail recursive, but it still has a problem: at no step does it "force" the tree while it is being constructed. Due to Haskell's laziness, go will return a thunk that hides 10000000 pending applications of L.cons.
When the runtime tries to "pop" that thunk, it will put each n variable on the stack while the thunk below is being "popped" in its turn, causing the stack overflow. "Function calls don't add stack frames in Haskell; instead, stack frames come from nesting thunks."
The solution is to force each intermediate tree to WHNF, so that thunks don't accumulate. This should be enough (using the BangPatterns extension):
l :: Tree Int
l = go 10000000 L.empty
  where
    go 0 !t = L.cons 0 t
    go n !t = go (n - 1) (L.cons n t)
This basically means: "before recursing to add another element, make sure the accumulator is in WHNF". The n need not be forced because it is scrutinized in the pattern-match.
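The same idea can be written with foldl' from Data.List, which forces the accumulator to WHNF at every step. The following is only a sketch under assumptions: it reuses the question's Tree, insert and balance (with the unused Ord constraint dropped), and build is a hypothetical helper name that does not appear in the original posts.

```haskell
import Data.List (foldl')

data Color = R | B

-- Leaf-valued tree from the question; the strict fields mean that
-- forcing a C node to WHNF also forces its children.
data Tree a
  = E
  | S a
  | C !Color !(Tree a) !(Tree a)

insert :: a -> Tree a -> Tree a
insert x xs = makeBlack (ins xs)
  where
    ins E = S x
    ins (S a) = C R (S x) (S a)
    ins (C c l r) = balance c (ins l) r -- always descend left, as in the question
    makeBlack (C _ l r) = C B l r
    makeBlack a = a

balance :: Color -> Tree a -> Tree a -> Tree a
balance B (C R (C R a b) c) d = C R (C B a b) (C B c d)
balance B (C R a (C R b c)) d = C R (C B a b) (C B c d)
balance B a (C R (C R b c) d) = C R (C B a b) (C B c d)
balance B a (C R b (C R c d)) = C R (C B a b) (C B c d)
balance color a b = C color a b

left :: Tree a -> Maybe a
left E = Nothing
left (S x) = Just x
left (C _ l _) = left l

-- Hypothetical helper: inserts 0, 1, ..., n in that order.
-- foldl' evaluates each intermediate tree before the next insert,
-- so no chain of pending insert thunks builds up.
build :: Int -> Tree Int
build n = foldl' (flip insert) E [0 .. n]
```

Because every insertion descends to the left, the most recently inserted element ends up as the leftmost leaf, so left (build n) should be Just n.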

Related

Update values in Haskell?

I'm trying to write a function that constantly takes in integer values and outputs the maximum value given so far.
This seems as easy as a simple
max a b = maximum [a,b]
But the function should "remember" what values were given to it in the past, and compare new values with those as well.
For example: the initial value I want to start with is -500. If I call my function with an argument of 5, then it should return 5, because 5 is greater than -500.
If I then call the function again, this time with an argument of 3, it should still return 5, because even though 3 is greater than -500, it is less than 5.
Is this even possible with side-effect free programming?
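One side-effect-free reading of this, offered only as a sketch: treat the successive inputs as a list and let a scan carry the maximum so far from one element to the next. The name runningMax is an assumption made for this illustration.

```haskell
-- Running maximum over a sequence of inputs, starting from a seed value.
-- scanl threads the "remembered" maximum through the list; drop 1
-- discards the seed itself so there is one output per input.
runningMax :: Int -> [Int] -> [Int]
runningMax start = drop 1 . scanl max start
```

With the values from the question, runningMax (-500) [5, 3] gives [5, 5]. The "memory" lives in the accumulator of the scan rather than in a mutable variable.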

Here's a direct translation of the alpha-beta pseudocode. I haven't tested it, but I hope it gets the general method of translating the algorithm across, even if it's not entirely correct.
-- Like takeWhile, but also keeps the first element for which the
-- stop condition holds, so the value computed at the cut is not lost.
takeUntil :: (a -> Bool) -> [a] -> [a]
takeUntil stop xs = case break stop xs of
  (ys, z : _) -> ys ++ [z]
  (ys, []) -> ys
alphaBeta ::
     (state -> Int) -- heuristic
  -> (state -> Bool -> [state]) -- next states
  -> state
  -> Int -- depth
  -> Int -- alpha
  -> Int -- beta
  -> Bool -- is max player
  -> Int -- score
alphaBeta heu children = go where
  go s d a b p | d == 0 || null ss = heu s
               | otherwise = score
    where
      ss = children s p
      score = fst $ last $ case p of
        True ->
          takeUntil ((>= b) . snd) $ scanl step (minBound, a) ss where
            step (v, a) s = (v', max a v') where
              v' = max v (go s (d - 1) a b (not p))
        False ->
          takeUntil ((<= a) . snd) $ scanl step (maxBound, b) ss where
            step (v, b) s = (v', min b v') where
              v' = min v (go s (d - 1) a b (not p))
We use scanl to process each child state of the current state while remembering partial results. We use takeUntil, a variant of takeWhile that also keeps the element at which the cut condition first holds, to do the cut, and last to read off the final partial result. Since lists are lazy, we don't process children after the cut. Also, list fusion or garbage collection (depending on optimization levels and GHC version) should ensure that scanl uses constant space here. This is a fine example of laziness allowing us to define algorithms compositionally.
Alternatively, we could do the cutting and processing simultaneously, thereby sticking closer to the original presentation. foldM with Either lets us process the children one by one and decide at each step whether to stop and return a value with Left or continue with Right. So we use Left as an analogue of the break statement in the imperative pseudocode.
import Control.Monad
alphaBeta ::
     (state -> Int) -- heuristic
  -> (state -> Bool -> [state]) -- next states
  -> state
  -> Int -- depth
  -> Int -- alpha
  -> Int -- beta
  -> Bool -- is max player
  -> Int -- score
alphaBeta heu children = go where
  go s d a b p | d == 0 || null ss = heu s
               | otherwise = score
    where
      ss = children s p
      score = either id fst $ case p of
        True ->
          foldM step (minBound, a) ss where
            step (v, a) s | b <= a' = Left v'
                          | otherwise = Right (v', a')
              where v' = max v (go s (d - 1) a b (not p))
                    a' = max a v'
        False ->
          foldM step (maxBound, b) ss where
            step (v, b) s | b' <= a = Left v'
                          | otherwise = Right (v', b')
              where v' = min v (go s (d - 1) a b (not p))
                    b' = min b v'
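The early-exit behaviour of foldM in the Either monad is easier to see in isolation. The following minimal sketch is not part of the alpha-beta code; sumUntil is a hypothetical function invented for the illustration.

```haskell
import Control.Monad (foldM)

-- Adds up the inputs, stopping as soon as the running total reaches
-- the limit. Left plays the role of "break": once a step returns Left,
-- the remaining elements are never inspected.
sumUntil :: Int -> [Int] -> Either Int Int
sumUntil limit = foldM step 0
  where
    step acc x
      | acc' >= limit = Left acc'   -- stop here and report the total
      | otherwise     = Right acc'  -- keep folding
      where
        acc' = acc + x
```

For example, sumUntil 10 [1 ..] evaluates to Left 10 even though the input list is infinite, while sumUntil 100 [1, 2, 3] runs to the end and gives Right 6.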

Haskell - Problems in the Mandelbrot drawer [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
So, as a beginner, I thought I would work on a horribly, awfully terrible version of the Mandelbrot set project. In this pitiful case, the set is drawn with text (Shriek!) into a text file. Because I wanted some practice with numerical coding, I have designed the worst complex number system in existence. I can't spot the problem in the code: it draws a band instead of a Mandelbrot set. Here it is (don't look at it too long or you could die from overexposure to noob-ioactivity):
-- complex numbers, test for mandelbrot set
----------------- Complex Numbers
data C = Complex Float Float -- a + bi
  deriving Show
data Mandelbrot = Possible -- if thought to be in mandelbrot set
                | Not Integer -- this Integer is iterations before |z| > 2 in z=z^2+c
  deriving Show
complexReal :: C -> Float
complexReal (Complex n _) = n
complexImaginary :: C -> Float
complexImaginary (Complex _ n) = n
modulus :: C -> Float
modulus (Complex n m) = sqrt ((n^2) + (m^2))
argument :: C -> Float --returns in radians
argument (Complex m n) | n < 0 && m < 0 = pi + (argument (Complex (0-m) (0-n)))
                       | m < 0 = (pi / 2) + (argument (Complex (0-m) n))
                       | n < 0 = ((3 * pi) / 2) + (argument (Complex m (0-n)))
                       | otherwise = atan (n / m)
multComplex :: C -> C -> C
multComplex (Complex m n) (Complex x y) = Complex ((m*x)-(n*y)) ((m*y)+(n*x))
addComplex :: C -> C -> C
addComplex (Complex m n) (Complex x y) = Complex (m + x) (m + y)
----------------- End Complex numbers
----------------- Mandelbrot
inMandelbrot :: C -> Mandelbrot
inMandelbrot c = inMandelbrotTest (Complex 0 0) c 0
--(z, c, iterations) with z=z^2+c, c is plotted on set map if z is bound
inMandelbrotTest :: C -> C -> Integer -> Mandelbrot
inMandelbrotTest z c i | (modulus z) > 2 = Not i -- too large
                       | i > 100 = Possible -- upper limit iterations
                       | otherwise = inMandelbrotTest (addComplex (multComplex z z) c) c (i+1)
possiblyInMandelbrot :: Mandelbrot -> Bool
possiblyInMandelbrot Possible = True
possiblyInMandelbrot _ = False
mandelbrotLine :: [C] -> String
mandelbrotLine [] = "\n"
mandelbrotLine (n:x) | possiblyInMandelbrot (inMandelbrot n) = "#" ++ mandelbrotLine x
mandelbrotLine (_:x) = " " ++ mandelbrotLine x
mandelbrotFeild :: [[C]] -> String
mandelbrotFeild [[]] = ""
mandelbrotFeild (n:x) = (mandelbrotLine n) ++ (mandelbrotFeild x)
-----------------End Mandelbrot
---------------- textual output
feildLine :: Float -> Float -> Float -> Float -> [C] -- start R, end R, i, increment x
feildLine s e i x | s > e = []
                  | otherwise = [(Complex s i)] ++ feildLine (s+x) e i x
feildGenerate :: Float -> Float -> Float -> Float -> Float -> [[C]] -- start R, end R, start i, end i, increment x
feildGenerate sr er si ei x | si > ei = [[]]
                            | otherwise = [(feildLine sr er si x)] ++ (feildGenerate sr er (si+x) ei x)
l1 :: String
l1 = mandelbrotFeild (feildGenerate (-3) 3 (-3) 3 0.05)
---------------- End textual output
main = do
  writeFile "./mandelbrot.txt" (l1)
As you can see (or can't if you didn't look) there are some unused functions for my complex numbers. Is there hope, doctor?
Summary:
Why does this draw a band instead of the mandelbrot set?

Found your bug:
addComplex :: C -> C -> C
addComplex (Complex m n) (Complex x y) = Complex (m + x) (m + y)
It's really that simple. You have a trivial typo: the imaginary part should be n + y, not m + y.
Some other suggestions:
Use Double rather than Float. For this example, it seems to give visibly more accurate results.
[x] ++ y is the same thing as x : y.
It is traditional to write x : xs rather than x : y. This makes it clear that one is a list element, the other is a list.
You can import Data.Complex to get complex-number arithmetic - but of course, you are writing this code for the purpose of learning, so that's fine.
If you define instance Num C where..., then you would be able to write z*z + c rather than addComplex (multComplex z z) c. It's prettier to read - if you know how to write instances yet...
You can't spell "field". ;-)
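As a rough sketch of the Num instance suggestion, here is one way it could look, using Double as recommended above. The escapeTime helper and its iteration limit parameter are assumptions made for this illustration, not code from the question.

```haskell
data C = Complex Double Double -- a + bi
  deriving (Show, Eq)

modulus :: C -> Double
modulus (Complex a b) = sqrt (a * a + b * b)

instance Num C where
  Complex a b + Complex x y = Complex (a + x) (b + y) -- note b + y, not a + y
  Complex a b * Complex x y = Complex (a * x - b * y) (a * y + b * x)
  negate (Complex a b) = Complex (negate a) (negate b)
  fromInteger n = Complex (fromInteger n) 0
  abs z = Complex (modulus z) 0
  signum z@(Complex a b)
    | m == 0 = Complex 0 0
    | otherwise = Complex (a / m) (b / m)
    where
      m = modulus z

-- Hypothetical helper: number of iterations of z = z*z + c before
-- |z| > 2, or Nothing if the limit is reached first.
escapeTime :: Int -> C -> Maybe Int
escapeTime limit c = go 0 0
  where
    go i z
      | modulus z > 2 = Just i
      | i >= limit = Nothing
      | otherwise = go (i + 1) (z * z + c)
```

With the instance in place, the iteration step reads as z * z + c, and the literal 0 can stand for the complex zero through fromInteger.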

Please correct my use of the Maybe Monad

I'm implementing a little program that does exponentiation ciphers. Some of the computations might fail, for instance, computing a modular inverse. I've used Maybe to deal with these sorts of failures. But now I'm stuck, as I need to "inject" the value inside of a maybe into another partially applied function. I know that if I had a function that took one argument, I'd use bind to do this.
import Data.Char
import Math.NumberTheory.Powers
extendedGcd::Integer->Integer->(Integer, Integer)
extendedGcd a b | r == 0 = (0, 1)
                | otherwise = (y, x - (y * d))
  where
    (d, r) = a `divMod` b
    (x, y) = extendedGcd b r
modularInverse::Integer->Integer->Maybe Integer
modularInverse n b | relativelyPrime n b = Just . fst $ extGcd n b
                   | otherwise = Nothing
  where
    extGcd = extendedGcd
relativelyPrime::Integer->Integer->Bool
relativelyPrime m n | gcd m n == 1 = True
                    | otherwise = False
textToDigits::String->[Integer]
textToDigits p = map (\x->toInteger (ord x - 97)) p
digitsToText::[Integer]->String
digitsToText d = map (\x->chr ((fromIntegral x) + 97)) d
exptEncipher::Integer->Integer->Integer->Maybe Integer
exptEncipher m k p | relativelyPrime k (p - 1) = Just $ powerMod p k m
                   | otherwise = Nothing
exptDecipher::Integer->Integer->Integer->Integer
exptDecipher m q c = powerMod c q m
exptEncipherString::Integer->Integer->String->[Maybe Integer]
exptEncipherString m k p = map (exptEncipher m k) plaintext
  where
    plaintext = textToDigits p
exptDecipherString::Integer->Integer->[Maybe Integer]->Maybe String
exptDecipherString m k c = (fmap digitsToText) plaintext
  where
    q = modularInverse k (m - 1)
    plaintext = map (fmap $ exptDecipher m q) c
Specifically, my problem is in the function exptDecipherString, where I needed to inject the value encapsulated by the monad in q into the function exptDecipher, which I will then lift to work on c. What's the right way to do this? Also, I'm worried that I'll end up with a list of [Maybe Char] instead of the Maybe String that I want. I'm having problems reasoning through all of this. Can someone enlighten me?

You can use sequence and ap to get the types to work out. First for their signatures:
ap :: Monad m => m (a -> b) -> m a -> m b
sequence :: Monad m => [m a] -> m [a]
Notice that sequence directly addresses your worry about having a [Maybe Char] instead of a Maybe String. Both are in Control.Monad (note that you'll have to import ap). We can use them as follows:
exptDecipherString :: Integer -> Integer -> [Maybe Integer] -> Maybe String
exptDecipherString m k c = fmap digitsToText plaintext
  where
    q = modularInverse k (m - 1)
    plaintext = sequence $ map (ap $ fmap (exptDecipher m) q) c
We can get to this point by working through the types. First we apply exptDecipher to m, which gives us a function of type Integer -> Integer -> Integer. We want to apply this to q, but it's a Maybe Integer, so we have to use fmap (exptDecipher m) q, which then has type Maybe (Integer -> Integer). We can then pop ap on the front and get something of type Maybe Integer -> Maybe Integer. We then map this over c, which gives us a [Maybe Integer], which we can turn inside out using sequence.
This might not work—if there are bugs in the logic, etc.—but at least it compiles.
A couple of side notes: you can use the infix operators <$> and <*> from Control.Applicative in place of fmap and ap, respectively, for slightly nicer syntax, and your relativelyPrime can be written much more simply as relativelyPrime m n = gcd m n == 1.
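To see the same plumbing without the cipher, here is a small self-contained sketch with addition standing in for exptDecipher m. The names applyAll and applyAll' are made up for this example.

```haskell
import Control.Monad (ap)

-- Same shape as the answer's plaintext definition: a Maybe-wrapped
-- argument is injected into a two-argument function, the result is
-- mapped over a list of Maybe values, and sequence turns the
-- [Maybe Int] into a Maybe [Int].
applyAll :: Maybe Int -> [Maybe Int] -> Maybe [Int]
applyAll mq cs = sequence $ map (ap $ fmap (+) mq) cs

-- The same thing with the applicative operators and traverse.
applyAll' :: Maybe Int -> [Maybe Int] -> Maybe [Int]
applyAll' mq = traverse (\c -> (+) <$> mq <*> c)
```

If mq or any non-missing position in the list is Nothing, the whole result is Nothing, which is the behaviour the question wanted for Maybe String.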

How can I check if a BST is valid?

How can I check if a BST is a valid one, given its definition and using a generalized version of fold for BST?
data (Ord a, Show a, Read a) => BST a = Void | Node {
    val :: a,
    left, right :: BST a
  } deriving (Eq, Ord, Read, Show)
fold :: (Read a, Show a, Ord a) => (a -> b -> b -> b) -> b -> BST a -> b
fold _ z Void = z
fold f z (Node x l r) = f x (fold f z l) (fold f z r)
The idea is to check that a node's value is greater than all values in its left subtree and smaller than all values in its right subtree. This must be True for all nodes in the tree. A function bstList simply outputs the list of (ordered) values in the BST.
Of course something like this won't work:
--isBST :: (Read a, Show a, Ord a) => BST a -> Bool
isBST t = fold (\x l r -> all (<x) (bstList l) && all (>x) (bstList r)) (True) t
because, for example, applying the fold function to the node 19 ends up as all (<19) (bstList True) && all (>19) (bstList True).

Your problem seems to be that you lose information because your function only returns a boolean when it examines the left and right subtrees. So change it to also return the minimum and maximum values of the subtrees. (This is probably more efficient as well, since you no longer need to use bstList to check all elements.)
And make a wrapper function to ignore these "auxiliary" values after you are done, of course.
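A minimal sketch of that suggestion follows, assuming a simplified BST without record fields or class constraints. The Summary type and the summarize helper are names invented here; the fold has the same shape as the one in the question.

```haskell
import Data.Maybe (isJust)

data BST a = Void | Node a (BST a) (BST a)

fold :: (a -> b -> b -> b) -> b -> BST a -> b
fold _ z Void = z
fold f z (Node x l r) = f x (fold f z l) (fold f z r)

-- Nothing              : the subtree is not a valid BST
-- Just Nothing         : valid and empty
-- Just (Just (lo, hi)) : valid, with minimum lo and maximum hi
type Summary a = Maybe (Maybe (a, a))

summarize :: Ord a => BST a -> Summary a
summarize = fold step (Just Nothing)
  where
    step x ml mr = do
      l <- ml -- fails immediately if the left subtree is invalid
      r <- mr -- likewise for the right subtree
      let leftOk = maybe True (\(_, hi) -> hi < x) l
          rightOk = maybe True (\(lo, _) -> x < lo) r
      if leftOk && rightOk
        then Just (Just (maybe x fst l, maybe x snd r))
        else Nothing

-- Wrapper that discards the auxiliary bounds.
isBST :: Ord a => BST a -> Bool
isBST = isJust . summarize
```

Each node only compares its value with the maximum of the left subtree and the minimum of the right subtree, so no subtree is flattened to a list.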

(Please don't put typeclass constraints on the data type.)
A BST is valid iff an in-order traversal is monotonically increasing.
flatten tree = fold (\a l r -> l . (a:) . r) id tree []
ordered list@(_:rest) = and $ zipWith (<) list rest
ordered _ = True
isBST = ordered . flatten

A nice way of encoding this is to lean on the traversal provided by Data.Foldable.
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
import Data.Foldable
import Data.Monoid
We can derive an instance of it automatically using an extension, but we need to reorder the fields of the Node constructor to provide us an in-order traversal.
While we're at it, we should eliminate the constraints on the data type itself. They provide no benefit, and GHC has deprecated the DatatypeContexts extension that enables them. (When you want to use such constraints you should put them on instances of classes, not on the data type.)
data BST a
  = Void
  | Node
    { left :: BST a
    , val :: a
    , right :: BST a
    } deriving (Eq, Ord, Read, Show, Foldable)
First we define what it means for a list to be strictly sorted.
sorted :: Ord a => [a] -> Bool
sorted [] = True
sorted [x] = True
sorted (x:xs) = x < head xs && sorted xs
-- head is safe because of the preceding match.
Then we can use the toList method provided by Data.Foldable and the above helper.
isBST :: Ord a => BST a -> Bool
isBST = sorted . toList
We can also implement this more directly, like you asked. Since we removed the spurious constraints on the data type, we can simplify the definition of your fold.
cata :: (b -> a -> b -> b) -> b -> BST a -> b
cata _ z Void = z
cata f z (Node l x r) = f (cata f z l) x (cata f z r)
Now we need a data type to model the result of our catamorphism, which is that we either have no nodes (Z), or a range of strictly increasing nodes (T) or have failed (X)
data T a = Z | T a a | X deriving Eq
And we can then implement isBST directly
isBST' :: Ord a => BST a -> Bool
isBST' b = cata phi Z b /= X where
  phi X _ _ = X
  phi _ _ X = X
  phi Z a Z = T a a
  phi Z a (T b c) = if a < b then T a c else X
  phi (T a b) c Z = if b < c then T a c else X
  phi (T a b) c (T d e) = if b < c && c < d then T a e else X
This is a bit tedious, so perhaps it would be better to decompose the way we compose the interim states a bit:
cons :: Ord a => a -> T a -> T a
cons _ X = X
cons a Z = T a a
cons a (T b c) = if a < b then T a c else X
-- GHC 8.4 and later require a Semigroup instance alongside Monoid,
-- so the combining operation is defined as (<>).
instance Ord a => Semigroup (T a) where
  Z <> a = a
  a <> Z = a
  X <> _ = X
  _ <> X = X
  T a b <> T c d = if b < c then T a d else X
instance Ord a => Monoid (T a) where
  mempty = Z
isBST'' :: Ord a => BST a -> Bool
isBST'' b = cata phi Z b /= X where
  phi l a r = l `mappend` cons a r
Personally, I'd probably just use the Foldable instance.

If you don't insist on using a fold you can do it like this:
ord Void = True
ord (Node v l r) = every (< v) l && every (> v) r && ord l && ord r where
  every p Void = True
  every p (Node v l r) = p v && every p l && every p r

What Self Balancing Tree is simplest in Functional Programming?

I'm designing a self-balancing tree in Haskell, as an exercise and because it is nice to have on hand.
Previously in C and Python I preferred Treaps and Splay Trees due to their simple balancing rules. I always disliked R/B Trees, since they seemed like more work than they were worth.
Now, due to the functional nature of Haskell, things seem to have changed. I can write a R/B insert function in 10 lines of code. Treaps on the other hand requires wrapping to store the random number generator, and Splay Trees are a pain to do top-down.
So I'm asking if you have experience with other types of trees?
Which ones are better at utilizing the pattern matching and top-down nature of functional languages?

OK, I guess there weren't a lot of references or research for answering this question. Instead I've taken the time to try your different ideas and trees. I didn't find anything much better than RB trees, but perhaps that's just search bias.
The RB tree can be (insertion) balanced with four simple rules, as shown by Chris Okasaki:
balance B (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance color a x b = T color a x b
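For context, here is a compact, self-contained sketch of how these rules slot into an insertion function in the style of Okasaki's red-black trees. The toList and height helpers are additions for checking the result and are not part of the answer.

```haskell
data Color = R | B deriving (Show, Eq)

data Tree a = E | T Color (Tree a) a (Tree a) deriving Show

insert :: Ord a => a -> Tree a -> Tree a
insert x s = makeBlack (ins s)
  where
    ins E = T R E x E -- new nodes start out red
    ins t@(T c a y b)
      | x < y = balance c (ins a) y b
      | x > y = balance c a y (ins b)
      | otherwise = t
    makeBlack (T _ a y b) = T B a y b -- the root is always black
    makeBlack E = E

-- The four rewrite rules: a black node with a red child that itself
-- has a red child becomes a red node with two black children.
balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance B (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance c a x b = T c a x b

-- In-order traversal, for checking that the contents stay sorted.
toList :: Tree a -> [a]
toList E = []
toList (T _ a x b) = toList a ++ [x] ++ toList b

-- Number of nodes on the longest root-to-leaf path.
height :: Tree a -> Int
height E = 0
height (T _ a _ b) = 1 + max (height a) (height b)
```

Inserting already sorted input, the worst case for an unbalanced tree, should still give a tree whose height is logarithmic in the number of elements.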
AVL trees can be balanced in a similar pattern matching way. However the rules don't compress as well:
balance T (T (T a x b dx) y c (-1)) z d (-2) = T (T a x b dx) y (T c z d 0) 0
balance T a x (T b y (T c z d dz) 1 ) 2 = T (T a x b 0) y (T c z d dz) 0
balance T (T a x (T b y c 1 ) 1 ) z d (-2) = T (T a x b -1) y (T c z d 0) 0
balance T (T a x (T b y c (-1)) 1 ) z d (-2) = T (T a x b 0) y (T c z d 1) 0
balance T (T a x (T b y c _ ) 1 ) z d (-2) = T (T a x b 0) y (T c z d 0) 0
balance T a x (T (T b y c 1 ) z d (-1)) 2 = T (T a x b -1) y (T c z d 0) 0
balance T a x (T (T b y c (-1)) z d (-1)) 2 = T (T a x b 0) y (T c z d 1) 0
balance T a x (T (T b y c _ ) z d (-1)) 2 = T (T a x b 0) y (T c z d 0) 0
balance t = t
As AVL trees seem to be generally considered inferior to RB trees, they are probably not worth the extra hassle.
AA trees could theoretically be balanced nice and easily by:
balance T n (T n a x b) y c = T n a x (T n b y c) -- skew
balance T n a x (T n b y (T n c z d)) = T (n+1) (T n a x b) y (T n c z d) --split
balance T n a x b = T n a x b
But unfortunately Haskell doesn't allow a variable such as n to appear more than once in a pattern. It is possible that a less standard implementation of AA trees, not using ranks but something more similar to R and B, would work well.
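One way around the repeated n, sketched here under assumptions, is to bind each level to its own variable and compare them in a guard. The AA type, the insert function and the toList helper below are illustrative and not taken from the answer.

```haskell
-- AA tree node: level, left subtree, value, right subtree.
data AA a = Nil | Node Int (AA a) a (AA a)

-- skew: a left child on the same level is rotated to the right.
skew :: AA a -> AA a
skew (Node n (Node m a x b) y c)
  | n == m = Node n a x (Node n b y c)
skew t = t

-- split: two consecutive right children on the same level are
-- resolved by promoting the middle node one level up.
split :: AA a -> AA a
split (Node n a x (Node m b y (Node k c z d)))
  | n == m && m == k = Node (n + 1) (Node n a x b) y (Node n c z d)
split t = t

insert :: Ord a => a -> AA a -> AA a
insert x Nil = Node 1 Nil x Nil
insert x t@(Node n l y r)
  | x < y = split (skew (Node n (insert x l) y r))
  | x > y = split (skew (Node n l y (insert x r)))
  | otherwise = t

-- In-order traversal, for checking the contents.
toList :: AA a -> [a]
toList Nil = []
toList (Node _ l x r) = toList l ++ [x] ++ toList r
```

The guards cost one extra line per rule compared with the non-linear patterns, but the rules keep the same shape as skew and split in the text.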
Splay trees are difficult because you need to focus on a single node, rather than the static structure of the tree. It can be done by merging insert and splay.
Treaps are also awkward to do in a functional environment, as you don't have a global random generator, but need to keep instances in every node. This can be tackled by leaving the task of generating priorities to the client, but even then, you can't do priority comparison using pattern matching.

As you say, Red Black trees aren't that hard to use. Have you given finger trees a look? You might be interested in augmenting your base data structure with something like a zipper. Another tree you might find interesting is the AA tree, which is a simplification of Red Black trees.

It's the one that's already implemented.
There are fine implementations in Haskell of balanced trees such as Data.Map and Data.Set. Don't they fulfill your needs? Don't reimplement, reuse.
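As a small illustration of the reuse route, the sketch below uses Data.Set from the containers package, which ships with GHC. The helper names sortedUnique and contains are made up for the example.

```haskell
import qualified Data.Set as Set

-- Build a balanced set from a list and read it back in ascending order.
sortedUnique :: [Int] -> [Int]
sortedUnique = Set.toList . Set.fromList

-- Membership test after inserting one more element.
contains :: Int -> Int -> [Int] -> Bool
contains needle extra xs = Set.member needle (Set.insert extra (Set.fromList xs))
```

Balancing is handled inside the library, so client code only deals with insert, member and the conversions to and from lists.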

The OCaml standard library uses an AVL tree for its map functor. It seems as though it's easier to implement than an RB-tree if you include a remove operation.
