Simple search tree in Haskell: why stack overflow? - haskell

I am completely new to Haskell and trying to learn. I decided to write a short (unbalanced) binary search tree code just to get going. It breaks a text into words, adds the words to the binary tree (discarding repetitions), and then traverses the tree in order to print out the sorted list of words in the text.
data BinTree t = ExternalNode
| InternalNode (BinTree t) t (BinTree t)
treeInsert :: Ord t => BinTree t -> t -> BinTree t
treeInsert ExternalNode w = InternalNode ExternalNode w ExternalNode
treeInsert tree#(InternalNode left v right) w
| w == v = tree
| w < v = InternalNode (treeInsert left w) v right
| otherwise = InternalNode left v (treeInsert right w)
treeFromList :: Ord t => [t] -> BinTree t
treeFromList l = go ExternalNode l
where
go acc [] = acc
go acc (x : xs) = acc `seq` go (treeInsert acc x) xs
inOrderList :: BinTree t -> [t]
inOrderList ExternalNode = []
inOrderList (InternalNode left v right) = (inOrderList left) ++ [ v ] ++ (inOrderList right)
main :: IO ()
main = do
tmp <- readFile "words.txt"
printList . inOrderList . treeFromList $ words tmp
where
printList [] = return ()
printList (x : xs) = do
putStrLn x
printList xs
The program works fine on small texts. Then I fed the King James Bible to it. It crashes complaining that the stack size is too small. I have to increase the stack size to 200M to make it work!
Where is my mistake? I imagine it could have something to do with lazy evaluation messing up stuff. In any case, the problem is not with the depth of the binary search tree, which is only 163 for the Bible example.

The problem is that you are building up too deeply nested thunks.
This version adds seq calls in treeInsert to force evaluation at each level of the tree
and can run in very little stack:
import System.Environment
import Control.Monad
data BinTree t = ExternalNode
| InternalNode (BinTree t) !t (BinTree t)
treeInsert :: Ord t => BinTree t -> t -> BinTree t
treeInsert ExternalNode w = InternalNode ExternalNode w ExternalNode
treeInsert tree#(InternalNode left v right) w
| w == v = tree
| w < v = let t = treeInsert left w in t `seq` InternalNode t v right
| otherwise = let t = treeInsert right w in t `seq` InternalNode left v t
treeFromList :: Ord t => [t] -> BinTree t
treeFromList l = go ExternalNode l
where
go acc [] = acc
go acc (x : xs) = let t = treeInsert acc x in t `seq` go t xs
inOrderList :: BinTree t -> [t]
inOrderList ExternalNode = []
inOrderList (InternalNode left v right) = (inOrderList left) ++ [ v ] ++ (inOrderList right)
main1 = do
(arg0:_) <- getArgs
tmp <- readFile arg0
let t = treeFromList $ words tmp
forM_ (inOrderList t) putStrLn
main = main1
You can also use strictness annotations in the definition of BinTree:
data BinTree t = ExternalNode | InternalNode !(BinTree t) !t !(BinTree t)
in lieu of the seq calls in treeInsert - this is what Data.Set does.
It appears that the seq call in treeFromList doesn't have much effect.

Related

How do you represent nested types using the Scott Encoding?

An ADT can be represented using the Scott Encoding by replacing products by tuples and sums by matchers. For example:
data List a = Cons a (List a) | Nil
Can be encoded using the Scott Encoding as:
cons = (λ h t c n . c h t)
nil = (λ c n . n)
But I couldn't find how nested types can be encoded using SE:
data Tree a = Node (List (Tree a)) | Leaf a
How can it be done?
If the Wikipedia article is correct, then
data Tree a = Node (List (Tree a)) | Leaf a
has Scott encoding
node = λ a . λ node leaf . node a
leaf = λ a . λ node leaf . leaf a
It looks like the Scott encoding is indifferent to (nested) types. All it's concerned with is delivering the correct number of parameters to the constructors.
Scott encodings are basically representing a T by the type of its case expression. So for lists, we would define a case expression like so:
listCase :: List a -> r -> (a -> List a -> r) -> r
listCase [] n c = n
listCase (x:xs) n c = c x xs
this gives us an analogy like so:
case xs of { [] -> n ; (x:xs) -> c }
=
listCase xs n (\x xs -> c)
This gives a type
newtype List a = List { listCase :: r -> (a -> List a -> r) -> r }
The constructors are just the values that pick the appropriate branches:
nil :: List a
nil = List $ \n c -> n
cons :: a -> List a -> List a
cons x xs = List $ \n c -> c x xs
We can work backwards then, from a boring case expression, to the case function, to the type, for your trees:
case t of { Leaf x -> l ; Node xs -> n }
which should be roughly like
treeCase t (\x -> l) (\xs -> n)
So we get
treeCase :: Tree a -> (a -> r) -> (List (Tree a) -> r) -> r
treeCase (Leaf x) l n = l x
treeCase (Node xs) l n = n xs
newtype Tree a = Tree { treeCase :: (a -> r) -> (List (Tree a) -> r) -> r }
leaf :: a -> Tree a
leaf x = Tree $ \l n -> l x
node :: List (Tree a) -> Tree a
node xs = Tree $ \l n -> n xs
Scott encodings are very easy tho, because they're only case. Church encodings are folds, which are notoriously hard for nested types.

traversal on tree in Haskell with print

I have following toy haskell code for binary search tree. The function preorderV is to traverse the tree with a function applying to every element. It works fine for normal function. However if I apply function "print", I got compiling error. How could I make function related to IO work with preorderV?
Thanks
Code:
data BSTree a = EmptyTree | Node a (BSTree a) (BSTree a) deriving (Show)
data Mode = IN | POST | PRE
singleNode :: a -> BSTree a
singleNode x = Node x EmptyTree EmptyTree
bstInsert :: (Ord a) => a -> BSTree a -> BSTree a
bstInsert x EmptyTree = singleNode x
bstInsert x (Node a left right)
| x == a = Node a left right
| x < a = Node a (bstInsert x left) right
| x > a = Node a left (bstInsert x right)
buildTree :: String -> BSTree String
buildTree = foldr bstInsert EmptyTree . words
preorderV :: (a->b) -> BSTree a -> BSTree b
preorderV f EmptyTree = EmptyTree
preorderV f (Node x left right) = Node (f x) (preorderV f left) (preorderV f right)
Error:
Couldn't match type ‘BSTree’ with ‘IO’
Expected type: IO (IO ())
Actual type: BSTree (IO ())
In a stmt of a 'do' block: preorderV print $ buildTree content
The type of preorderV print $ buildTree content is BSTree (IO ()), that is, you are creating a binary tree of IO computations - you don't have an IO computation itself.
To do what I think you want to do you need to create a monadic version of preorderV which has the following type:
preorderVM :: (a -> IO ()) -> BSTree a -> IO ()
preorderVM f EmptyTree = ... -- what do you want to do here?
preorderVM f (Node x left right) = do
f x
preorderVM f left
preorderVM f right
your code works so far but you got a BSTree (IO ()) back and as you use this inside a do Block (main most likely) you get the error.
what you now can do is fold over the tree to get the actions back, then you can use sequence or something similar to use each of those actions one after the other:
foldT :: (a -> s -> s) -> s -> BSTree a -> s
foldT _ s EmptyTree = s
foldT f s (Node a left right) =
let s' = foldT f s left
s'' = f a s'
in foldT f s'' right
main :: IO ()
main = do
let doPrint = preorderV print $ buildTree "C A B D E"
folded = foldT (:) [] doPrint
sequence_ . reverse $ folded
This will print
λ> :main
"A"
"B"
"C"
"D"
"E"
BTW: Your preorderV is usually called map ;)

What benefits do I get from creating an instance of Comonad

In my application, I'm trying to implement an animation system. In this system, animations are represented as a cyclic list of frames:
data CyclicList a = CL a [a]
We can (inefficiently) advance the animation as follows:
advance :: CyclicList a -> CyclicList a
advance (CL x []) = CL x []
advance (CL x (z:zs)) = CL z (zs ++ [x])
Now, I'm pretty sure that this data type is a comonad:
instance Functor CyclicList where
fmap f (CL x xs) = CL (f x) (map f xs)
cyclicFromList :: [a] -> CyclicList a
cyclicFromList [] = error "Cyclic list must have one element!"
cyclicFromList (x:xs) = CL x xs
cyclicLength :: CyclicList a -> Int
cyclicLength (CL _ xs) = length xs + 1
listCycles :: CyclicList a -> [CyclicList a]
listCycles cl = let
helper 0 _ = []
helper n cl' = cl' : (helper (n-1) $ advance cl')
in helper (cyclicLength cl) cl
instance Comonad CyclicList where
extract (CL x _) = x
duplicate = cyclicFromList . listCycles
The question I have is: what kind of benefits do I get (if any) from using the comonad instance?
The advantage of providing a type class or implementing an interface is that code, written to use that typeclass or interface, can use your code without any modifications.
What programs can be written in terms of Comonad? A Comonad provides a way to both inspect the value at the current location (without observing its neighbors) using extract and a way to observe the neighborhood of every location with duplicate or extend. Without any additional functions, this isn't terribly useful. However, if we also require other functions along with the Comonad instance, we can write programs that depend on both local data and data from elsewhere. For example, if we require functions that allow us to change location, such as your advance, we can write programs that depend only on the local structure of the data, not on the data structure itself.
For a concrete example, consider a cellular automata program written in terms of Comonad and the following Bidirectional class:
class Bidirectional c where
forward :: c a -> Maybe (c a)
backward :: c a -> Maybe (c a)
The program could use this, together with Comonad, to extract data stored in a cell and explore the cells forward and backward of the current cell. It can use duplicate to capture the neighborhood of each cell and fmap to inspect that neighborhood. This combination of fmap f . duplicate is extract f.
Here is such a program. rule' is only interesting to the example; it implements cellular automata rules on neighborhood with just the left and right values. rule extracts data from the neighborhood, given the class, and runs the rule on each neighborhood. slice pulls out even larger neighborhoods so that we can display them easily. simulate runs the simulation, displaying these larger neighborhoods for each generation.
rule' :: Word8 -> Bool -> Bool -> Bool -> Bool
rule' x l m r = testBit x ((if l then 4 else 0) .|. (if m then 2 else 0) .|. (if r then 1 else 0))
rule :: (Comonad w, Bidirectional w) => Word8 -> w Bool -> w Bool
rule x = extend go
where
go w = rule' x (maybe False extract . backward $ w) (extract w) (maybe False extract . forward $ w)
slice :: (Comonad w, Bidirectional w) => Int -> Int -> a -> w a -> [a]
slice l r a w = sliceL l w (extract w : sliceR r w)
where
sliceR r w | r > 0 = case (forward w) of
Nothing -> take r (repeat a)
Just w' -> extract w' : sliceR (r-1) w'
sliceR _ _ = []
sliceL l w r | l > 0 = case (backward w) of
Nothing -> take l (repeat a) ++ r
Just w' -> sliceL (l-1) w' (extract w':r)
sliceL _ _ r = r
simulate :: (Comonad w, Bidirectional w) => (w Bool -> w Bool) -> Int -> Int -> Int -> w Bool -> IO ()
simulate f l r x w = mapM_ putStrLn . map (map (\x -> if x then '1' else '0') . slice l r False) . take x . iterate f $ w
This program might have been intended to work with the following Bidirectional Comonad, a Zipper on a list.
data Zipper a = Zipper {
heads :: [a],
here :: a,
tail :: [a]
} deriving Functor
instance Bidirectional Zipper where
forward (Zipper _ _ [] ) = Nothing
forward (Zipper l h (r:rs)) = Just $ Zipper (h:l) r rs
backward (Zipper [] _ _) = Nothing
backward (Zipper (l:ls) h r) = Just $ Zipper ls l (h:r)
instance Comonad Zipper where
extract = here
duplicate (Zipper l h r) = Zipper (goL (h:r) l) (Zipper l h r) (goR (h:l) r)
where
goL r [] = []
goL r (h:l) = Zipper l h r : goL (h:r) l
goR l [] = []
goR l (h:r) = Zipper l h r : goR (h:l) r
But will also work with a CyclicList Bidirectional Comonad.
data CyclicList a = CL a (Seq a)
deriving (Show, Eq, Functor)
instance Bidirectional CyclicList where
forward (CL x xs) = Just $ case viewl xs of
EmptyL -> CL x xs
x' :< xs' -> CL x' (xs' |> x)
backward (CL x xs) = Just $ case viewr xs of
EmptyR -> CL x xs
xs' :> x' -> CL x' (x <| xs')
instance Comonad CyclicList where
extract (CL x _) = x
duplicate (CL x xs) = CL (CL x xs) (go (singleton x) xs)
where
go old new = case viewl new of
EmptyL -> empty
x' :< xs' -> CL x' (xs' >< old) <| go (old |> x') xs'
We can reuse simulate with either data structure. The CyclicList has a more interesting output, because, instead of bumping into a wall, it wraps back around to interact with itself.
{-# LANGUAGE DeriveFunctor #-}
import Control.Comonad
import Data.Sequence hiding (take)
import Data.Bits
import Data.Word
main = do
putStrLn "10 + 1 + 10 Zipper"
simulate (rule 110) 10 10 30 $ Zipper (take 10 . repeat $ False) True (take 10 . repeat $ False)
putStrLn "10 + 1 + 10 Cyclic"
simulate (rule 110) 10 10 30 $ CL True (fromList (take 20 . repeat $ False))

Trying to implement path record for Haskell binary tree search

I've been playing around with binary trees in Haskell, and I am attempting to implement a dfs variant that returns the path (composed of left and rights) from the root node to the node containing the value searched for. I think it would be best to return a Maybe Directions type.
Here is what has been implemented so far.
data Tree a = Empty | Node a (Tree a) (Tree a)
deriving (Show, Eq)
data Direction = L | R
deriving (Show)
type Directions = [Direction]
inTree :: (Eq a) => a -> Tree a -> [Direction]
inTree val (Node x l r)
| val == x = []
| l /= Empty = L:(inTree val l)
| r /= Empty = R:(inTree val r)
| otherwise =
but I have no idea how to have it traverse the entire tree. I feel as though I might be thinking too imperatively.
Your idea to use Maybe Direction is good. I would rewrite your function as follows:
inTree :: (Eq a) => a -> Tree a -> Maybe [Direction]
inTree val Empty = Nothing
inTree val (Node x l r)
| val == x = Just []
| otherwise = (fmap (L:) (inTree val l)) <|> (fmap (R:) (inTree val r))
fmaping a function f on a Maybe results in Nothing if the original value is Nothing and Just (f v) if it's Just v. In our case if the recursive call found the value (so it's returning a path Just [Direction]) we append the Direction we took at the current node.
The <|> operator comes from the Alternative instance of Maybe. It's a left biased choice on maybes. Here we use it to pick the subtree which returned Just [Direction] (if there was any).
A good exercise is to modify the code so that it returns all the paths to the xs in the tree.
Here is a less streamlined version similar to the style presented in the question. This might be useful for anyone on a more basic level of learning Haskell and not yet understanding the contents of the Control.Applicative library.
inTree :: Eq a => a -> Tree a -> Maybe [Direction]
inTree _ Empty = Nothing
inTree val (Node x l r)
| val == x = Just []
| otherwise = case inTree val l of Just ys -> Just (L : ys)
Nothing -> case inTree val r of Just ys -> Just (R : ys)
Nothing -> Nothing

Haskell multiple bindings inside lambda

I am new to Haskell.
I have this code (my solution to one of the exercise from Ninety-Nine Haskell Problems)
data Structure a = Single a | Multiple (a, Int) deriving (Show)
encodeM ::(Eq a)=> [a]->[Structure a]
encodeM l = map(\x -> (let size = length x
--h = head x
in if size>1 then Multiple ( head x, size) else Single (head x)
)
) $ group l
When I uncomment "-h = head x" I get: "parse error on input `='"
But
xxx l= let size = length l
h = head l
in size
works fine, why it doesn't compile when I use "let" with multiple statement inside the lambda?
I have tried to replace let by where
encodeM2 ::(Eq a)=> [a]->[Structure a]
encodeM2 l = map(\x->if si>1 then Multiple ( head x, si) else Single (head x)
where si = length x)
but it doesn't compile as well, whats wrong with it?
This is your code properly indented: (note how the let bindings align vertically)
encodeM :: Eq a => [a] -> [Structure a]
encodeM l = map (\x -> let size = length x
h = head x in
if size > 1
then Multiple (h, size)
else Single h) $
group l
This is your code readable:
encodeM :: Eq a => [a] -> [Structure a]
encodeM = map runLength . group
where
runLength x =
let size = length x
h = head x in
if size > 1
then Multiple (h, size)
else Single h
This is your code idiomatic:
encodeM :: Eq a => [a] -> [Structure a]
encodeM = map runLength . group
where
runLength [x] = Single x
runLength xs = Multiple (head xs, length xs)
I prefer to use pattern matching to if/then/else, so your code becomes:
encodeM :: (Eq a) => [a] -> [Structure a]
encodeM lst = map fun $ group lst
where
fun [x] = Single x
fun l = Multiple (head l, length l)
In Haskell whitespace matters.
Align assignemnts in your let. And you can't use where in lambda.

Resources