Trying to implement a binary tree search - haskell

I'm trying to implement a binary tree search algorithm in Haskell.
data BinTree k d =
    Branch (BinTree k d) (BinTree k d) k d
  | Leaf k d
  | Empty
  deriving (Eq, Show)
is the data structure I'm using to capture my binary tree. The problem is that I don't know what to return if we can't find the value. This is what I have so far for my search function:
lkp :: (Monad m, Ord k) => BinTree k d -> k -> m d
lkp (Leaf a b) x
  | a == x = return(b)
lkp (Branch lSub rSub a b) x
  | a < x = lkp rSub x
  | a > x = lkp lSub x
  | a == x = return(b)
I can see in the tests that the test expects a return value of [] back, yet I cannot understand how to return this empty value. This is an example of one of those tests:
testCase "3 in Leaf 5 500 ([]))[1 mark]"
(lkp (Leaf 5 500) 3 #?= [] )

So what is missing is a way to return a “zero” or “empty” value.
Fortunately, the Haskell base library offers the MonadPlus class, exported from the Control.Monad module (it is not part of the Prelude). MonadPlus is a subclass of Monad that augments the regular Monad interface with mzero and mplus, providing essentially a monoid-like structure. Theory here on the Haskell Wiki.
Using MonadPlus, the code for lkp can be written as follows:
import Control.Monad
data BinTree k d =
    Branch (BinTree k d) (BinTree k d) k d
  | Leaf k d
  | Empty
  deriving (Eq, Show)
lkp :: (MonadPlus m, Ord k) => BinTree k d -> k -> m d
lkp (Branch lSub rSub a b) x
  | a < x = lkp rSub x
  | a > x = lkp lSub x
  | otherwise = return b
lkp (Leaf a b) x = if (a == x) then (return b) else mzero
lkp Empty _ = mzero
Note: I am using otherwise (an ordinary Prelude value equal to True, not a keyword) instead of the equality test in order to silence a spurious “non-exhaustive patterns” warning.
Testing under ghci:
λ>
λ> :load q65169028.hs
[1 of 1] Compiling Main ( q65169028.hs, interpreted )
Ok, one module loaded.
λ>
λ> tr1 = Branch (Leaf 1 2) (Branch (Leaf 5 6) (Branch (Leaf 9 10) (Leaf 17 18) 13 14) 7 8) 3 4
λ>
λ> (lkp tr1 7) :: [Int]
[8]
λ>
λ> (lkp tr1 8) :: [Int]
[]
λ>
λ> (lkp tr1 17) :: [Int]
[18]
λ>
We want to force the interpreter to choose lists as our MonadPlus instance, hence the :: [Int] type signature at the end of each line.
If that sounds too cumbersome, it is always possible to specialize the lkp function further, like this:
llkp :: Ord k => BinTree k d -> k -> [d]
llkp tr x = lkp tr x
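Lists are not the only MonadPlus instance. As a minimal sketch, the same lkp can be specialized to Maybe, where mzero is Nothing; the names mlkp and sample are hypothetical and only serve the illustration:

```haskell
import Control.Monad (MonadPlus, mzero)

data BinTree k d
  = Branch (BinTree k d) (BinTree k d) k d
  | Leaf k d
  | Empty
  deriving (Eq, Show)

lkp :: (MonadPlus m, Ord k) => BinTree k d -> k -> m d
lkp (Branch lSub rSub a b) x
  | a < x     = lkp rSub x
  | a > x     = lkp lSub x
  | otherwise = return b
lkp (Leaf a b) x = if a == x then return b else mzero
lkp Empty _ = mzero

-- Hypothetical helper: the same lookup with the monad fixed to Maybe,
-- so a missing key yields Nothing instead of [].
mlkp :: Ord k => BinTree k d -> k -> Maybe d
mlkp = lkp

-- Hypothetical sample tree: key 3 at the root, keys 1 and 5 in the leaves.
sample :: BinTree Int Int
sample = Branch (Leaf 1 2) (Leaf 5 6) 3 4
```

With this tree, mlkp sample 5 gives Just 6 and mlkp sample 4 gives Nothing. The test from the question still needs the list instance, since it compares against [].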

Related

All possible binary trees storing a value

I want to write a function, allTrees, to generate a list of all possible binary trees that store the number of leaves each tree has.
Here are my data types and my attempt at the allTrees function:
data BTree = L | B BTree BTree
  deriving (Eq, Ord, Show)
data SpecTree = S Integer BTree
  deriving (Eq, Ord, Show)
leafNode :: SpecTree
leafNode = S 1 L
branch :: SpecTree -> SpecTree -> SpecTree
branch (S size1 sub1) (S size2 sub2) = S (size1 + size2) (B sub1 sub2)
allTrees :: [SpecTree]
allTrees = leafNode : branch allTrees allTrees
Expected output:
take 9 allTrees = [S 1 L,S 2 (B L L),S 3 (B L (B L L)),S 3 (B (B L L) L),S 4 (B L (B L (B L L))),S 4 (B L (B (B L L) L)),S 4 (B (B L L) (B L L)),S 4 (B (B L (B L L)) L),S 4 (B (B (B L L) L) L)]
Actual output:
take 9 allTrees = [S 1 L,S 2 (B L L),S 3 (B L (B L L)),S 4 (B L (B L (B L L))),S 5 (B L (B L (B L (B L L)))),S 6 (B L (B L (B L (B L (B L L))))),S 7 (B L (B L (B L (B L (B L (B L L)))))),S 8 (B L (B L (B L (B L (B L (B L (B L L))))))),S 9 (B L (B L (B L (B L (B L (B L (B L (B L L))))))))]
My output is close but not quite it. I think foldM may be useful here, but not sure how I can use it.
The problem is as Carl describes in the comments. You're generating infinitely many different trees, but because of the order you generate them in, you don't get them all. It's like generating "all integers" by starting at 1 and doubling each time. Every integer is new, and you never run out, but you miss most integers. In your case, you generate just the degenerate right-child-only trees, because that's the direction you bias your exploration, and you never run out of room to explore that direction.
Instead, as Carl also suggested in the comments, if you want to ensure you hit every possible tree, generate them in an order that ensures you don't miss any: smallest first.
allTreesOfSize :: Integer -> [BTree]
allTreesOfSize 0 = [] -- Not used in the recursive case
allTreesOfSize 1 = [L]
allTreesOfSize n = do
  lSize <- [1..n-1]
  let rSize = n - lSize
  B <$> allTreesOfSize lSize <*> allTreesOfSize rSize
allTrees :: [SpecTree]
allTrees = do
  n <- [1..]
  S n <$> allTreesOfSize n
> take 5 allTrees
[ S 1 L
, S 2 (B L L)
, S 3 (B L (B L L))
, S 3 (B (B L L) L)
, S 4 (B L (B L (B L L)))
]
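One way to gain confidence that the size-ordered enumeration misses nothing is to count the trees per size: the number of binary trees of this shape with n leaves is the Catalan number C(n-1). The sketch below repeats allTreesOfSize from above; treeCounts is a hypothetical name added for the check.

```haskell
data BTree = L | B BTree BTree
  deriving (Eq, Ord, Show)

allTreesOfSize :: Integer -> [BTree]
allTreesOfSize 0 = []
allTreesOfSize 1 = [L]
allTreesOfSize n = do
  lSize <- [1 .. n - 1]
  let rSize = n - lSize
  B <$> allTreesOfSize lSize <*> allTreesOfSize rSize

-- Hypothetical check: number of trees with 1..6 leaves.
-- These should be the Catalan numbers 1, 1, 2, 5, 14, 42.
treeCounts :: [Int]
treeCounts = map (length . allTreesOfSize) [1 .. 6]
```

Note that this version recomputes the smaller sizes on every call, so it is only meant for small n.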
@amalloy's answer is elegant, but it does require you to think about a way to generate the data in your specific domain (namely, how to split the "size" between the subtrees).
In general, you might really want to apply a function to all pairs from two infinite lists. Here's a function that does it. I bet it can still be done more elegantly.
data BTree = L | B BTree BTree
  deriving (Eq, Ord, Show)
data SpecTree = S Integer BTree
  deriving (Eq, Ord, Show)
leafNode :: SpecTree
leafNode = S 1 L
branch :: SpecTree -> SpecTree -> SpecTree
branch (S size1 sub1) (S size2 sub2) = S (size1 + size2) (B sub1 sub2)
allTrees :: [SpecTree]
allTrees = leafNode : infApply branch allTrees allTrees
infApply :: (a -> b -> c) -> [a] -> [b] -> [c]
infApply f xs ys = map (uncurry f) (infProduct xs ys)
-- All possible pairs from two infinite lists.
infProduct xs ys = infterleave (infNested xs ys)
-- A nested (infinite) list of (infinite) lists of pairs from the two given (infinite) lists.
infNested xs ys = [[(x, y) | x <- xs] | y <- ys]
-- Interleave the elements of an infinite collection of infinite lists.
infterleave xss =
  infterleave' xss 0 0
  where
    infterleave' :: [[a]] -> Int -> Int -> [a]
    infterleave' xss n m =
      let (result, remainder) = snoc (xss !! n)
          newXss = take n xss ++ [remainder] ++ drop (n+1) xss
      in
        result : (infterleave' newXss (if n < m then n+1 else 0) (if n < m then m else m+1))
    -- Split a non-empty list into its head and tail.
    snoc (x:xs) = (x, xs)
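For comparison, here is a minimal sketch of another way to pair up two possibly infinite lists fairly: walk the anti-diagonals of the grid of pairs. The names diagonals and fairPairs are hypothetical, this is a swapped-in technique rather than the approach above, and it has only been reasoned through for ordinary lists, not for the self-referential allTrees definition.

```haskell
-- Anti-diagonals of a (possibly infinite) list of (possibly infinite) rows.
-- Each step pulls in one new row and takes one element from every active row.
diagonals :: [[a]] -> [[a]]
diagonals = go []
  where
    go [] [] = []
    go active rest =
      let (new, rest') = case rest of
            []       -> ([], [])
            (r : rs) -> ([r], rs)
          rows = new ++ active
      in [h | (h : _) <- rows] : go [t | (_ : t) <- rows] rest'

-- Every pair (x, y) shows up after finitely many steps.
fairPairs :: [a] -> [b] -> [(a, b)]
fairPairs xs ys = concat (diagonals [[(x, y) | y <- ys] | x <- xs])
```

For example, take 6 (fairPairs [0..] [0..]) yields [(0,0),(1,0),(0,1),(2,0),(1,1),(0,2)]. Unlike the index-based version, this never uses (!!), so each step costs time proportional to the number of active rows.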

Can uniplate's `universeBi` be used to retrieve nodes in a breadth-first fashion?

Is it possible to use Uniplate's universeBi to get the output in breadth-first-order? It appears the results are returned in a depth-first fashion. I'm wondering how I can use uniplate to retrieve the universeBi in a breadth-first fashion.
To illustrate, consider the following toy program:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data
import Data.Generics.Uniplate.Data
data A = A B Int deriving (Data, Typeable)
data B = B Int deriving (Data, Typeable)
val :: A
val = A (B 1) 2
ints :: [Int]
ints = universeBi val
I get:
*Main> ints
[1,2]
But this is depth-first, as 1 is obtained from the B node. I'd rather get it in the breadth-first order, i.e., receive [2,1]. Is this achievable in uniplate?
You can dig into the structure of the Str returned by biplate:
layers :: Str a -> [[a]]
layers Zero = []
layers (One x) = [[x]]
layers (Two f x) = catLayers (layers f) ([] : layers x)
  where catLayers [] ys = ys
        catLayers xs [] = xs
        catLayers (x : xs) (y : ys) = (x ++ y) : catLayers xs ys
layersBi :: Biplate from to => from -> [[to]]
layersBi = layers . fst . biplate
breadthBi :: Biplate from to => from -> [to]
breadthBi = concat . layersBi
So now
breadthBi (A (B 1) 2) :: [Int]
-- = [2, 1]
and
data Tree a = Branch (Tree a) a (Tree a) | Leaf deriving (Data, Typeable)
--        4
--    2       6
--  1   3   5   7
example = Branch (Branch (Branch Leaf 1 Leaf) 2 (Branch Leaf 3 Leaf)) 4 (Branch (Branch Leaf 5 Leaf) 6 (Branch Leaf 7 Leaf))
(layersBi :: Data a => Tree a -> [[a]]) example
-- = [[],[4],[2,6],[1,3,5,7]]
I'm not sure if it's actually guaranteed that Str exactly reflects the structure of the data type, but it appears to. You could instead cook something out of the Data primitives if you have to.
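As a rough sketch of the "Data primitives" route, the following walks a value level by level using only gmapQ and cast from base, without relying on the shape of Str. The names AnyData and breadthFirst are hypothetical. One caveat: "depth" here means depth in the generic representation, so list elements, for instance, sit one cons cell deeper each.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}

import Data.Data (Data, Typeable, cast, gmapQ)
import Data.Maybe (mapMaybe)

data A = A B Int deriving (Data, Typeable)
data B = B Int deriving (Data, Typeable)

-- Hypothetical wrapper hiding the concrete type of a sub-term.
data AnyData = forall d. Data d => AnyData d

-- Collect all sub-terms of the requested type, one level at a time.
breadthFirst :: (Data a, Typeable b) => a -> [b]
breadthFirst root = go [AnyData root]
  where
    go []    = []
    go level =
      -- keep the terms on this level that have the target type ...
      mapMaybe (\(AnyData d) -> cast d) level
        -- ... then descend into the immediate children of every term
        ++ go (concatMap (\(AnyData d) -> gmapQ AnyData d) level)
```

With the toy types from the question, breadthFirst (A (B 1) 2) :: [Int] evaluates to [2,1].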

Summation of tree

So for this problem I tried to take the sum of all leaves in a tree, but it throws an error every time. I am providing a snippet of the code I wrote.
Sample case
t1 =NODE 1 (NODE 2 (NODE 3 (LEAF 4) (LEAF 5)) (LEAF 6)) (NODE 7 (LEAF 8) (LEAF 9))
Answer should be 32.
data Tree a = LEAF a | NODE a (Tree a) (Tree a) deriving (Show, Read, Eq)
tre (LEAF a) = a
tre (NODE a (Tree b) (Tree c)) = [Tree b, Tree c]
sum [] accum = []
sum list@(x:xs) accum = if tre x == Int
  then sumTree xs (accum + x)
  else sumTree x accum
sumTree :: Num p => Tree p -> p
sumTree p accum = let
    list = tre p
  in sum list accum
32
The Haskell snippet provided is not the idiomatic way of solving the problem.
You don't need the tre function: use pattern matching on the constructors of your type instead.
You don't have to write tre x == Int: types are not runtime values you can compare against, so let type inference do the work.
I've provided the following snippet; load it into ghci and use :i Tree and :i sumTree to understand the types.
module Main where
data Tree a = Leaf a | Node a (Tree a) (Tree a) deriving (Show)
sumTree (Leaf a) = a
sumTree (Node a l r) = a + sumTree l + sumTree r
main = do
  let tree = Node 5 (Node 21 (Leaf 14) (Leaf 13)) (Leaf 29)
  putStrLn $ show tree
  putStrLn $ show $ sumTree tree
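Note that the sumTree above adds up every node, while the question asks for the leaves only (4 + 5 + 6 + 8 + 9 = 32 for t1). A minimal sketch of the leaves-only variant, using the constructor names from the question; sumLeaves is a hypothetical name:

```haskell
data Tree a = LEAF a | NODE a (Tree a) (Tree a)
  deriving (Show, Read, Eq)

-- Sum only the values stored in leaves; inner node values are ignored.
sumLeaves :: Num a => Tree a -> a
sumLeaves (LEAF a)     = a
sumLeaves (NODE _ l r) = sumLeaves l + sumLeaves r

-- The sample tree from the question.
t1 :: Tree Int
t1 = NODE 1 (NODE 2 (NODE 3 (LEAF 4) (LEAF 5)) (LEAF 6)) (NODE 7 (LEAF 8) (LEAF 9))
```

Here sumLeaves t1 evaluates to 32, matching the expected answer.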

Map-like container with intervals as keys and zip-like combining operation

I'm looking for a Haskell container type like Data.Map that uses intervals as keys, where the left-most and right-most keys may also be unbounded intervals, but are otherwise non-overlapping. Additionally, the container should support a function similar to zipWith that merges two containers into a new one, using the intersection of both key sets as the new key set and the argument function for a pointwise combination of both value sets.
There already are several packages that provide interval-based maps. I've had a look at IntervalMap, fingertree and SegmentTree, but none of these packages seem to provide the desired combination function. They all seem to use intervals for the intersection functions, that are equal in both maps, while I need a version that breaks intervals down into smaller ones if necessary.
The container should basically provide an efficient and storable mapping for key/value series of the form Ord k => k -> Maybe a, i.e. functions only defined on specific intervals or having larger intervals mapping to the same value.
Here is a small example to demonstrate the issue:
...  -4  -3  -2  -1   0   1   2   3   4  ...   -- key set
-----------------------------------------------
...  -1  -1  -1  -1   0   1   1   1   1  ...   -- series corresponding to signum
...   5   5   5   5   5   5   5   5   5  ...   -- series corresponding to const 5
The first series could be efficiently expressed by a mapping [-infinity, -1] -> -1; [0, 0] -> 0; [1, infinity] -> 1 and the second one by [-infinity, infinity] -> 5. Now applying a combination function with (*) as argument function should give a new series
...  -4  -3  -2  -1   0   1   2   3   4  ...   -- key set
-----------------------------------------------
...  -5  -5  -5  -5   0   5   5   5   5  ...   -- combined series
The crucial point here—and all of the afore-mentioned packages don't seem to be able to do that—is that, when combining the key sets for these two series, you have to take the different values also into account. Both series span the full range of [-infinity, infinity] but it's necessary to break it into three parts for the final series.
There are also packages for working with intervals, e.g. the range package, which also provides an intersection operation on lists of intervals. However, I didn't find a way to use that in combination with one of the Map variants because it collapses adjacent intervals when doing calculations with them.
NB: Such a container is somewhat similar to a ZipList that extends to both sides, which is why I think it should also be possible to define a lawful Applicative instance for it, where <*> corresponds to the above-mentioned combining function.
To cut a long story short, is there already a package that provides such a container? Or is there an easy way to use the existing packages to build one?
The best suggestion from the comments above seems to be the step-function package, as suggested by B. Mehta. I haven't tried that package yet, but it looks like building a wrapper around that SF type is what I was looking for.
Meanwhile, I implemented another solution which I'd like to share. The code for the combining function (combineAscListWith in the code below) is a bit clumsy as it's more general than for just getting the intersection of both maps, so I'll sketch the idea:
First we need an Interval type with an Ord instance which stores pairs of Val a values which can either be -infinity, some value x or +infinity. From that we can build an IntervalMap which is just a normal Map that maps these intervals to the final values.
When combining two such IntervalMaps by intersection, we first convert the maps into lists of key/value pairs. Next we traverse both lists in parallel to zip both lists into another one which corresponds to the final intersection map. There are two main cases when combining the list elements:
Both left-most intervals start at the same value. In that case we found an interval that actually overlaps/intersects. We clip the longer interval to the shorter one, and use the values associated with the two intervals to get the result value, which now—together with the shorter interval—goes into the result list. The rest of the longer interval goes back to the input lists.
One of the intervals starts at a smaller value than the other, which means we found a part of the two series that does not overlap. So for the intersection, all of the non-overlapping part of the interval (or even the whole interval) can be discarded. The rest (if any) goes back to the input list.
For completeness, here's the full example code. Again, the code is rather clumsy; a step-function-based implementation would certainly be more elegant.
import Control.Applicative
import Data.List
import qualified Data.Map as Map
data Val a = NegInf | Val a | Inf deriving (Show, Read, Eq, Ord)
instance Enum a => Enum (Val a) where
  succ v = case v of
    NegInf -> NegInf
    Val x -> Val $ succ x
    Inf -> Inf
  pred v = case v of
    NegInf -> NegInf
    Val x -> Val $ pred x
    Inf -> Inf
  toEnum = Val . toEnum
  fromEnum (Val x) = fromEnum x
data Interval a = Interval { lowerBound :: Val a, upperBound :: Val a } deriving (Show, Read, Eq)
instance Ord a => Ord (Interval a) where
  compare ia ib = let (a, a') = (lowerBound ia, upperBound ia)
                      (b, b') = (lowerBound ib, upperBound ib)
                  in case () of
                      _ | a' < b -> LT
                      _ | b' < a -> GT
                      _ | a == b && a' == b' -> EQ
                      _ -> error "Ord.Interval.compare: undefined for overlapping intervals"
newtype IntervalMap i a = IntervalMap { unIntervalMap :: Map.Map (Interval i) a }
  deriving (Show, Read)
instance Functor (IntervalMap i) where
  fmap f = IntervalMap . fmap f . unIntervalMap
instance (Ord i, Enum i) => Applicative (IntervalMap i) where
  pure = IntervalMap . Map.singleton (Interval NegInf Inf)
  (<*>) = intersectionWith ($)
intersectionWith :: (Ord i, Enum i) => (a -> b -> c)
  -> IntervalMap i a -> IntervalMap i b -> IntervalMap i c
intersectionWith f = combineWith (liftA2 f)
combineWith :: (Ord i, Enum i) => (Maybe a -> Maybe b -> Maybe c)
  -> IntervalMap i a -> IntervalMap i b -> IntervalMap i c
combineWith f (IntervalMap mpA) (IntervalMap mpB) =
  let cs = combineAscListWith f (Map.toAscList mpA) (Map.toAscList mpB)
  in IntervalMap $ Map.fromList [ (i, v) | (i, Just v) <- cs ]
combineAscListWith :: (Ord i, Enum i) => (Maybe a -> Maybe b -> c)
  -> [(Interval i, a)] -> [(Interval i, b)] -> [(Interval i, c)]
combineAscListWith f as bs = case (as, bs) of
  ([], _) -> map (\(i, v) -> (i, f Nothing (Just v))) bs
  (_, []) -> map (\(i, v) -> (i, f (Just v) Nothing)) as
  ((Interval a a', va) : as', (Interval b b', vb) : bs')
    | a == b -> case () of
        _ | a' == b' -> (Interval a a', f (Just va) (Just vb)) : combineAscListWith f as' bs'
        _ | a' < b' -> (Interval a a', f (Just va) (Just vb)) : combineAscListWith f as' ((Interval (succ a') b', vb) : bs')
        _ | a' > b' -> (Interval a b', f (Just va) (Just vb)) : combineAscListWith f ((Interval (succ b') a', va) : as') bs'
    | a < b -> case () of
        _ | a' < b -> ((Interval a a', f (Just va) Nothing)) :
              (if succ a' == b then id else ((Interval (succ a') (pred b), f Nothing Nothing) :)) (combineAscListWith f as' bs)
        _ | True -> (Interval a (pred b), f (Just va) Nothing) : combineAscListWith f ((Interval b a', va) : as') bs
    | a > b -> case () of
        _ | b' < a -> ((Interval b b', f Nothing (Just vb))) :
              (if succ b' == a then id else ((Interval (succ b') (pred a), f Nothing Nothing) :)) (combineAscListWith f as bs')
        _ | True -> (Interval b (pred a), f Nothing (Just vb)) : combineAscListWith f as ((Interval a b', vb) : bs')
showIntervalMap :: (Show i, Show a, Eq i) => IntervalMap i a -> String
showIntervalMap = intercalate "; " . map (\(i, v) -> showInterval i ++ " -> " ++ show v)
    . Map.toAscList . unIntervalMap
  where
    showInterval (Interval (Val a) (Val b)) | a == b = "[" ++ show a ++ "]"
    showInterval (Interval a b) = "[" ++ showVal a ++ " .. " ++ showVal b ++ "]"
    showVal NegInf = "-inf"
    showVal (Val x) = show x
    showVal Inf = "inf"
main :: IO ()
main = do
  let signumMap = IntervalMap $ Map.fromList [(Interval NegInf (Val $ -1), -1),
        (Interval (Val 0) (Val 0), 0), (Interval (Val 1) Inf, 1)]
  putStrLn $ showIntervalMap $ (*) <$> signumMap <*> pure 5

How can I check if a BST is valid?

How can I check if a BST is a valid one, given its definition and using a generalized version of fold for BST?
data (Ord a, Show a, Read a) => BST a = Void | Node {
    val :: a,
    left, right :: BST a
  } deriving (Eq, Ord, Read, Show)
fold :: (Read a, Show a, Ord a) => (a -> b -> b -> b) -> b -> BST a -> b
fold _ z Void = z
fold f z (Node x l r) = f x (fold f z l) (fold f z r)
The idea is to check that a node value is greater than all values in its left subtree and smaller than all values in its right subtree. This must be True for all nodes in the tree. A function bstList simply outputs the list of (ordered) values in the BST.
Of course something like this won't work:
--isBST :: (Read a, Show a, Ord a) => BST a -> Bool
isBST t = fold (\x l r -> all (<x) (bstList l) && all (>x) (bstList r)) (True) t
because, for example, applying the fold function to the node 19 ends up as all (<19) (bstList True) && all (>19) (bstList True).
Your problem seems to be that you lose information because your function only returns a boolean when it examines the left and right subtrees. So change it to also return the minimum and maximum values of the subtrees. (This is probably more efficient as well, since you don't need to use bstList to check all elements anymore.)
And make a wrapper function to ignore these "auxiliary" values after you are done, of course.
(Please don't put typeclass constraints on the data type.)
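A minimal sketch of that suggestion, assuming the data type without class constraints and keeping the field order of the question's fold. The names Summary and step are hypothetical: the fold carries Nothing for "already invalid", Just Nothing for a valid empty subtree, and Just (Just (lo, hi)) for a valid subtree with its smallest and largest value.

```haskell
import Control.Monad (guard)
import Data.Maybe (isJust)

data BST a = Void | Node a (BST a) (BST a)

fold :: (a -> b -> b -> b) -> b -> BST a -> b
fold _ z Void = z
fold f z (Node x l r) = f x (fold f z l) (fold f z r)

-- Hypothetical summary type: validity plus (min, max) of a non-empty subtree.
type Summary a = Maybe (Maybe (a, a))

isBST :: Ord a => BST a -> Bool
isBST = isJust . fold step (Just Nothing)
  where
    step :: Ord a => a -> Summary a -> Summary a -> Summary a
    step x ml mr = do
      l <- ml                                      -- fails if the left subtree is invalid
      r <- mr                                      -- fails if the right subtree is invalid
      guard (maybe True (\(_, hi) -> hi < x) l)    -- largest value on the left is below x
      guard (maybe True (\(lo, _) -> x < lo) r)    -- smallest value on the right is above x
      pure (Just (maybe x fst l, maybe x snd r))   -- new (min, max) for this subtree
```

The wrapper isJust discards the auxiliary bounds once the fold is done, so each node is visited once.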
A BST is valid iff an in-order traversal is monotonically increasing.
flatten tree = fold (\a l r -> l . (a:) . r) id tree []
ordered list@(_:rest) = and $ zipWith (<) list rest
ordered _ = True
isBST = ordered . flatten
A nice way of encoding this is to lean on the traversal provided by Data.Foldable.
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
import Data.Foldable
import Data.Monoid
We can derive an instance of it automatically using an extension, but we need to reorder the fields of the Node constructor to provide us an in-order traversal.
While we're at it, we should eliminate the constraints on the data type itself. They actually provide no benefit, and GHC has deprecated the DatatypeContexts extension that enables them since version 7.2 (2011). (When you want to use such constraints you should put them on instances of classes, not on the data type.)
data BST a
  = Void
  | Node
    { left :: BST a
    , val :: a
    , right :: BST a
    } deriving (Eq, Ord, Read, Show, Foldable)
First we define what it means for a list to be strictly sorted.
sorted :: Ord a => [a] -> Bool
sorted [] = True
sorted [x] = True
sorted (x:xs) = x < head xs && sorted xs
-- head is safe because of the preceding match.
Then we can use the toList method provided by Data.Foldable and the above helper.
isBST :: Ord a => BST a -> Bool
isBST = sorted . toList
We can also implement this more directly, like you asked. Since we removed the spurious constraints on the data type, we can simplify the definition of your fold.
cata :: (b -> a -> b -> b) -> b -> BST a -> b
cata _ z Void = z
cata f z (Node l x r) = f (cata f z l) x (cata f z r)
Now we need a data type to model the result of our catamorphism: either we have no nodes (Z), or a range of strictly increasing nodes (T, holding the smallest and largest value), or we have failed (X).
data T a = Z | T a a | X deriving Eq
And we can then implement isBST directly
isBST' :: Ord a => BST a -> Bool
isBST' b = cata phi Z b /= X where
  phi X _ _ = X
  phi _ _ X = X
  phi Z a Z = T a a
  phi Z a (T b c) = if a < b then T a c else X
  phi (T a b) c Z = if b < c then T a c else X
  phi (T a b) c (T d e) = if b < c && c < d then T a e else X
This is a bit tedious, so perhaps it would be better to decompose the way we compose the interim states a bit:
cons :: Ord a => a -> T a -> T a
cons _ X = X
cons a Z = T a a
cons a (T b c) = if a < b then T a c else X
-- Since GHC 8.4 (base 4.11), Monoid has Semigroup as a superclass,
-- so the combining operation is defined as (<>) and mappend defaults to it.
instance Ord a => Semigroup (T a) where
  Z <> a = a
  a <> Z = a
  X <> _ = X
  _ <> X = X
  T a b <> T c d = if b < c then T a d else X
instance Ord a => Monoid (T a) where
  mempty = Z
isBST'' :: Ord a => BST a -> Bool
isBST'' b = cata phi Z b /= X where
  phi l a r = l `mappend` cons a r
Personally, I'd probably just use the Foldable instance.
If you don't insist on using a fold you can do it like this:
ord Void = True
ord (Node v l r) = every (< v) l && every (> v) r && ord l && ord r where
  every p Void = True
  every p (Node v l r) = p v && every p l && every p r
