Implementing a binary tree set - haskell

data Set a = Node | Tree (Set a) (Set a)
How can I implement a partition function which partitions the set into two sets, with all elements that satisfy the predicate on the left and the rest on the right as simply as possible?
Any help is appreciated.
Thanks in advance!

As noted in the comments, your type won't quite cut it. A reasonable implementation of a set as a tree might have type:
data Set a = Leaf | Node a (Set a) (Set a) deriving (Show)
^- note this extra `a`
where each internal Node x l r has all elements in l less than x and all elements in r greater than x.
You can partition such a Set recursively as follows:
partition :: (a -> Bool) -> Set a -> (Set a, Set a)
The Leaf case is easy, obviously:
partition _ Leaf = (Leaf, Leaf)
Here's how we do the Node case. For the sub-case where the predicate holds for the value x in the node, note that we want:
partition f (Node x l r) | f x = (Node x l1 r1, ...)
where l1 and r1 are the subsets of elements in l and r that satisfy the predicate, which we can get by recursively partitioning l and r.
where (l1, l2) = partition f l
(r1, r2) = partition f r
The Set invariant will be preserved here, because all elements in l, including those in the subset l1 are less than x; for the same reason, all elements in r1 are greater than x. The only missing piece is that we somehow need to combine l2 and r2 to form the second part of the tuple:
partition f (Node x l r) | f x = (Node x l1 r1, combine l2 r2)
Since combine is a function that takes two trees with all elements in the first tree less than all elements in the second tree, the following recursive function will do:
combine Leaf r' = r'
combine (Node x l r) r' = Node x l (combine r r')
The case where the predicate does not hold for x is handled similarly, giving the full definition:
data Set a = Leaf | Node a (Set a) (Set a)
partition :: (a -> Bool) -> Set a -> (Set a, Set a)
partition _ Leaf = (Leaf, Leaf)
partition f (Node x l r)
| f x = (Node x l1 r1, combine l2 r2)
| otherwise = (combine l1 r1, Node x l2 r2)
where (l1, l2) = partition f l
(r1, r2) = partition f r
combine Leaf r' = r'
combine (Node x l r) r' = Node x l (combine r r')
Here's the complete code plus a QuickCheck test that that partition function works as expected:
import Test.QuickCheck
import qualified Data.List (nub, partition, sort)
import Data.List (nub, sort)
data Set a = Leaf | Node a (Set a) (Set a) deriving (Show)
partition :: (a -> Bool) -> Set a -> (Set a, Set a)
partition _ Leaf = (Leaf, Leaf)
partition f (Node x l r)
| f x = (Node x l1 r1, combine l2 r2)
| otherwise = (combine l1 r1, Node x l2 r2)
where (l1, l2) = partition f l
(r1, r2) = partition f r
combine Leaf r' = r'
combine (Node x l r) r' = Node x l (combine r r')
insert :: (Ord a) => a -> Set a -> Set a
insert x Leaf = Node x Leaf Leaf
insert x (Node y l r) = case compare x y of
LT -> Node y (insert x l) r
GT -> Node y l (insert x r)
EQ -> Node y l r
fromList :: (Ord a) => [a] -> Set a
fromList = foldr insert Leaf
toList :: (Ord a) => Set a -> [a]
toList Leaf = []
toList (Node x l r) = toList l ++ x : toList r
prop_partition :: [Int] -> Bool
prop_partition lst =
let (l, r) = Main.partition even (fromList lst') in (toList l, toList r)
== Data.List.partition even (sort $ lst')
where lst' = nub lst
main = quickCheck (withMaxSuccess 10000 prop_partition)

Related

Intersection and difference function on sets using BST in haskell

I have implement the Set datatype using Binary search tree.
My implementation is as follows:-
data Set a = Empty | Node a (Set a) (Set a)
I have also written some other functions such as toList, fromList, and Insert. (took help in my previous question)
These are the functions that I have written until now:
insert :: Ord a => a -> Set a -> Set a
insert x Empty = Node x Empty Empty
insert x (Node e l r)
| x == e = Node e l r
| x < e = Node e (insert x l) r
| otherwise = Node e l (insert x r)
fromList :: Ord a => [a] -> Set a
fromList [] = Empty
fromList (x:xs) = insert x (fromList xs)
toList :: Set a -> [a]
toList Empty = []
toList (Node val l r) = toList l ++ val : (toList r)
I have been trying to solve the following functions for quite a long time now. Can you please help me out. I have made multiple attempts , but none of them work.
---- return the common elements between two Sets
intersection :: (Ord a) => Set a -> Set a -> Set a
-- all the elements in Set A *not* in Set B,
-- {1,2,3,4} `difference` {3,4} => {1,2}
-- {} `difference` {0} => {}
difference :: (Ord a) => Set a -> Set a -> Set a
Here's my attempt to this function, which compiles without an error , but it does not solve the problem:
intersection :: (Ord a) => Set a -> Set a -> Set a
intersection Empty Empty = Empty
intersection l Empty = Empty
intersection Empty r = Empty
intersection (Node val1 Empty Empty) (Node val2 Empty Empty) = if val1
== val2 then (Node val1 Empty Empty) else Empty
intersection l (Node val1 l1 r1) = Node val1 (intersection l l1) r1
I cannot import any external module/ libraries. I have also written a setmap function if that helps .
Thank you for your help.
Here is my solution, it's not the best in performance but work, solution is create a list and use element function inside filter function.
module Lib
(
insert, fromList, toList, element, intersection, difference
)
where
data Set a = Empty | Node a (Set a) (Set a) deriving Show
insert :: Ord a => a -> Set a -> Set a
insert x Empty = Node x Empty Empty
insert x (Node e l r)
| x == e = Node e l r
| x < e = Node e (insert x l) r
| otherwise = Node e l (insert x r)
fromList :: Ord a => [a] -> Set a
fromList = foldr insert Empty
toList :: Set a -> [a]
toList Empty = []
toList (Node val l r) = toList l ++ val : toList r
element :: Ord t => Set t -> t -> Bool
element Empty _ = False
element (Node v l r) x
| x == v = True
| x < v = element l x
| otherwise = element r x
intersection :: (Ord a) => Set a -> Set a -> Set a
intersection s1 s2 = intersect
where
ls2 = toList s2
intersect = fromList $ filter (element s1) ls2
difference :: (Ord a) => Set a -> Set a -> Set a
difference s1 s2 = diff
where
ls1 = toList s1
diff = fromList $ filter (not.element s2) ls1

Delete an element in a tree in Haskell

What I want to do is to take a binary search tree, and an element, and then remove it from the tree
(if it is present).
Here is my code:
data BinaryTree a = Null | Node (BinaryTree a) a (BinaryTree a)
deriving Show
type BSTree a = BinaryTree a
treeDelete :: (Ord a) => (BSTree a) -> a -> (BSTree a)
treeDelete a btree = case btree of
Null -> Null
Node Null val Null
|a==val -> Null
|otherwise -> Node Null val Null
Node left val right
|a==val-> Node left Null right
|otherwise -> (treeDelete a left) val (treeDelete a right)
You can:
Change you data structure definition as
data BinaryTree a = Null | Node (BinaryTree a) (Maybe a) (BinaryTree a)
which is a simplified trie keyed by [Bool].
Define a rotation behavior (like in Balanced Tree) to fill in the blank after delete. The rotation is not unique. For example, for the lazy Map in the container package, the deletion of a balanced tree is defined as:
data Map k a = Bin Size k a (Map k a) (Map k a)
| Tip
type Size = Int
delete :: Ord k => k -> Map k a -> Map k a
delete = go
where
go :: Ord k => k -> Map k a -> Map k a
go !_ Tip = Tip
go k t#(Bin _ kx x l r) =
case compare k kx of
LT | l' == l -> t
| otherwise -> balanceR kx x l' r
where l' = go k l
GT | r' == r -> t
| otherwise -> balanceL kx x l r'
where r' = go k r
EQ -> glue l r
balanceR, balanceL and glue are rotation to balance depth in a balanced tree.
http://hackage.haskell.org/package/containers-0.6.2.1/docs/Data-Map-Lazy.html

How do you represent nested types using the Scott Encoding?

An ADT can be represented using the Scott Encoding by replacing products by tuples and sums by matchers. For example:
data List a = Cons a (List a) | Nil
Can be encoded using the Scott Encoding as:
cons = (λ h t c n . c h t)
nil = (λ c n . n)
But I couldn't find how nested types can be encoded using SE:
data Tree a = Node (List (Tree a)) | Leaf a
How can it be done?
If the Wikipedia article is correct, then
data Tree a = Node (List (Tree a)) | Leaf a
has Scott encoding
node = λ a . λ node leaf . node a
leaf = λ a . λ node leaf . leaf a
It looks like the Scott encoding is indifferent to (nested) types. All it's concerned with is delivering the correct number of parameters to the constructors.
Scott encodings are basically representing a T by the type of its case expression. So for lists, we would define a case expression like so:
listCase :: List a -> r -> (a -> List a -> r) -> r
listCase [] n c = n
listCase (x:xs) n c = c x xs
this gives us an analogy like so:
case xs of { [] -> n ; (x:xs) -> c }
=
listCase xs n (\x xs -> c)
This gives a type
newtype List a = List { listCase :: r -> (a -> List a -> r) -> r }
The constructors are just the values that pick the appropriate branches:
nil :: List a
nil = List $ \n c -> n
cons :: a -> List a -> List a
cons x xs = List $ \n c -> c x xs
We can work backwards then, from a boring case expression, to the case function, to the type, for your trees:
case t of { Leaf x -> l ; Node xs -> n }
which should be roughly like
treeCase t (\x -> l) (\xs -> n)
So we get
treeCase :: Tree a -> (a -> r) -> (List (Tree a) -> r) -> r
treeCase (Leaf x) l n = l x
treeCase (Node xs) l n = n xs
newtype Tree a = Tree { treeCase :: (a -> r) -> (List (Tree a) -> r) -> r }
leaf :: a -> Tree a
leaf x = Tree $ \l n -> l x
node :: List (Tree a) -> Tree a
node xs = Tree $ \l n -> n xs
Scott encodings are very easy tho, because they're only case. Church encodings are folds, which are notoriously hard for nested types.

How to represent tree with sharing in Haskell

I would like to represent a "tree" of the following shape in Haskell:
/\
/\/\
/\/\/\
/\/\/\/\
` ` ` ` `
/ and \ are the branches and ` the leaves. You can see that starting at any node, following the left path, then the right gets you to the same node as following the right path then the left. You should be able to label the leaves, apply a function of the two decendants at each node, and propagate this information to the root in O(n^2) time. My naive efforts are giving me an exponential run time. Any hints?
It is certainly possible to construct a tree with shared nodes. For example, we could just define:
data Tree a = Leaf a | Node (Tree a) (Tree a)
and then carefully construct a value of this type as in
tree :: Tree Int
tree = Node t1 t2
where
t1 = Node t3 t4
t2 = Node t4 t5
t3 = Leaf 2
t4 = Leaf 3
t5 = Leaf 5
to achieve sharing of subtrees (in this case t4).
However, as this form of sharing is not observable in Haskell, it is very hard to maintain: for example if you traverse a tree to relabel its leaves
relabel :: (a -> b) -> Tree a -> Tree b
relabel f (Leaf x) = Leaf (f x)
relabel f (Node l r) = Node (relabel f l) (relabel f r)
you loose sharing. Also, when doing a bottom-up computation such as
sum :: Num a => Tree a -> a
sum (Leaf n) = n
sum (Node l r) = sum l + sum r
you end up not taking advantage of sharing and possibly duplicate work.
To overcome these problems, you can make sharing explicit (and hence observable) by encoding your trees in a graph-like manner:
type Ptr = Int
data Tree' a = Leaf a | Node Ptr Ptr
data Tree a = Tree {root :: Ptr, env :: Map Ptr (Tree' a)}
The tree from the example above can now be written as
tree :: Tree Int
tree = Tree {root = 0, env = fromList ts}
where
ts = [(0, Node 1 2), (1, Node 3 4), (2, Node 4 5),
(3, Leaf 2), (4, Leaf 3), (5, Leaf 5)]
The price to pay is that functions that traverse these structures are somewhat cumbersome to write, but we can now define for example a relabeling function that preserves sharing
relabel :: (a -> b) -> Tree a -> Tree b
relabel f (Tree root env) = Tree root (fmap g env)
where
g (Leaf x) = Leaf (f x)
g (Node l r) = Node l r
and a sum function that doesn't duplicate work when the tree has shared nodes:
sum :: Num a => Tree a -> a
sum (Tree root env) = fromJust (lookup root env')
where
env' = fmap f env
f (Leaf n) = n
f (Node l r) = fromJust (lookup l env') + fromJust (lookup r env')
Perhaps you can represent it simply as a list of leaves and apply the function level by level until you're down to one value, i.e. something like this:
type Tree a = [a]
propagate :: (a -> a -> a) -> Tree a -> a
propagate f xs =
case zipWith f xs (tail xs) of
[x] -> x
xs' -> propagate f xs'

How can I check if a BST is valid?

How can I check if a BST is a valid one, given its definition and using a generalized version of fold for BST?
data(Ord a, Show a, Read a) => BST a = Void | Node {
val :: a,
left, right :: BST a
} deriving (Eq, Ord, Read, Show)
fold :: (Read a, Show a, Ord a) => (a -> b -> b -> b) -> b -> BST a -> b
fold _ z Void = z
fold f z (Node x l r) = f x (fold f z l) (fold f z r)
The idea is to check that a node value is greater then all values in left-subtree and smaller than all values in its right-subtree. This must be True for all nodes in the tree. A function bstList simply output the list of (ordered) values in the BST.
Of course something like this won't work:
--isBST :: (Read a, Show a, Ord a) => BST a -> Bool
isBST t = fold (\x l r -> all (<x) (bstList l) && all (>x) (bstList r)) (True) t
because, for example, applying the fold function to the node 19 ends up all (<19) (bstList True) && all (>19) (bstList True).
Your problem seems to be that you lose information because your function only returns a boolean when it examines the left and right subtrees. So change it to also return the minimum and maximum values of the subtrees. (This is probably more efficient as well, since you don't need to used bslist to check all elements anymore)
And make a wrapper function to ignore these "auxiliary" values after you are done, of course.
(Please don't put typeclass constraints on the data type.)
A BST is valid iff an in-order traversal is monotonically increasing.
flatten tree = fold (\a l r -> l . (a:) . r) id tree []
ordered list#(_:rest) = and $ zipWith (<) list rest
ordered _ = True
isBST = ordered . flatten
A nice way of encoding this is to lean on the traversal provided by Data.Foldable.
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
import Data.Foldable
import Data.Monoid
We can derive an instance of it automatically using an extension, but we need to reorder the fields of the Node constructor to provide us an in-order traversal.
While we're at it, we should eliminate the constraints on the data type itself. They actually provide no benefit, and has been removed from the language as of Haskell 2011. (When you want to use such constraints you should put them on instances of classes, not on the data type.)
data BST a
= Void
| Node
{ left :: BST a
, val :: a
, right :: BST a
} deriving (Eq, Ord, Read, Show, Foldable)
First we define what it means for a list to be strictly sorted.
sorted :: Ord a => [a] -> Bool
sorted [] = True
sorted [x] = True
sorted (x:xs) = x < head xs && sorted xs
-- head is safe because of the preceeding match.
Then we can use the toList method provided by Data.Foldable and the above helper.
isBST :: Ord a => BST a -> Bool
isBST = sorted . toList
We can also implement this more directly, like you asked. Since we removed the spurious constraints on the data type, we can simplify the definition of your fold.
cata :: (b -> a -> b -> b) -> b -> BST a -> b
cata _ z Void = z
cata f z (Node l x r) = f (cata f z l) x (cata f z r)
Now we need a data type to model the result of our catamorphism, which is that we either have no nodes (Z), or a range of strictly increasing nodes (T) or have failed (X)
data T a = Z | T a a | X deriving Eq
And we can then implement isBST directly
isBST' :: Ord a => BST a -> Bool
isBST' b = cata phi Z b /= X where
phi X _ _ = X
phi _ _ X = X
phi Z a Z = T a a
phi Z a (T b c) = if a < b then T a c else X
phi (T a b) c Z = if b < c then T a c else X
phi (T a b) c (T d e) = if b < c && c < d then T a e else X
This is a bit tedious, so perhaps it would be better to decompose the way we compose the interim states a bit:
cons :: Ord a => a -> T a -> T a
cons _ X = X
cons a Z = T a a
cons a (T b c) = if a < b then T a c else X
instance Ord a => Monoid (T a) where
mempty = Z
Z `mappend` a = a
a `mappend` Z = a
X `mappend` _ = X
_ `mappend` X = X
T a b `mappend` T c d = if b < c then T a d else X
isBST'' :: Ord a => BST a -> Bool
isBST'' b = cata phi Z b /= X where
phi l a r = l `mappend` cons a r
Personally, I'd probably just use the Foldable instance.
If you don't insist on using a fold you can do it like this:
ord Void = True
ord (Node v l r) = every (< v) l && every (> v) r && ord l && ord r where
every p Void = True
every p (Node v l r) = p v && every p l && every p r

Resources