I'm making a Haskell function to delete a node from a Binary Search Tree.
I know the rules regarding the action needed to be taken depending on the number children
the targeted parent has.
no children - delete,
1 child - replace with the child,
2 children - find the min in the right sub tree and replace the node with the value,
- then, recursively delete the minimum value from the right sub-tree
data BST = MakeNode BST String BST
| Empty
deleteNode :: String -> BST
treeBuilder :: [String] -> BST
treeBuilder = foldr add Empty
add :: String -> BST -> BST
add new Empty = (MakeNode Empty new Empty)
add string tree#(MakeNode left value right)
| string > value = MakeNode left value (add string right)
| string < value = MakeNode (add string left) value right
| otherwise = tree
can't figure out why treeBuilder isn't working correctly either. It just prints Strings Diagonally down to the right.
In these situations, it's best not to think about deleting a node from the tree; it's better to think of how to transform the tree you have into one without the node you want gone.
Let's do some case analysis:
If the tree is empty, then the result is empty, regardless of the key:
delete _ Empty = Empty
If the tree is non-empty, we have to see if the key matches the node. If it does not match, then we need to transform either the left or right subtree based upon whether the key is greater-than or less-than the node:
delete key (MakeNode l key' r) | key < key' = MakeNode (delete key l) key' r
delete key (MakeNode l key' r) | key > key' = MakeNode l key' (delete key r)
If it does match (which it must, since all of the no-match cases have been dealt with), then we have to figure out how to create a new tree without the root node. From your description, if the node has no children, just delete it:
delete _ (MakeNode Empty _ Empty) = Empty
If the node has one child, use that:
delete _ (MakeNode l _ Empty) = l
delete _ (MakeNode Empty _ r) = r
Otherwise, find and delete the minimum key in the right subtree, and use it as the new root's key:
delete _ (MakeNode l _ r) = MakeNode l key r' -- make a new root with min key and new r
where key = minKey r -- find the minimum key in the right subtree
r' = delete key r -- new right subtree with min key removed
-- a helper function to find the minimum key in a tree
-- !! does not work on Empty tree !!
minKey (MakeNode Empty key _) = key
minKey (MakeNode l _ _) = minKey l
You can't! Everything is immutable!
What you can do is make a new tree that's exactly the same as the old one, except with one node removed. (Don't worry, your compiler won't need to duplicate much memory. Remember, everything is immutable. That means that the implementation can safely re-use the common parts!)
As such, your deleteNode function won't be of type String -> BST, it will be of type String -> BST -> BST. The String is the label you want removed, the first BST is the input tree, the second BST is the output tree.
As #Ingo mentioned, you can implement deletion recursively by implementing the function:
deleteNode :: String -> BST -> BST
deleteNode _ Empty = ... -- Handle the empty case
deleteNode x (BST left a right) | x == a = ... -- Delete the node
| x < a = ... -- Recurse on the lesser node
| otherwise = ... -- Recurse on the greater node
If you want to do some general munging beyond deletion (insertion, changes, etc.) in a traversable data structure (trees, lists, etc) I suggest you read up on zippers. They'll help you immensely.
Once you have a zipper for a binary tree, you can use zipper functions to delete nodes in the tree. If you'd like help implementing a zipper for your binary search tree data structure, let me know and I'll expand this. Right now it's probably overkill.
Be warned, a zipper won't re-balance your binary search tree for you. If you want to remove a node from your binary search tree and keep it balanced, that's a whole new can of worms.
There are a number of common balancing algorithms you could use, depending upon your taste. I suggest getting it working in an unbalanced fashion first, and then asking separate questions if you have trouble balancing it.
And, of course, if you want an efficient, out-of-the-box, already-implemented, balancing binary search tree in haskell -- just import Data.Map!
Here is a deletion function implemented in Haskell using Mutual Recursion
The type of the tree is:
type Key = Int
data BST = Nil | Node Key BST BST deriving (Show, Eq)
and here is the delete function:
delete :: Key -> BST -> BST
delete k Nil = Nil
delete k x#(Node a l r)
| (k < a) = Node a (delete k l) r
| (k > a) = Node a l (delete k r)
| (k == a) = delete' k x
delete' :: Key -> BST -> BST
delete' k (Node a l r)
| (l == Nil) = r
| (r == Nil) = l
| otherwise = let (k,t) = maxAndDelete l
in Node k t r
-- This function finds the maximum and then deletes the node as well
maxAndDelete :: BST -> (Key,BST)
maxAndDelete t = let m = treeMaximum t
in (m,delete m t)
Related
I have trees of the form:
data Tree a = Leaf | Node (Tree a) a (Tree a)
I created a function to look up the value of a node in a tree based on an in-order, left-to-right traversal.
getElem :: Tree a -> Int -> Maybe a
getElem Leaf _ = Nothing
getElem (Node l x r) n
| s == n = Just x
| n < s = getElem l n
| otherwise = getElem r (n - s - 1)
where
s = size l
I now want to write a method to be able to update a tree. It should be able to take in a tree, an index and a value and update the node at that index with the value. So far I have:
update :: Tree a -> Int -> a -> Tree a
update Leaf _ _ = Leaf
update (Node l x r) index c
| s == index = (Node l c r)
| index < s = update l index c
| otherwise = update r (index - s - 1) c
where
s = size l
This function is able to add but it obviously returns just the added node itself. I want to be able to return the entire tree post the 'update' with the new node or return the tree as is if the index is out of bounds.
Could anybody give me some idea how to proceed with this?
Edit 1:
Okay, I understand that I am basically discarding the remainder of my tree when recursing here. So:
update :: Tree a -> Int -> a -> Tree a
update Leaf _ _ = Leaf
update (Node l x r) index c
| s == index = (Node l c r)
| index < s = update (Node l2 x r) index c
| otherwise = update (Node l x r2) (index - s - 1) c
where
s = size l
l2 = l
r2 = r
Edit 2(Silly me!):
update :: Tree a -> Int -> a -> Tree a
update Leaf _ _ = Leaf
update (Node l x r) index c
| s == index = (Node l c r)
| index < s = (Node (upd l index c) x r)
| otherwise = (Node l x (upd r (index - s - 1) c))
where
s = size l
It took me a bit to wrap my head around it. Thank you for the comments!
Since in Haskell all data is immutable, you can not "update" a tree, you construct a new tree. That tree might however have references to subtrees of the old tree. You thus do not per se construct a completely new tree.
You managed to create an "updated" node, so now the only thing that is missing, is to use the "updated" subtree in a new tree. In that tree, you can use the "old" value together with the other subtree to construct a new one, like:
update :: Tree a -> Int -> a -> Tree a
update Leaf _ _ = Leaf
update (Node l x r) index c
| s == index = Node l c r
| index < s = Node (update l index c) x r
| otherwise = Node l x (update r (index - s - 1) c)
where s = size l
You might also need to change the Leaf case if you "count" a leaf as a node.
Trees with indices are not very common. It might also - in order to boost performance - be better to keep track of the number of items in the left subchild (or both), since then we can just pick the left or right subchild without counting the children. By keeping track of the number, for a complete tree, then updating the tree is an O(log n) operation, not an O(n) operation.
I am having a huge problem with this. I don't have any idea how to make Huffman tree since it is being built bottom-up (from the liefs to the root).
I am new to Haskell and functional programming. I have seen there are other posts similar to mine, but they did not help me.
This is my code
import Data.Map
type Value = Int
type Key = [Char]
type NodeValue = (Key,Value)
data Heap_ a = Empty
| Node a (Heap_ a) (Heap_ a)
deriving(Show, Eq)
type Heap a = Heap_ NodeValue
frequencyOfCharacters :: [Char] -> Map Key Value
frequencyOfCharacters [] = Data.Map.empty
frequencyOfCharacters (character:text) = insertWith (+) [character] 1 (frequencyOfCharacters text)
makeLeaf :: NodeValue -> Heap a
makeLeaf a = Node a Empty Empty
mergeHeaps :: Heap a -> Heap a -> Heap a
mergeHeaps Empty rightHeap = rightHeap
mergeHeaps leftHeap Empty = leftHeap
mergeHeaps leftHeap#(Node a lefta righta) rightHeap#(Node b leftb rightb)
| snd a < snd b = Node a (mergeHeaps lefta rightHeap) righta
| otherwise = Node b leftb (mergeHeaps leftHeap rightb)
addToHeap :: Heap a->NodeValue->Heap a
addToHeap Empty a = makeLeaf a
addToHeap h a = mergeHeaps h (makeLeaf a)
takeHeadFromHeap :: Heap a -> (NodeValue,Heap a)
takeHeadFromHeap Empty = (("",-1), Empty)
takeHeadFromHeap (Node a leftBranch rightBranch) = (a, mergeHeaps leftBranch rightBranch)
makeHeap :: Map Key Value -> Heap a
makeHeap map_ = makeHeap_ $ toList map_
makeHeap_ :: [(Key,Value)] -> Heap a
makeHeap_ [] = Empty
makeHeap_ (x:xs) = addToHeap (makeHeap_ xs) x
huffmanEntry :: [Char]-> Heap a
huffmanEntry text = makeHeap $ frequencyOfCharacters text
I am thinking about this data structure for Huffman tree
data HuffmanTree h = Leaf [Char]
| NodeHuff [Char] (HuffmanTree h) (HuffmanTree h)
deriving(Show, Eq)
but i have no idea how to make Huffman tree from min heap.
After this line of code in ghci min heap is made from input string
*Main> huffmanEntry "Aasdqweqweasd"
You need to make a Huffman Tree with a min heap, and you said "I have no idea how to make Huffman Tree from min heap". Let's figure out what you need to do before you start coding, especially in a language that you might not be familiar with.
I suppose we should check the internet for a way to make a Huffman Tree. How about the Wikipedia page on Huffman Coding? (https://en.wikipedia.org/wiki/Huffman_coding)
The simplest construction algorithm uses a priority queue where the
node with lowest probability is given highest priority:
Create a leaf node for each symbol and add it to the priority queue.
While there is more than one node in the queue:
Remove the two nodes of highest priority (lowest probability) from
the queue
Create a new internal node with these two nodes as
children and with probability equal to the sum of the two nodes'
probabilities.
Add the new node to the queue.
The remaining node is the root node and the tree is complete.
You already have code in place to find the frequency of each symbol in a given string - that's your frequencyOfCharacters function.
All you need now is a priority queue! You can definitely find a way to implement a priority queue using a min heap.
I hope this helps you piece the logic together.
If you want to deal with the problem step-by-step, why don't you start by trying to make a Huffman Tree using a working implementation of a priority queue (http://hackage.haskell.org/package/PSQueue)?
Once you're done with that, you can try to replace this readymade module with a small queue module of your own using a working implementation of a min heap (http://hackage.haskell.org/package/heap).
Finally, you can write a barebones min heap module by yourself (you have a lot of the code already) and replace the external heap module with that.
Update: Some more concrete suggestions on how to build the tree. This requires a little setup, so please bear with me. Suppose you have a Tree.hs module that allows you to work with binary trees:
module Tree where
-- Binary Tree
data Tree k v =
Empty
| Node (k, v) (Tree k v) (Tree k v)
deriving ( Show )
-- takes a (key, value) pair and returns a binary tree
-- containing one node with that pair
singleton :: (k, v) -> Tree k v
singleton = undefined
-- takes three things: a (key, value) pair, a binary tree t1
-- and another binary tree t2
-- then it constructs the tree
-- (key, val)
-- / \
-- t1 t2
joinWith :: (k, v) -> Tree k v -> Tree k v -> Tree k v
joinWith = undefined
-- returns the value associated with the (key, value) pair
-- stored in the root node of the binary tree
value :: Tree k v -> v
value = undefined
and you also have a Queue.hs module which lets you work with priority queues (I'm assuming you have a working min-heap module)
module Queue where
import Heap
-- a priority queue
type Queue k v = Heap k v
-- returns an empty queue
empty :: (Ord v) => Queue k v
empty = undefined
-- adds a (key, value) pair to the queue and returns a
-- new copy of the queue containing the inserted pair
enqueue :: (Ord v) => (k, v) -> Queue k v -> Queue k v
enqueue = undefined
-- removes the lowest-value (key, value) pair from the queue
-- and returns a tuple consisting of the removed pair
-- and a copy of the queue with the pair removed
dequeue :: (Ord v) => Queue k v -> ((k, v), Queue k v)
dequeue = undefined
-- returns the number of elements in the queue
size :: (Ord v) => Queue k v -> Int
size = undefined
Then this is how you might try to make a Huffman.hs module using the tools at your disposal.
module Huffman where
import Queue
import Tree
type HuffmanTree = Tree Char Int
-- takes a list of (character, frequency) pairs and turns them into
-- a Huffman Tree
makeHuffmanTree :: [(Char, Int)] -> HuffmanTree
makeHuffmanTree pairs = let
nodeList = map (\pair -> (singleton pair, snd pair)) pairs
nodeQueue = foldr enqueue empty nodeList
in
reduceNodes nodeQueue
-- takes pairs of nodes from the queue and combines them
-- till only one node containing the full Huffman Tree is
-- present in the queue
-- then this last node is dequeued and returned
reduceNodes :: Queue HuffmanTree Int -> HuffmanTree
reduceNodes q
| size q == 0 = error "no nodes!"
| size q == 1 = fst (fst (dequeue q))
| otherwise = let
((tree1, freq1), q') = dequeue q
((tree2, freq2), q'') = dequeue q'
freqSum = freq1 + freq2
newTree = joinWith ('.', freqSum) tree1 tree2
in
reduceNodes (enqueue (newTree, freqSum) q'')
Since the types check out, I successfully compiled a stack project with these modules. When you think you have the Huffman Tree-building code you want, you can just fill in the undefined functions with what they're actually supposed to do and you're good!
I have the defined Type: data Tree = Node Tree Tree | Leaf Int | NIL. I want create a method delete :: Tree -> Int -> Tree which removes all Leaf's with the specific Int given in the second parameter.
If your tree doesn't have any particular structure, you can do
delete NIL _ = NIL
delete (Leaf i) int | i == int = NIL
| otherwise = Leaf i
delete (Node left right) int = Node (delete left int) (delete right int)
Why?
delete NIL _ = NIL because we have to deal with all cases, even the empty trees at the ends. The _ stands for any value that we don't care about.
delete (Leaf i) int | i == int = NIL
| otherwise = Leaf i
because we need to first check | i== int to see whether we want to delete the node. If we do, we replace it with the empty three, NIL. Otherwise, we leave it alone.
delete (Node left right) int = Node (delete left int) (delete right int) because if we're at a node, we need to delete the int from both left and right subtrees.
Aren't you going to end up with a whole load of NILs?
Yes, I suppose that could happen. You could clear up with
prune (Node NIL NIL ) = NIL
prune (Node (Leaf i) NIL ) = Leaf i
prune (Node NIL (Leaf i)) = Leaf i
prune (Node (Leaf i) (Leaf j)) = Node (Leaf i) (Leaf j)
prune (Node left right ) = prune (Node (prune left) (prune right))
prune t = t
The first three lines get rid of a NIL on the left, right or both, and the fourth leaves two leaves alone.
The fifth line only gets called when one of the left or right subtrees of this node is itself a node. Why prune three times? Maybe when you prune left and prune right one or more of them ends up NIL.
The prune t = t deals with both NIL and Leaf i in one neat pattern match.
I would suggest some improvements to AndrewC's answer. While his solution is absolutely correct, it has some potential performance issue.
The issue is: both delete and prune functions create a new copy of the whole tree on every call. That happens regardless of whether an element was actually deleted.
The worst case scenario is deleting a non-existing element.
Let's say we have a really big tree that holds 1M of integers. Since integers are stored in leaves only, the whole tree contains at least 2M-1 of nodes. (Or even more if the tree was not pruned yet thus contains NIL nodes).
When we try to delete a non-existing element from such a huge tree, our delete function will do absolutely nothing but duplicating all 2M of nodes. (And prune will duplicate them again!)
Deleting an existing element is just a tiny bit better. At this case, delete removes one leaf, updates it's parent and duplicates the rest of the tree. prune will probably remove a few more nodes but will duplicate the rest.
Why does that happen?
There are two places where duplication happens.
This line creates a new tree that is completely identical to the argument:
delete (Leaf i) int | ...
| otherwise = Leaf i
Also, this line creates a new tree even if the element is not present in both left and right branches:
delete (Node left right) int = Node (delete left int) (delete right int)
Is it possible to avoid unnecessary duplication?
Yes, of course. We just need to return the argument if we do not modify it.
Here is my version:
delete t i = fst $ delete' t i
where delete' NIL _ = (NIL, True)
delete' t#(Leaf i) int | i == int = (NIL, False)
| otherwise = (t, True)
delete' t#(Node left right) int =
let (left', unchangedLeft) = delete' left int
(right', unchangedRight) = delete' right int
in
if unchangedLeft && unchangedRight
then (t, True)
else (Node left' right', False)
As you can see, I use a helper function delete' that returns a pair of (Tree, Bool) where second element is True if the tree was not changed, and False otherwise.
This function builds a new tree that shares most of the nodes with the original one. It only changes nodes on the path from root to the deleted element.
What about prune ?
The version above does not delete NIL elements. As AndrewC noted, after performing multiple deletes we may have a tree with a lot of NILs. To address this issue, we can either modify prune in the similar way, or we can merely integrate it into the delete:
delete t i = fst $ delete'' t i
where delete'' NIL _ = (NIL, True)
delete'' t#(Leaf i) int | i == int = (NIL, False)
| otherwise = (t, True)
delete'' t#(Node left right) int =
let (left', unchangedLeft) = delete'' left int
(right', unchangedRight) = delete'' right int
in
if unchangedLeft && unchangedRight
then (t, True)
else (newNode left' right', False)
newNode NIL r = r
newNode l NIL = l
newNode l r = Node l r
Assume I have a binary tree:
data Bst a = Empty | Node (Bst a) a (Bst a)
I have to write a function that searches for a value and returns the number of its children. If there is no node with this value, it returns -1. I was trying to write both BFS and DFS, and I failed with both.
Pattern matching is your friend. Your Bst can either be Empty or a Node, so at the toplevel, your search function will be
search Empty = ...
search (Node left x right) = ...
Can an Empty tree possibly contain the target value? With a Node the target value, if present, will be either the node value (x above), in the left subtree, in the right subtree—or perhaps some combination of these.
By “return[ing] the number of its children,” I assume you mean the total number of descendants of the Bst rooted at a Node whose value is the target, which is an interesting combination of problems. You will want another function, say numChildren, whose definition uses pattern matching as above. Considerations:
How many descendants does an Empty tree have?
In the Node case, x doesn’t count because you want descendants. If only you had a function to count the number of children in the left and right subtrees …
Here is a way to do this. Breath-first search can actually be a bit tricky to implement and this solution (findBFS) has aweful complexity (appending to the list is O(n)) but you'll get the gist.
First I have decided to split out the finding functions to return the tree where the node element matches. That simplifies splitting out the counting function. Also, it is easier to return the number of elements than the number of descendants and return -1 in case not found, so the numDesc functions rely on the numElements function.
data Tree a = Empty
| Node a (Tree a) (Tree a)
numElements :: Tree a -> Int
numElements Empty = 0
numElements (Node _ l r) = 1 + numElements l + numElements r
findDFS :: Eq a => a -> Tree a -> Tree a
findDFS _ Empty = Empty
findDFS x node#(Node y l r) | x == y = node
| otherwise = case findDFS x l of
node'#(Node _ _ _) -> node'
Empty -> findDFS x r
findBFS :: Eq a => a -> [Tree a] -> Tree a
findBFS x [] = Empty
findBFS x ((Empty):ts) = findBFS x ts
findBFS x (node#(Node y _ _):ts) | x == y = node
findBFS x ((Node _ l r):ts) = findBFS x (ts ++ [l,r])
numDescDFS :: Eq a => a -> Tree a -> Int
numDescDFS x t = numElements (findDFS x t) - 1
numDescBFS :: Eq a => a -> Tree a -> Int
numDescBFS x t = numElements (findBFS x [t]) - 1
Can someone tell me why this code isn't producing what I want.
data BST = MakeNode BST String BST
| Empty
add :: String -> BST -> BST
add new Empty = (MakeNode Empty new Empty)
add string tree#(MakeNode left value right)
| string > value = MakeNode left value (add string right)
| string < value = MakeNode (add string left) value right
| otherwise = tree
output
"John"
"Doug"
"Charlie"
"Alice"
listToBST :: [String] -> BST
listToBST = foldr add Empty
If we create and function which takes a BST and returns a list in sorted order, modelled after sort . nub, then your Tree is fine as quickcheck tells us. QuickCheck is very easy to use.
import Data.List
import Test.QuickCheck
data BST = MakeNode BST String BST
| Empty
deriving (Show)
add :: String -> BST -> BST
add new Empty = (MakeNode Empty new Empty)
add string tree#(MakeNode left value right)
| string > value = MakeNode left value (add string right)
| string < value = MakeNode (add string left) value right
| otherwise = tree
test = ["alice", "blup", "test", "aa"]
manual_test = inorder (foldr add Empty test) == sort (nub test)
prop_inorder = property inorder_test
where inorder_test :: [String] -> Bool
inorder_test xs = inorder (foldr add Empty xs) == sort (nub xs)
-- return sorted nodes
inorder :: BST -> [String]
inorder (Empty) = []
inorder (MakeNode l x r) = inorder l ++ (x : inorder r)
Just load ghci and then run quickCheck prop_inorder.
Other useful functions are:
reverseOrder :: BST -> [String]
reverseOrder Empty = []
reverseOrder (MakeNode l x r) = reverseOrder r ++ (x : reverseOrder r)
asList :: BST -> [String]
asList Empty = []
asList (MakeNode l x r) = x : (asList l ++ asList r)
And also think about making your tree more general by parameterizing over a:
data BST a = Empty | MakeNode (BST a) a (BST a)
You can make it than an instance of Functor, Monad, Foldable and all kind of handy typeclasses.
I tried it and it seems ok to me. It could help if you gave an example of an input that it doesn't work for.
I think the problem may be that string comparison does not work the way you expect ("123" < "7" because "1" < "7"). If I'm right, you might want to use Ints instead of Strings or even better, the class Ord of all the types that can be ordered using (<).