I need to make an implementation of a priority queue with a Heap Tree in Haskell, for example:
Given a list: [3,2,7,8,4,1,9]
3 is the main root
2 is its left leaf
7 is its right leaf
8 is the left leaf of 2
4 is the right leaf of 2
1 is the left leaf of 7
9 is the right leaf of 7
If I want to heapifiy the tree it would be like this:
7 > 3 so we exchange them
8 > 2 we exchange them
8 > 7 we exchange them
9 > 3 we exchange them
9 > 8 we exchange them
We end with a list like this: [9,7,8,2,4,1,3]
And 9 is the element with the highest number (priority) in our queue.
I will need to do this:
insert h e that inserts the element e in the heap h (in the last position)
delete h that removes the element with the highest priority (in our example 9)
heapify h that heapifies the tree.
But my problem is the heapify function, I dont even know where to start. That's why im asking for clues or advice.
module Heapify where
Let's use the tree type
data Tree a = Leaf a | Node (Tree a) a (Tree a)
deriving Show
and the example tree
ourTree = Node (Node (Leaf 8) 2 (Leaf 4)) 3 (Node (Leaf 1) 7 (Leaf 9))
And work out how to heapify it.
Your description in pictures
Top node
Left subtree
Right subtree
Heapified?
In this case, the result is indeed a heap, but this method isn't the standard way of doing heapification, and doesn't generalise (as far as I can tell) to something that does make sure you have a heap. Credit to Will Ness for pointing this out.
How should we heapify?
A tree satisfies the heap property if each parent node is no smaller than its child nodes. (It says nothing about the compararive sizes of the child nodes.)
Heapification actually works a bit like insertion sort, in that you start at the low end, and work gradually up, dragging small elements back into place as you introduce them.
Step 1: Heapify the left and right subtrees
Step 2: This node: Check if the top value should be pulled down
Step 3: If so, heapify at that side again
Steps 1,2 and 4 are just recursive calls, so let's concentrate on the top node:
Top node
We need to (a) see the value at the top of the subtrees and (b) be able to replace it.
atTop :: Tree a -> a
atTop (Leaf a) = a
atTop (Node _ a _) = a
replaceTop :: Ord a => Tree a -> a -> Tree a
replaceTop (Leaf _) a = Leaf a
replaceTop (Node l _ r) a = heapify (Node l a r)
Notice the cheeky forward reference to heapify? When we replace the top node of a tree, we need to re-heapify it to make sure it's still a tree.
Now let's see how to adjust at the left hand side if necessary.
It's necessary if the top of the left subtree, topL, is larger than the value a at the node. If it's <= we don't need to do anything, so leave the node alone.
adjustLeft :: Ord a => Tree a -> Tree a
adjustLeft (Leaf a) = Leaf a -- But we shouldn't ask to do this.
adjustLeft node#(Node l a r)
| topL <= a = node
| otherwise = Node (replaceTop l a) topL r
where topL = atTop l
and at the right:
Now let's adjust at the right hand side if necessary. This works exactly the same.
adjustRight :: Ord a => Tree a -> Tree a
adjustRight (Leaf a) = Leaf a -- But we shouldn't ask to do this.
adjustRight node#(Node l a r)
| topR <= a = node
| otherwise = Node l topR (replaceTop r a)
where topR = atTop r
Let's see some of that working:
*Heapify> ourTree
Node (Node (Leaf 8) 2 (Leaf 4)) 3 (Node (Leaf 1) 7 (Leaf 9))
*Heapify> atTop ourTree
3
Pull down to the Left or right?
If the current value belongs lower down the tree, we need to pull it down the left or the right side, by swapping it with the larger value of the two. We pick the larger value so we know it's more than the top value in the left subtree.
doTop :: Ord a => Tree a -> Tree a
doTop (Leaf a) = Leaf a
doTop node#(Node l a r)
| atTop l > atTop r = adjustLeft node
| otherwise = adjustRight node
Remember that adjustLeft and adjustRight make a recursive call to heapify.
Return of the heapification
So to heapify, we just
heapify :: Ord a => Tree a -> Tree a
heapify (Leaf a) = Leaf a
heapify (Node l a r) = doTop (Node (heapify l) a (heapify r))
OK, that was easy. Let's test it:
*Heapify> ourTree
Node (Node (Leaf 8) 2 (Leaf 4)) 3 (Node (Leaf 1) 7 (Leaf 9))
*Heapify> heapify ourTree
Node (Node (Leaf 2) 8 (Leaf 4)) 9 (Node (Leaf 1) 7 (Leaf 3))
Related
I recently started using Haskell and it will probably be for a short while. Just being asked to use it to better understand functional programming for a class I am taking at Uni.
Now I have a slight problem I am currently facing with what I am trying to do. I want to build it breadth-first but I think I got my conditions messed up or my conditions are also just wrong.
So essentially if I give it
[“A1-Gate”, “North-Region”, “South-Region”, “Convention Center”, “Rectorate”, “Academic Building1”, “Academic Building2”] and [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2], my tree should come out like
But my test run results are haha not what I expected. So an extra sharp expert in Haskell could possibly help me spot what I am doing wrong.
Output:
*Main> l1 = ["A1-Gate", "North-Region", "South-Region", "Convention Center",
"Rectorate", "Academic Building1", "Academic Building2"]
*Main> l3 = [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2]
*Main> parkingtree = createBinaryParkingTree l1 l3
*Main> parkingtree
Node "North-Region" 0.5
(Node "A1-Gate" 0.0 EmptyTree EmptyTree)
(Node "Convention Center" 0.3
(Node "South-Region" 0.7 EmptyTree EmptyTree)
(Node "Academic Building2" 1.4
(Node "Academic Building1" 1.2 EmptyTree EmptyTree)
(Node "Rectorate" 0.6 EmptyTree EmptyTree)))
A-1 Gate should be the root but it ends up being a child with no children so pretty messed up conditions.
If I could get some guidance it would help. Below is what I've written so far::
data Tree = EmptyTree | Node [Char] Float Tree Tree deriving (Show,Eq,Ord)
insertElement location cost EmptyTree =
Node location cost EmptyTree EmptyTree
insertElement newlocation newcost (Node location cost left right) =
if (left == EmptyTree && right == EmptyTree)
then Node location cost (insertElement newlocation newcost EmptyTree)
right
else if (left == EmptyTree && right /= EmptyTree)
then Node location cost (insertElement newlocation newcost EmptyTree)
right
else if (left /= EmptyTree && right == EmptyTree)
then Node location cost left
(insertElement newlocation newcost EmptyTree)
else Node newlocation newcost EmptyTree
(Node location cost left right)
buildBPT [] = EmptyTree
--buildBPT (xs:[]) = insertElement (fst xs) (snd xs) (buildBPT [])
buildBPT (x:xs) = insertElement (fst x) (snd x) (buildBPT xs)
createBinaryParkingTree a b = buildBPT (zip a b)
Thank you for any guidance that might be provided. Yes I have looked at some of the similar questions I do think my problem is different but if you think a certain post has a clear answer that will help I am willing to go and take a look at it.
Here's a corecursive solution.
{-# bft(Xs,T) :- bft( Xs, [T|Q], Q). % if you don't read Prolog, see (*)
bft( [], Nodes , []) :- maplist( =(empty), Nodes).
bft( [X|Xs], [N|Nodes], [L,R|Q]) :- N = node(X,L,R),
bft( Xs, Nodes, Q).
#-}
data Tree a = Empty | Node a (Tree a) (Tree a) deriving Show
bft :: [a] -> Tree a
bft xs = head nodes -- Breadth First Tree
where
nodes = zipWith g (map Just xs ++ repeat Nothing) -- values and
-- Empty leaves...
(pairs $ tail nodes) -- branches...
g (Just x) (lt,rt) = Node x lt rt
g Nothing _ = Empty
pairs ~(a: ~(b:c)) = (a,b) : pairs c
{-
nodes!!0 = g (Just (xs!!0)) (nodes!!1, nodes!!2) .
nodes!!1 = g (Just (xs!!1)) (nodes!!3, nodes!!4) . .
nodes!!2 = g (Just (xs!!2)) (nodes!!5, nodes!!6) . . . .
................ .................
-}
nodes is the breadth-first enumeration of all the subtrees of the result tree. The tree itself is the top subtree, i.e., the first in this list. We create Nodes from each x in the input xs, and when the input
is exhausted we create Emptys by using an indefinite number of Nothings instead (the Empty leaves' true length is length xs + 1 but we don't need to care about that).
And we didn't have to count at all.
Testing:
> bft [1..4]
Node 1 (Node 2 (Node 4 Empty Empty) Empty) (Node 3 Empty Empty)
> bft [1..10]
Node 1
(Node 2
(Node 4
(Node 8 Empty Empty)
(Node 9 Empty Empty))
(Node 5
(Node 10 Empty Empty)
Empty))
(Node 3
(Node 6 Empty Empty)
(Node 7 Empty Empty))
How does it work: the key is g's laziness, that it doesn't force lt's nor rt's value, while the tuple structure is readily served by -- very lazy in its own right -- pairs. So both are just like the not-yet-set variables in that Prolog pseudocode(*), when served as 2nd and 3rd arguments to g. But then, for the next x in xs, the node referred to by this lt becomes the next invocation of g's result.
And then it's rt's turn, etc. And when xs end, and we hit the Nothings, g stops pulling the values from pairs's output altogether. So pairs stops advancing on the nodes too, which is thus never finished though it's defined as an unending stream of Emptys past that point, just to be on the safe side.
(*) Prolog's variables are explicitly set-once: they are allowed to be in a not-yet-assigned state. Haskell's (x:xs) is Prolog's [X | Xs].
The pseudocode: maintain a queue; enqueue "unassigned pointer"; for each x in xs: { set pointer in current head of the queue to Node(x, lt, rt) where lt, rt are unassigned pointers; enqueue lt; enqueue rt; pop queue }; set all pointers remaining in queue to Empty; find resulting tree in the original head of the queue, i.e. the original first "unassigned pointer" (or "empty box" instead of "unassigned pointer" is another option).
This Prolog's "queue" is of course fully persistent: "popping" does not mutate any data structure and doesn't change any outstanding references to the queue's former head -- it just advances the current pointer into the queue. So what's left in the wake of all this queuing, is the bfs-enumeration of the built tree's nodes, with the tree itself its head element -- the tree is its top node, with the two children fully instantiated to the bottom leaves by the time the enumeration is done.
Update: #dfeuer came up with much simplified version of it which is much closer to the Prolog original (that one in the comment at the top of the post), that can be much clearer. Look for more efficient code and discussion and stuff in his post. Using the simple [] instead of dfeuer's use of the more efficient infinite stream type data IS a = a :+ IS a for the sub-trees queue, it becomes
bftree :: [a] -> Tree a
bftree xs = t
where
t : q = go xs q
go [] _ = repeat Empty
go (x:ys) ~(l : ~(r : q)) = Node x l r : go ys q
---READ-- ----READ---- ---WRITE---
{-
xs = [ x x2 x3 x4 x5 x6 x7 x8 … ]
(t:q) = [ t l r ll lr rl rr llr … Empty Empty … … ]
-}
For comparison, the opposite operation of breadth-first enumeration of a tree is
bflist :: Tree a -> [a]
bflist t = [x | Node x _ _ <- q]
where
q = t : go 1 q
go 0 _ = []
go i (Empty : q) = go (i-1) q
go i (Node _ l r : q) = l : r : go (i+1) q
-----READ------ --WRITE--
How does bftree work: t : q is the list of the tree's sub-trees in breadth-first order. A particular invocation of go (x:ys) uses l and r before they are defined by subsequent invocations of go, either with another x further down the ys, or by go [] which always returns Empty. The result t is the very first in this list, the topmost node of the tree, i.e. the tree itself.
This list of tree nodes is created by the recursive invocations of go at the same speed with which the input list of values xs is consumed, but is consumed as the input to go at twice that speed, because each node has two child nodes.
These extra nodes thus must also be defined, as Empty leaves. We don't care how many are needed and simply create an infinite list of them to fulfill any need, although the actual number of empty leaves will be one more than there were xs.
This is actually the same scheme as used in computer science for decades for array-backed trees where tree nodes are placed in breadth-first order in a linear array. Curiously, in such setting both conversions are a no-op -- only our interpretation of the same data is what's changing, our handling of it, how are we interacting with / using it.
Update: the below solution is big-O optimal and (I think) pretty easy to understand, so I'm leaving it here in case anyone's interested. However, Will Ness's solution is much more beautiful and, especially when optimized a bit, can be expected to perform better in practice. It is much more worthy of study!
I'm going to ignore the fake edge labels for now and just focus on the core of what's happening.
A common pattern in algorithm design is that it's sometimes easier to solve a more general problem. So instead of trying to build a tree, I'm going to look at how to build a forest (a list of trees) with a given number of trees. I'll make the node labels polymorphic to avoid having to think about what they look like; you can of course use the same building technique with your original tree type.
data Tree a = Empty | Node a (Tree a) (Tree a)
-- Built a tree from a breadth-first list
bft :: [a] -> Tree a
bft xs = case dff 1 xs of
[] -> Empty
[t] -> t
_ -> error "something went wrong"
-- Build a forest of nonempty trees.
-- The given number indicates the (maximum)
-- number of trees to build.
bff :: Int -> [a] -> [Tree a]
bff _ [] = []
bff n xs = case splitAt n xs of
(front, rear) -> combine front (bff (2 * n) rear)
where
combine :: [a] -> [Tree a] -> [Tree a]
-- you write this
Here's a full, industrial-strength, maximally lazy implementation. This is the most efficient version I've been able to come up with that's as lazy as possible. A slight variant is less lazy but still works for fully-defined infinite inputs; I haven't tried to test which would be faster in practice.
bft' :: [a] -> Tree a
bft' xs = case bff 1 xs of
[] -> Empty
[t] -> t
_ -> error "whoops"
bff' :: Int -> [a] -> [Tree a]
bff' !_ [] = []
bff' n xs = combine n xs (bff (2 * n) (drop n xs))
where
-- The "take" portion of the splitAt in the original
-- bff is integrated into this version of combine. That
-- lets us avoid allocating an intermediate list we don't
-- really need.
combine :: Int -> [a] -> [Tree a] -> [Tree a]
combine 0 !_ ~[] = [] -- These two lazy patterns are just documentation
combine _k [] ~[] = []
combine k (y : ys) ts = Node y l r : combine (k - 1) ys dropped
where
(l, ~(r, dropped)) = case ts of -- This lazy pattern matters.
[] -> (Empty, (Empty, []))
t1 : ts' -> (t1, case ts' of
[] -> (Empty, [])
t2 : ts'' -> (t2, ts''))
For the less-lazy variant, replace (!l, ~(!r, dropped)) with (!l, !r, dropped) and adjust the RHS accordingly.
For true industrial strength, forests should be represented using lists strict in their elements:
data SL a = Cons !a (SL a) | Nil
And the pairs in the above (l, ~(r, dropped)) should both be represented using a type like
data LSP a b = LSP !a b
This should avoid some (pretty cheap) run-time checks. More importantly, it makes it easier to see where things are and aren't getting forced.
The method that you appear to have chosen is to build the tree up backwards: from bottom-to-top, right-to-left; starting from the last element of your list. This makes your buildBPT function look nice, but requires your insertElement to be overly complex. To construct a binary tree in a breadth-first fashion this way would require some difficult pivots at every step past the first three.
Adding 8 nodes to the tree would require the following steps (see how the nodes are inserted from last to first):
. 4
6 6
8 7 8 . .
. .
3
7 4 5
8 . 6 7 8 .
6 2
7 8 3 4
5 6 7 8
5
6 7 1
8 . . . 2 3
4 5 6 7
8 . . . . . . .
If, instead, you insert the nodes left-to-right, top-to-bottom, you end up with a much simpler solution, requiring no pivoting, but instead some tree structure introspection. See the insertion order; at all times, the existing values remain where they were:
. 1
2 3
1 4 5 . .
. .
1
1 2 3
2 . 4 5 6 .
1 1
2 3 2 3
4 5 6 7
1
2 3 1
4 . . . 2 3
4 5 6 7
8 . . . . . . .
The insertion step has an asymptotic time complexity on the order of O(n^2) where n is the number of nodes to insert, as you are inserting the nodes one-by-one, and then iterating the nodes already present in the tree.
As we insert left-to-right, the trick is to check whether the left sub-tree is complete:
if it is, and the right sub-tree is not complete, then recurse to the right.
if it is, and the right sub-tree is also complete, then recurse to the left (starting a new row).
if it is not, then recurse to the left.
Here is my (more generic) solution:
data Tree a = Leaf | Node a (Tree a) (Tree a)
deriving (Eq, Show)
main = do
let l1 = ["A1-Gate", "North-Region", "South-Region", "Convention Center",
"Rectorate", "Academic Building1", "Academic Building2"]
let l2 = [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2]
print $ treeFromList $ zip l1 l2
mkNode :: a -> Tree a
mkNode x = Node x Leaf Leaf
insertValue :: Tree a -> a -> Tree a
insertValue Leaf y = mkNode y
insertValue (Node x left right) y
| isComplete left && nodeCount left /= nodeCount right = Node x left (insertValue right y)
| otherwise = Node x (insertValue left y) right
where nodeCount Leaf = 0
nodeCount (Node _ left right) = 1 + nodeCount left + nodeCount right
depth Leaf = 0
depth (Node _ left right) = 1 + max (depth left) (depth right)
isComplete n = nodeCount n == 2 ^ (depth n) - 1
treeFromList :: (Show a) => [a] -> Tree a
treeFromList = foldl insertValue Leaf
EDIT: more detailed explanation:
The idea is to remember in what order you insert nodes: left-to-right first, then top-to-bottom. I compressed the different cases in the actual function, but you can expand them into three:
Is the left side complete? If not, then insert to the left side.
Is the right side as complete as the left side, which is complete? If not, then insert to the right side.
Both sides are full, so we start a new level by inserting to the left side.
Because the function fills the nodes up from left-to-right and top-to-bottom, then we always know (it's an invariant) that the left side must fill up before the right side, and that the left side can never be more than one level deeper than the right side (nor can it be shallower than the right side).
By following the growth of the second set of example trees, you can see how the values are inserted following this invariant. This is enough to describe the process recursively, so it extrapolates to a list of any size (the recursion is the magic).
Now, how do we determine whether a tree is 'complete'? Well, it is complete if it is perfectly balanced, or if – visually – its values form a triangle. As we are working with binary trees, then the base of the triangle (when filled) must have a number of values equal to a power of two. More specifically, it must have 2^(depth-1) values. Count for yourself in the examples:
depth = 1 -> base = 1: 2^(1-1) = 1
depth = 2 -> base = 2: 2^(2-1) = 2
depth = 3 -> base = 4: 2^(3-1) = 4
depth = 4 -> base = 8: 2^(4-1) = 8
The total number of nodes above the base is one less than the width of the base: 2^(n-1) - 1. The total number of nodes in the complete tree is therefore the number of nodes above the base, plus those of the base, so:
num nodes in complete tree = 2^(depth-1) - 1 + 2^(depth-1)
= 2 × 2^(depth-1) - 1
= 2^depth - 1
So now we can say that a tree is complete if it has exactly 2^depth - 1 non-empty nodes in it.
Because we go left-to-right, top-to-bottom, when the left side is complete, we move to the right, and when the right side is just as complete as the left side (meaning that it has the same number of nodes, which is means that it is also complete because of the invariant), then we know that the whole tree is complete, and therefore a new row must be added.
I originally had three special cases in there: when both nodes are empty, when the left node is empty (and therefore so was the right) and when the right node is empty (and therefore the left could not be). These three special cases are superseded by the final case with the guards:
If both sides are empty, then countNodes left == countNodes right, so therefore we add another row (to the left).
If the left side is empty, then both sides are empty (see previous point).
If the right side is empty, then the left side must have depth 1 and node count 1, meaning that it is complete, and 1 /= 0, so we add to the right side.
I want to implement a function which deletes any empty children in a tree:
makeUnhollow (Node 5 Empty Empty) => Leaf 5
makeUnhollow (Leaf 5) => Leaf 5
makeUnhollow (Node 5 (Leaf 4) Empty) => (Node 5 (Leaf 4) Empty)
This is my current code:
makeUnhollow :: Tree a -> Tree a
makeUnhollow (Node a Empty Empty)= Leaf a
makeUnhollow (Leaf a) = Leaf a
makeUnhollow a = a
But somehow I'm getting failures for this code:
Tests.hs:130:
wrong result
expected: Node 6 (Leaf 5) (Leaf 7)
but got: Node 6 (Node 5 Empty Empty) (Node 7 Empty Empty)
Deleting all Empty children in a tree seems a little difficult:
Node a Empty Empty can become Leaf a
What if your root node is Empty?
What will Node a Empty (Leaf b) be?
I understand from your test that your goal is just to turn Node a Empty Empty into Leaf a, and not care when only one child is Empty. Mark Seemann's suggestion to turn makeUnhollow into a recursive function means you have to make it call itself in the last case of:
makeUnhollow :: Tree a -> Tree a
makeUnhollow (Node a Empty Empty) = Leaf a
makeUnhollow (Leaf a) = Leaf a
makeUnhollow Empty = ? -- don't forget to match Empty
makeUnhollow (Node a left right) = ? -- recurse on left, right :: Tree a
I might just call the function unhollow since that's an imperative verb, too.
So, this tree is NOT a Binary Search Tree. It is in no particular order, and is just in this order for quick access to specific indices (nth element), rather than whether an element exists or not.
The form of the Tree is like so:
data Tree a = Leaf a | Node Int (Tree a) (Tree a) deriving Show
For this specific tree, the "Int" from the Node constructor is the number of elements underneath that node (or number of leaves).
Using this structure, I copied parts of the Tree functions available in a lecture I found online (that I slightly modified when trying to understand):
buildTree :: [a] -> Tree a
buildTree = growLevel . map Leaf
where
growLevel [node] = node
growLevel l = growLevel $ inner l
inner [] = []
inner (e1:e2:rest) = e1 <> e2 : inner rest
inner xs = xs
join l#(Leaf _) r#(Leaf _) = Node 2 l r
join l#(Node ct _ _) r#(Leaf _) = Node (ct+1) l r
join l#(Leaf _) r#(Node ct _ _) = Node (ct+1) l r
join l#(Node ctl _ _) r#(Node ctr _ _) = Node (ctl+ctr) l r
And I was able to create some basic functions for moving through a tree. I made one that finds the nth element and returns it. I also made a Path datatype and implemented a function to return the path (in left and rights) to a specific index, and one function that can travel through a path and return that Node/Leaf.
Now, what I would like to make is a delete function. The problem here is with the fact that the tree is "leafy", or at least that is what is causing me difficulties.
If I end up with a Leaf at the deletion path, there is no "Null" or equivalent item to replace it with. Additionally, if I try to stop at the last path (like [L]), and check if that's a Node or not, then if it's a leaf replace the whole node with the opposite side etc., I run into the problem of changing the whole tree to reflect that change, not just return the end of the deletion, and change all the numbers from the tree to reflect the change in leaves.
I would like order to be preserved when deleting an item, like if you were to use a list as a simpler example:
del 4 [1, 2, 3, 4, 5, 6, 7] = [1, 2, 3, 4, 6, 7]
If there is a simpler way to structure the Tree (that still can contain duplicate elements and preserve order) what is it?
Is there some way to delete an element using this method?
If I ... replace the whole node with the opposite side ... I run into the problem of changing the whole tree to reflect that change, not just return the end of the deletion, and change all the numbers from the tree to reflect the change in leaves.
Well, not the whole tree - just the path from the deleted node back to the root. And isn't that exactly what you want?
I guess the first step would be, define what you mean by "delete". Should the indexes of undeleted nodes remain the same after deletion, or should nodes after the deleted node have their indexes reduced by one? That is, given:
tree :: [a] -> Tree a
-- get and del both 0-indexed, as in your example
get :: Int -> Tree a -> Maybe a
del :: Int -> Tree a -> Tree a
then of course
get 5 $ tree [1..7]
should yield Just 6. But what about
get 5 . del 4 $ tree [1..7]
? If you want this to still yield Just 6 (there is a "blank" spot in your tree where 5 used to be), that is a rather tricky concept, I think. You can put Nothings in to make space, if you define Leaf (Maybe a) instead of Leaf a, but this only papers over the problem: inserts will still shift indices around.
I think it is much simpler for this to yield Just 7 instead, making del 4 $ tree [1..7] the same as tree [1,2,3,4,6,7]. If this is your goal, then you simply must renumber all the nodes on the path from the deleted node back to the root: there is no getting around the fact that they all have one fewer leaf descendant now. But the other nodes in the tree can remain untouched.
For reference, one possible implementation of del:
count :: Tree a -> Int
count (Leaf _) = 1
count (Node s _ _) = s
del :: Int -> Tree a -> Maybe (Tree a)
del n t | n < 0 || n >= size || size <= 1 = Nothing
| otherwise = go n t
where size = count t
go n (Leaf _) = Nothing
go n (Node s l r) | n < size = reparent flip l r
| otherwise = reparent id r l
where reparent k c o = pure . maybe o (k (Node (s - 1)) o) $ go n c
size = count l
If I end up with a Leaf at the deletion path, there is no "Null" or equivalent item to replace it with.
Well, make one :). This is what Maybe is for: when you delete an element from a Tree, you cannot expect to get a Tree back, because Tree is defined to be nonempty. You need to explicitly add the possibility of emptiness by wrapping in Maybe. Deletion may also fail with an out-of-bounds error, which I represent with Either Int and incorporate into the logic.
delete :: Int -> Tree a -> Either Int (Maybe (Tree a))
delete i t | i >= max = Left (i - max) where max = count t
delete _ (Leaf _) = Right Nothing
delete i (Node n l r) = case delete i l of
Left i' -> Just <$> maybe l (Node (n - 1) l) <$> delete i' r
Right l' -> Right $ Just $ maybe r (\x -> Node (n - 1) x r) l'
Where count is as I recommended in the comments:
count :: Tree a -> Int
count (Leaf _) = 1
count (Node n _ _) = n
So I am trying to add consecutive numbers to the elements in a BST strictly using recursion (no standard prelude functions). Here is what I have so far:
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)
leaf x = Node x Empty Empty
number' :: Int -> Tree a -> Tree (Int, a)
number' a Empty = Empty
number' a (Node x xl xr) = Node (a,x) (number' (a+1) xl) (number' (a+1) xr)
number :: Tree a -> Tree (Int, a)
number = number' 1
number' is an auxiliary function that carries around "a" as a counter. It should add 1 to each recursive call, so I am not sure why it is doing what it is doing.
As of now the level of the element is assigned to each element. I would like the first element to be assigned 1, the element to the left of that 2, the element to the left of that 3, etc. Each element should get a+1 assigned to it and no number should be repeated. Thanks in advance.
I want to first explain why the code in the question assigns level numbers. This will lead us directly to two different solutions, one passed on caching, one based on doing two traversals at once. Finally, I show how the second solution relates to the solutions provided by other answers.
What has to be changed in the code from the question?
The code in the question assigns the level number to each node. We can understand why the code behaves like that by looking at the recursive case of the number' function:
number' a (Node x xl xr) = Node (a,x) (number' (a+1) xl) (number' (a+1) xr)
Note that we use the same number, a + 1, for both recursive calls. So the root nodes in both subtrees will get assigned the same number. If we want each node to have a different number, we better pass different numbers to the recursive calls.
What number should we pass to the recursive call?
If we want to assign the numbers according to a left-to-right pre-order traversal, then a + 1 is correct for the recursive call on the left subtree, but not for the recursive call on the right subtree. Instead, we want to leave out enough numbers to annotate the whole left subtree, and then start annotating the right subtree with the next number.
How many numbers do we need to reserve for the left subtree? That depends on the subtree's size, as computed by this function:
size :: Tree a -> Int
size Empty = 0
size (Node _ xl xr) = 1 + size xl + size xr
Back to the recursive case of the number' function. The smallest number annotated somewhere in the left subtree is a + 1. The biggest number annotated somewhere in the left subtree is a + size xl. So the smallest number available for the right subtree is a + size xl + 1. This reasoning leads to the following implementation of the recursive case for number' that works correctly:
number' :: Int -> Tree a -> Tree (Int, a)
number' a Empty = Empty
number' a (Node x xl xr) = Node (a,x) (number' (a+1) xl) (number' (a + size xl + 1) xr)
Unfortunately, there is a problem with this solution: It is unnecessarily slow.
Why is the solution with size slow?
The function size traverses the whole tree. The function number' also traverses the whole tree, and it calls size on all left subtrees. Each of these calls will traverse the whole subtree. So overall, the function size gets executed more than once on the same node, even though it always returns the same value, of course.
How can we avoid traversing the tree when calling size?
I know two solutions: Either we avoid traversing the tree in the implementation of size by caching the sizes of all trees, or we avoid calling size in the first place by numbering the nodes and computing the size in one traversal.
How can we compute the size without traversing the tree?
We cache the size in every tree node:
data Tree a = Empty | Node Int a (Tree a) (Tree a) deriving (Show)
size :: Tree a -> Int
size Empty = 0
size (Node n _ _ _) = n
Note that in the Node case of size, we just return the cached size. So this case is not recursive, and size does not traverse the tree, and the problem with our implementation of number' above goes away.
But the information about the size has to come from somewhere! Everytime we create a Node, we have to provide the correct size to fill the cache. We can lift this task off to smart constructors:
empty :: Tree a
empty = Empty
node :: a -> Tree a -> Tree a -> Tree a
node x xl xr = Node (size xl + size xr + 1) x xl xr
leaf :: a -> Tree a
leaf x = Node 1 x Empty Empty
Only node is really necessary, but I added the other two for completeness. If we always use one of these three functions to create a tree, the cached size information will always be correct.
Here is the version of number' that works with these definitions:
number' :: Int -> Tree a -> Tree (Int, a)
number' a Empty = Empty
number' a (Node _ x xl xr) = node (a,x) (number' (a+1) xl) (number' (a + size xl + 1) xr)
We have to adjust two things: When pattern matching on Node, we ignore the size information. And when creating a Node, we use the smart constructor node.
That works fine, but it has the drawback of having to change the definition of trees. On the one hand, caching the size might be a good idea anyway, but on the other hand, it uses some memory and it forces the trees to be finite. What if we want to implement a fast number' without changing the definition of trees? This brings us to the second solution I promised.
How can we number the tree without computing the size?
We cannot. But we can number the tree and compute the size in a single traversal, avoiding the multiple size calls.
number' :: Int -> Tree a -> (Int, Tree (Int, a))
Already in the type signature, we see that this version of number' computes two pieces of information: The first component of the result tuple is the size of the tree, and the second component is the annotated tree.
number' a Empty = (0, Empty)
number' a (Node x xl xr) = (sl + sr + 1, Node (a, x) yl yr) where
(sl, yl) = number' (a + 1) xl
(sr, yr) = number' (a + sl + 1) xr
The implementation decomposes the tuples from the recursive calls and composes the components of the result. Note that sl is like size xl from the previous solution, and sr is like size xr. We also have to name the annotated subtrees: yl is the left subtree with node numbers, so it is like number' ... xl in the previous solution, and yr is the right subtree with node numbers, so it is like number' ... xr in the previous solution.
We also have to change number to only return the second component of the result of number':
number :: Tree a -> Tree (Int, a)
number = snd . number' 1
I think that in a way, this is the clearest solution.
What else could we improve?
The previous solution works by returning the size of the subtree. That information is then used to compute the next available node number. Instead, we could also return the next available node number directly.
number' a Empty = (a, Empty)
number' a (Node x xl xr) = (ar, Node (a, x) yl yr) where
(al, yl) = number' (a + 1) xl
(ar, yr) = number' al xr
Note that al is like a + sl + 1 in the previous solution, and ar is like a + sl + sr + 1. Clearly, this change avoids some additions.
This is essentially the solution from Sergey's answer, and I would expect that this is the version most Haskellers would write. You could also hide the manipulations of a, al and ar in a state monad, but I don't think that really helps for such a small example. The answer by Ankur shows how it would look like.
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)
number :: Tree a -> Tree (Int, a)
number = fst . number' 1
number' :: Int -> Tree a -> (Tree (Int, a), Int)
number' a Empty = (Empty, a)
number' a (Node x l r) = let (l', a') = number' (a + 1) l
(r', a'') = number' a' r
in (Node (a, x) l' r', a'')
*Tr> let t = (Node 10 (Node 20 (Node 30 Empty Empty) (Node 40 Empty Empty)) (Node 50 (Node 60 Empty Empty) Empty))
*Tr> t
Node 10 (Node 20 (Node 30 Empty Empty) (Node 40 Empty Empty)) (Node 50 (Node 60 Empty Empty) Empty)
*Tr> number t
Node (1,10) (Node (2,20) (Node (3,30) Empty Empty) (Node (4,40) Empty Empty)) (Node (5,50) (Node (6,60) Empty Empty) Empty)
As suggested by comments in your question that each call to number should return a integer also which needs to be further used for next set of nodes. This makes the signature of the function to:
Tree a -> Int -> (Tree (Int,a), Int)
Looking at the last part of it, it looks like a candidate for State monad i.e state -> (Val,state).
Below code shows how you can do this using State monad.
import Control.Monad.State
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)
myTree :: Tree String
myTree = Node "A" (Node "B" (Node "D" Empty Empty) (Node "E" Empty Empty)) (Node "C" (Node "F" Empty Empty) (Node "G" Empty Empty))
inc :: State Int ()
inc = do
i <- get
put $ i + 1
return ()
number :: Tree a -> State Int (Tree (Int,a))
number Empty = return Empty
number (Node x l r) = do
i <- get
inc
l' <- number l
r' <- number r
return $ Node (i,x) l' r'
main = do
putStrLn $ show (fst (runState (number myTree) 1))
Basically I've made a polymorphic tree data type and I need a way of counting the number of elements in a given tree. Here's the declaration for my Tree data type:
data Tree a = Empty
| Leaf a
| Node (Tree a) a (Tree a)
deriving (Eq, Ord, Show)
So I can define a tree of Ints like this:
t :: Tree Int
t = Node (Leaf 5) 7 (Node (Leaf 2) 3 (Leaf 7))
However, I need a function to count the number of elements in one of these lists. I've defined this recursive function but I get the error "inferred type is not general enough":
size :: Tree a -> Int
size Empty = 0
size (Leaf n) = 1
size (Node x y z) = size x + size y + size z
Is there something here I shouldn't be doing?
I think it's just a typo when you write
size (Node x y z) = size x + size y + size z
which should just be
size (Node x y z) = size x + size z + 1
since y is no subtree but just the element stored.
Or to make it even clearer
size (Node left elem right) = size left + size right + 1
Technically, your error occurs because the term size y does only make sense if y is again a tree whose size can be computed. Therefore the type of this clause would be inferred to Tree (Tree a) -> Int, which is, compared with the actual Tree a -> Int, not general
enough.
Look at you last clause: Looking at the left hand side, at Node x y z, what is the type of y? Does size y make sense?