Find all the partial paths in a tree - haskell

I am trying to create a Haskell function that generates a list containing all the partial paths of a tree, given the tree implementation data Tree a = Empty | Node a (Tree a) (Tree a). For example, if I have a tree
tree = Node 5 (Node 3 Empty Empty ) (Node 2 Empty Empty )
I want to get
[[],[5],[5,3],[5,2]]
How could I make such a function?

First, let us consider the type of this function: it must be Tree a -> [[a]].
So what can we do given a node Node x left right? We have the path that consists of just the node itself, which is [x], as well as the paths that go through this node into the left and right subtrees. The paths going through the left and right subtrees are exactly what we get if we apply our function to left and right respectively. We then just need to add x to the start of each of those paths, which we do by calling map (x:) paths. (For an empty tree we should get an empty list, as there is no path.)
data Tree a = Empty | Node a (Tree a) (Tree a)
tree = Node 5 (Node 3 Empty Empty ) (Node 2 Empty Empty )
pp :: Tree a -> [[a]]
pp (Node x left right) = [[x]] ++ map(x:)(pp left) ++ map(x:)(pp right)
pp Empty = []
This has one flaw: the function does not count the empty path as a partial path. We can easily amend that by wrapping pp in another function that adds it:
partialPaths :: Tree a -> [[a]]
partialPaths t = [[]] ++ pp t
main = print $ partialPaths tree
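To see how the recursion behaves on a slightly deeper tree, here is a small self-contained sketch. The tree named deeper is a made-up example and not part of the question; pp and partialPaths follow the definitions above.

```haskell
data Tree a = Empty | Node a (Tree a) (Tree a)

-- All non-empty paths that start at the root.
pp :: Tree a -> [[a]]
pp Empty = []
pp (Node x left right) = [x] : map (x:) (pp left) ++ map (x:) (pp right)

-- Same as pp, plus the empty path.
partialPaths :: Tree a -> [[a]]
partialPaths t = [] : pp t

-- Hypothetical example tree, three levels deep.
deeper :: Tree Int
deeper = Node 1 (Node 2 (Node 4 Empty Empty) Empty) (Node 3 Empty Empty)

-- partialPaths deeper == [[],[1],[1,2],[1,2,4],[1,3]]
```

Each path is a prefix of some root-to-node walk, so a node at depth three contributes exactly one new three-element path.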

Related

How to delete an element from a Leafy Binary Tree (Haskell)

So, this tree is NOT a Binary Search Tree. It is in no particular order, and is just in this order for quick access to specific indices (nth element), rather than whether an element exists or not.
The form of the Tree is like so:
data Tree a = Leaf a | Node Int (Tree a) (Tree a) deriving Show
For this specific tree, the "Int" from the Node constructor is the number of elements underneath that node (or number of leaves).
Using this structure, I copied parts of the Tree functions available in a lecture I found online (that I slightly modified when trying to understand):
buildTree :: [a] -> Tree a
buildTree = growLevel . map Leaf
  where
    growLevel [node] = node
    growLevel l = growLevel $ inner l
    -- pair up neighbours; an odd element at the end is carried up unchanged
    inner [] = []
    inner (e1:e2:rest) = join e1 e2 : inner rest
    inner xs = xs
    -- a new Node stores the total number of leaves below it
    join l@(Leaf _) r@(Leaf _) = Node 2 l r
    join l@(Node ct _ _) r@(Leaf _) = Node (ct+1) l r
    join l@(Leaf _) r@(Node ct _ _) = Node (ct+1) l r
    join l@(Node ctl _ _) r@(Node ctr _ _) = Node (ctl+ctr) l r
And I was able to create some basic functions for moving through a tree. I made one that finds the nth element and returns it. I also made a Path datatype and implemented a function to return the path (in left and rights) to a specific index, and one function that can travel through a path and return that Node/Leaf.
Now, what I would like to make is a delete function. The problem here is with the fact that the tree is "leafy", or at least that is what is causing me difficulties.
If I end up with a Leaf at the deletion path, there is no "Null" or equivalent item to replace it with. Additionally, if I try to stop at the last path (like [L]), and check if that's a Node or not, then if it's a leaf replace the whole node with the opposite side etc., I run into the problem of changing the whole tree to reflect that change, not just return the end of the deletion, and change all the numbers from the tree to reflect the change in leaves.
I would like order to be preserved when deleting an item, like if you were to use a list as a simpler example:
del 4 [1, 2, 3, 4, 5, 6, 7] = [1, 2, 3, 4, 6, 7]
If there is a simpler way to structure the Tree (that still can contain duplicate elements and preserve order) what is it?
Is there some way to delete an element using this method?
If I ... replace the whole node with the opposite side ... I run into the problem of changing the whole tree to reflect that change, not just return the end of the deletion, and change all the numbers from the tree to reflect the change in leaves.
Well, not the whole tree - just the path from the deleted node back to the root. And isn't that exactly what you want?
I guess the first step would be, define what you mean by "delete". Should the indexes of undeleted nodes remain the same after deletion, or should nodes after the deleted node have their indexes reduced by one? That is, given:
tree :: [a] -> Tree a
-- get and del both 0-indexed, as in your example
get :: Int -> Tree a -> Maybe a
del :: Int -> Tree a -> Tree a
then of course
get 5 $ tree [1..7]
should yield Just 6. But what about
get 5 . del 4 $ tree [1..7]
? If you want this to still yield Just 6 (there is a "blank" spot in your tree where 5 used to be), that is a rather tricky concept, I think. You can put Nothings in to make space, if you define Leaf (Maybe a) instead of Leaf a, but this only papers over the problem: inserts will still shift indices around.
I think it is much simpler for this to yield Just 7 instead, making del 4 $ tree [1..7] the same as tree [1,2,3,4,6,7]. If this is your goal, then you simply must renumber all the nodes on the path from the deleted node back to the root: there is no getting around the fact that they all have one fewer leaf descendant now. But the other nodes in the tree can remain untouched.
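To make the role of the cached counts concrete, here is a minimal sketch of how get could use them to find the n-th leaf. This is only one possible reading of the get signature above; the tree named example is built by hand for illustration.

```haskell
data Tree a = Leaf a | Node Int (Tree a) (Tree a) deriving Show

-- Number of leaves below a node, read from the cache.
count :: Tree a -> Int
count (Leaf _) = 1
count (Node n _ _) = n

-- 0-indexed lookup: go left if the index falls inside the left subtree,
-- otherwise go right with the index shifted past the left subtree.
get :: Int -> Tree a -> Maybe a
get i t
  | i < 0 || i >= count t = Nothing
  | otherwise = Just (go i t)
  where
    go _ (Leaf x) = x
    go j (Node _ l r)
      | j < count l = go j l
      | otherwise   = go (j - count l) r

-- Hypothetical hand-built tree holding 'a', 'b', 'c' in order.
example :: Tree Char
example = Node 3 (Node 2 (Leaf 'a') (Leaf 'b')) (Leaf 'c')

-- get 2 example == Just 'c'
-- get 3 example == Nothing
```

Deletion has to keep exactly these counts correct, which is why every node on the path back to the root needs its count reduced by one.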
For reference, one possible implementation of del:
count :: Tree a -> Int
count (Leaf _) = 1
count (Node s _ _) = s
del :: Int -> Tree a -> Maybe (Tree a)
del n t | n < 0 || n >= size || size <= 1 = Nothing
        | otherwise = go n t
  where size = count t
        go _ (Leaf _) = Nothing
        -- go left with the same index, or right with the index shifted past the left subtree
        go i (Node s l r) | i < size = reparent flip l r i
                          | otherwise = reparent id r l (i - size)
          where reparent k c o j = pure . maybe o (k (Node (s - 1)) o) $ go j c
                size = count l
If I end up with a Leaf at the deletion path, there is no "Null" or equivalent item to replace it with.
Well, make one :). This is what Maybe is for: when you delete an element from a Tree, you cannot expect to get a Tree back, because Tree is defined to be nonempty. You need to explicitly add the possibility of emptiness by wrapping in Maybe. Deletion may also fail with an out-of-bounds error, which I represent with Either Int and incorporate into the logic.
delete :: Int -> Tree a -> Either Int (Maybe (Tree a))
delete i t | i >= max = Left (i - max) where max = count t
delete _ (Leaf _) = Right Nothing
delete i (Node n l r) = case delete i l of
  Left i' -> Just <$> maybe l (Node (n - 1) l) <$> delete i' r
  Right l' -> Right $ Just $ maybe r (\x -> Node (n - 1) x r) l'
Where count is as I recommended in the comments:
count :: Tree a -> Int
count (Leaf _) = 1
count (Node n _ _) = n
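As a quick check of the Either-based delete, here is a self-contained sketch that runs it on a small hand-built tree. The helper toList and the tree sample are assumptions added for illustration, and the local name total replaces max only to avoid shadowing the Prelude function.

```haskell
data Tree a = Leaf a | Node Int (Tree a) (Tree a) deriving Show

count :: Tree a -> Int
count (Leaf _) = 1
count (Node n _ _) = n

-- Left k: the index was out of bounds by k.
-- Right Nothing: the last leaf was deleted, so no tree remains.
delete :: Int -> Tree a -> Either Int (Maybe (Tree a))
delete i t
  | i >= total = Left (i - total)
  where total = count t
delete _ (Leaf _) = Right Nothing
delete i (Node n l r) = case delete i l of
  Left i'  -> Just . maybe l (Node (n - 1) l) <$> delete i' r
  Right l' -> Right $ Just $ maybe r (\x -> Node (n - 1) x r) l'

-- Hypothetical helper: the leaves in left-to-right order.
toList :: Tree a -> [a]
toList (Leaf x) = [x]
toList (Node _ l r) = toList l ++ toList r

-- Hypothetical sample tree holding 1, 2, 3, 4 in order.
sample :: Tree Int
sample = Node 4 (Node 2 (Leaf 1) (Leaf 2)) (Node 2 (Leaf 3) (Leaf 4))

-- fmap (fmap toList) (delete 2 sample) == Right (Just [1,2,4])
-- fmap (fmap toList) (delete 4 sample) == Left 0
```

Deleting index 2 removes the leaf holding 3; its sibling takes the place of their parent, and the root count drops to 3.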

Tree to a list of possible paths (values)

Given a tree data structure defined as
data Tree = Node Int Tree Tree | Leaf
How can one transform it into a list of values along all paths?
For example Node 1 (Node 2 Leaf Leaf) (Node 3 Leaf Leaf) should translate to [[1,2], [1,3]].
You have a recursive data structure, so you should expect a recursive solution. The first step in such a solution is to identify the base cases. For this problem, there is the obvious base case: Leaf. We probably also want to treat Node x Leaf Leaf as a base case to avoid duplicate paths.
Now let's write down the type signature. This should make it clear what our base cases should produce.
paths :: Tree -> [[Int]]
For Leaf the sensible thing to do is return a list containing an empty list since a Leaf is representing an empty path.
paths Leaf = [[]]
For Node x Leaf Leaf, we can return a list consisting of a list containing x.
paths (Node x Leaf Leaf) = [[x]]
The next part requires the most thought. We need to consider what to do with the non-base case Node x left right. The strategy here is to assume that we have the result of paths left and paths right and then decide what we need to do with x. We're building paths, so we need to tack x onto the front of all the left paths and the right paths. We also need to combine the two lists into a single list.
paths (Node x left right) = map (x:) (paths left ++ paths right)
And that's it. Of course, now you might want to see if there's a more efficient way to implement it, or if there's a common pattern here instead (i.e. can we write this as a fold?).
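On the question of a fold: one possible sketch is below. The names foldTree and pathsFold are made up here, and the behaviour differs slightly from paths above, because a Leaf contributes no path at all. So pathsFold Leaf is [] rather than [[]], and a node with exactly one Leaf child yields only the paths through its other child.

```haskell
data Tree = Node Int Tree Tree | Leaf

-- Generic fold: replace Leaf by z and Node by f.
foldTree :: (Int -> b -> b -> b) -> b -> Tree -> b
foldTree _ z Leaf = z
foldTree f z (Node x l r) = f x (foldTree f z l) (foldTree f z r)

-- Root-to-bottom paths expressed as a fold (an assumption-laden variant).
pathsFold :: Tree -> [[Int]]
pathsFold = foldTree step []
  where
    -- both children produced no paths: this node ends a path
    step x [] [] = [[x]]
    -- otherwise prepend x to every path from either child
    step x ls rs = map (x :) (ls ++ rs)

-- pathsFold (Node 1 (Node 2 Leaf Leaf) (Node 3 Leaf Leaf)) == [[1,2],[1,3]]
```

The fold cannot pattern match on Node x Leaf Leaf directly, so the "both results empty" case of step plays the role of that base case.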
The answer here is very simple, so for a Tree structure such as the one you provided, if we have:
data Tree = Node Int Tree Tree | Leaf
We can use a helper function nested inside our function that flattens a branch into a list. This keeps the first value of the Tree (the trunk, or initial value) separate from the values of either branch. I'm sure there is a more efficient implementation, but here it is:
-- the name clashes with Prelude's traverse; hide it with: import Prelude hiding (traverse)
traverse :: Tree -> [[Int]] -- Tree takes no type parameter
traverse Leaf = [[]]
traverse (Node value left right) = [value : treeToList left, value : treeToList right]
  where -- treeToList flattens a whole subtree in pre-order
        treeToList Leaf = []
        treeToList (Node a l r) = [a] ++ treeToList l ++ treeToList r
returns
> traverse (Node 2 (Node 3 Leaf Leaf) (Node 4 Leaf Leaf))
[[2,3],[2,4]]
Be careful with the type signature: Tree as defined above takes no type parameter, and the Node constructor only holds values of type Int, so the returned lists can only contain Int.

Assign Consecutive Numbers to Elements in a BST

So I am trying to add consecutive numbers to the elements in a BST strictly using recursion (no standard prelude functions). Here is what I have so far:
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)
leaf x = Node x Empty Empty
number' :: Int -> Tree a -> Tree (Int, a)
number' a Empty = Empty
number' a (Node x xl xr) = Node (a,x) (number' (a+1) xl) (number' (a+1) xr)
number :: Tree a -> Tree (Int, a)
number = number' 1
number' is an auxiliary function that carries around "a" as a counter. It should add 1 to each recursive call, so I am not sure why it is doing what it is doing.
As of now the level of the element is assigned to each element. I would like the first element to be assigned 1, the element to the left of that 2, the element to the left of that 3, etc. Each element should get a+1 assigned to it and no number should be repeated. Thanks in advance.
I want to first explain why the code in the question assigns level numbers. This will lead us directly to two different solutions, one based on caching, one based on doing two traversals at once. Finally, I show how the second solution relates to the solutions provided by other answers.
What has to be changed in the code from the question?
The code in the question assigns the level number to each node. We can understand why the code behaves like that by looking at the recursive case of the number' function:
number' a (Node x xl xr) = Node (a,x) (number' (a+1) xl) (number' (a+1) xr)
Note that we use the same number, a + 1, for both recursive calls. So the root nodes in both subtrees will get assigned the same number. If we want each node to have a different number, we better pass different numbers to the recursive calls.
What number should we pass to the recursive call?
If we want to assign the numbers according to a left-to-right pre-order traversal, then a + 1 is correct for the recursive call on the left subtree, but not for the recursive call on the right subtree. Instead, we want to leave out enough numbers to annotate the whole left subtree, and then start annotating the right subtree with the next number.
How many numbers do we need to reserve for the left subtree? That depends on the subtree's size, as computed by this function:
size :: Tree a -> Int
size Empty = 0
size (Node _ xl xr) = 1 + size xl + size xr
Back to the recursive case of the number' function. The smallest number annotated somewhere in the left subtree is a + 1. The biggest number annotated somewhere in the left subtree is a + size xl. So the smallest number available for the right subtree is a + size xl + 1. This reasoning leads to the following implementation of the recursive case for number' that works correctly:
number' :: Int -> Tree a -> Tree (Int, a)
number' a Empty = Empty
number' a (Node x xl xr) = Node (a,x) (number' (a+1) xl) (number' (a + size xl + 1) xr)
Unfortunately, there is a problem with this solution: It is unnecessarily slow.
Why is the solution with size slow?
The function size traverses the whole tree. The function number' also traverses the whole tree, and it calls size on all left subtrees. Each of these calls will traverse the whole subtree. So overall, the function size gets executed more than once on the same node, even though it always returns the same value, of course.
How can we avoid traversing the tree when calling size?
I know two solutions: Either we avoid traversing the tree in the implementation of size by caching the sizes of all trees, or we avoid calling size in the first place by numbering the nodes and computing the size in one traversal.
How can we compute the size without traversing the tree?
We cache the size in every tree node:
data Tree a = Empty | Node Int a (Tree a) (Tree a) deriving (Show)
size :: Tree a -> Int
size Empty = 0
size (Node n _ _ _) = n
Note that in the Node case of size, we just return the cached size. So this case is not recursive, and size does not traverse the tree, and the problem with our implementation of number' above goes away.
But the information about the size has to come from somewhere! Every time we create a Node, we have to provide the correct size to fill the cache. We can hand this task off to smart constructors:
empty :: Tree a
empty = Empty
node :: a -> Tree a -> Tree a -> Tree a
node x xl xr = Node (size xl + size xr + 1) x xl xr
leaf :: a -> Tree a
leaf x = Node 1 x Empty Empty
Only node is really necessary, but I added the other two for completeness. If we always use one of these three functions to create a tree, the cached size information will always be correct.
Here is the version of number' that works with these definitions:
number' :: Int -> Tree a -> Tree (Int, a)
number' a Empty = Empty
number' a (Node _ x xl xr) = node (a,x) (number' (a+1) xl) (number' (a + size xl + 1) xr)
We have to adjust two things: When pattern matching on Node, we ignore the size information. And when creating a Node, we use the smart constructor node.
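Putting the cached-size pieces together, a small self-contained sketch might look like this. The helper labels and the tree example are assumptions added for illustration; the rest follows the definitions above.

```haskell
data Tree a = Empty | Node Int a (Tree a) (Tree a) deriving Show

-- O(1): just read the cached size.
size :: Tree a -> Int
size Empty = 0
size (Node n _ _ _) = n

-- Smart constructor: fills in the cache from the children.
node :: a -> Tree a -> Tree a -> Tree a
node x xl xr = Node (size xl + size xr + 1) x xl xr

leaf :: a -> Tree a
leaf x = node x Empty Empty

number' :: Int -> Tree a -> Tree (Int, a)
number' _ Empty = Empty
number' a (Node _ x xl xr) =
  node (a, x) (number' (a + 1) xl) (number' (a + size xl + 1) xr)

number :: Tree a -> Tree (Int, a)
number = number' 1

-- Hypothetical helper: node labels in pre-order.
labels :: Tree a -> [a]
labels Empty = []
labels (Node _ x xl xr) = x : labels xl ++ labels xr

-- Hypothetical example tree built only with the smart constructors.
example :: Tree Char
example = node 'a' (node 'b' (leaf 'd') Empty) (leaf 'c')

-- labels (number example) == [(1,'a'),(2,'b'),(3,'d'),(4,'c')]
```

The right subtree of the root starts at 4 because the left subtree has size 2 and therefore uses the numbers 2 and 3.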
That works fine, but it has the drawback of having to change the definition of trees. On the one hand, caching the size might be a good idea anyway, but on the other hand, it uses some memory and it forces the trees to be finite. What if we want to implement a fast number' without changing the definition of trees? This brings us to the second solution I promised.
How can we number the tree without computing the size?
We cannot. But we can number the tree and compute the size in a single traversal, avoiding the multiple size calls.
number' :: Int -> Tree a -> (Int, Tree (Int, a))
Already in the type signature, we see that this version of number' computes two pieces of information: The first component of the result tuple is the size of the tree, and the second component is the annotated tree.
number' a Empty = (0, Empty)
number' a (Node x xl xr) = (sl + sr + 1, Node (a, x) yl yr) where
  (sl, yl) = number' (a + 1) xl
  (sr, yr) = number' (a + sl + 1) xr
The implementation decomposes the tuples from the recursive calls and composes the components of the result. Note that sl is like size xl from the previous solution, and sr is like size xr. We also have to name the annotated subtrees: yl is the left subtree with node numbers, so it is like number' ... xl in the previous solution, and yr is the right subtree with node numbers, so it is like number' ... xr in the previous solution.
We also have to change number to only return the second component of the result of number':
number :: Tree a -> Tree (Int, a)
number = snd . number' 1
I think that in a way, this is the clearest solution.
What else could we improve?
The previous solution works by returning the size of the subtree. That information is then used to compute the next available node number. Instead, we could also return the next available node number directly.
number' a Empty = (a, Empty)
number' a (Node x xl xr) = (ar, Node (a, x) yl yr) where
  (al, yl) = number' (a + 1) xl
  (ar, yr) = number' al xr
Note that al is like a + sl + 1 in the previous solution, and ar is like a + sl + sr + 1. Clearly, this change avoids some additions.
This is essentially the solution from Sergey's answer, and I would expect that this is the version most Haskellers would write. You could also hide the manipulations of a, al and ar in a state monad, but I don't think that really helps for such a small example. The answer by Ankur shows what that would look like.
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)
number :: Tree a -> Tree (Int, a)
number = fst . number' 1
number' :: Int -> Tree a -> (Tree (Int, a), Int)
number' a Empty = (Empty, a)
number' a (Node x l r) = let (l', a') = number' (a + 1) l
                             (r', a'') = number' a' r
                         in (Node (a, x) l' r', a'')
*Tr> let t = (Node 10 (Node 20 (Node 30 Empty Empty) (Node 40 Empty Empty)) (Node 50 (Node 60 Empty Empty) Empty))
*Tr> t
Node 10 (Node 20 (Node 30 Empty Empty) (Node 40 Empty Empty)) (Node 50 (Node 60 Empty Empty) Empty)
*Tr> number t
Node (1,10) (Node (2,20) (Node (3,30) Empty Empty) (Node (4,40) Empty Empty)) (Node (5,50) (Node (6,60) Empty Empty) Empty)
As the comments on your question suggest, each call to number should also return an integer, which is then used for the next set of nodes. This gives the function the signature:
Tree a -> Int -> (Tree (Int,a), Int)
The last part of that signature looks like a candidate for the State monad, i.e. state -> (val, state).
The code below shows how you can do this using the State monad.
import Control.Monad.State
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)
myTree :: Tree String
myTree = Node "A" (Node "B" (Node "D" Empty Empty) (Node "E" Empty Empty)) (Node "C" (Node "F" Empty Empty) (Node "G" Empty Empty))
inc :: State Int ()
inc = do
  i <- get
  put $ i + 1
  return ()
number :: Tree a -> State Int (Tree (Int,a))
number Empty = return Empty
number (Node x l r) = do
  i <- get
  inc
  l' <- number l
  r' <- number r
  return $ Node (i,x) l' r'
main = do
  putStrLn $ show (fst (runState (number myTree) 1))
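If library functions are acceptable (the question rules them out, so treat this purely as an aside), the same counter threading can be sketched with mapAccumL from Data.Traversable and a derived Traversable instance. The name numberT and the tree sample are made up for this sketch; the derived instance visits the node value first, then the left subtree, then the right subtree, which gives the pre-order numbering.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
import Data.Traversable (mapAccumL)

data Tree a = Empty | Node a (Tree a) (Tree a)
  deriving (Show, Eq, Functor, Foldable, Traversable)

-- Thread a counter through the tree, pairing each element with its number.
numberT :: Tree a -> Tree (Int, a)
numberT = snd . mapAccumL (\n x -> (n + 1, (n, x))) 1

-- Same shape as the tree used in the GHCi session above.
sample :: Tree Int
sample = Node 10 (Node 20 (Node 30 Empty Empty) (Node 40 Empty Empty))
                 (Node 50 (Node 60 Empty Empty) Empty)

-- numberT sample ==
--   Node (1,10) (Node (2,20) (Node (3,30) Empty Empty) (Node (4,40) Empty Empty))
--               (Node (5,50) (Node (6,60) Empty Empty) Empty)
```

The accumulator here plays the same role as the Int in the State monad version.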

Non-exhaustive patterns in function internals

I am currently working on Problem 62
I have tried the following code to solve it:
data Tree a = Empty | Branch a (Tree a) (Tree a)
  deriving (Show, Eq)
internals :: Tree a -> [a]
internals (Branch a Empty Empty) = []
internals (Branch a b c) = [a]++(internals b)++(internals c)
internals (Branch a b Empty) = [a]++(internals b)
internals (Branch a Empty c) = [a]++(internals c)
Which basically says:
If both children are empty, don't include that element in the list of internals.
If both children are non-empty, that node (a) is internal, so include it, and keep checking whether any of a's children are also internal.
If one of the children is non-empty, that node is internal, and recursively keep checking whether the child is also an internal node.
In GHCi I ran the following:
> let tree4 = Branch 1 (Branch 2 Empty (Branch 4 Empty Empty)) (Branch 2 Empty Empty)
> internals tree4
and get the following runtime error:
[1,2*** Exception: Untitled.hs:(6,1)-(12,49): Non-exhaustive patterns in function internals
I don't understand why this is non-exhaustive. I thought it would go to branch 1, notice its children are non-empty, then go down both branch 2s and find out that one branch is empty and one is not, stop at the one that is, and keep going down the one that isn't, until branch "4", and end there. It sort of does, since I do get 1, 2 in the list, but why is it not exhaustive?
Thanks in advance.
Thank you for the help, Tikhon. I changed my function to this:
data Tree a = Empty | Branch a (Tree a) (Tree a)
  deriving (Show, Eq)
internals :: Tree a -> [a]
internals (Branch a Empty Empty) = []
internals (Branch a b Empty) = [a]++(internals b)
internals (Branch a Empty c) = [a]++(internals c)
internals (Branch a b c) = [a]++(internals b)++(internals c)
The other answer doesn't actually solve the reason for the error message, it does resolve one problem though (the fact that the order of patterns is significant).
The error message is Non-exhaustive patterns, which means that internals is being called with a value that doesn't match any of the patterns (this value is Empty). As Tikhon said, it is because Branch a b c matches all Branches, so the later patterns are never used and an Empty can slip through. We can see what happens if we trace the execution of internals (Branch 1 (Branch 2 Empty Empty) Empty) (assume strict-ish evaluation, it makes the exposition simpler):
internals (Branch 1 (Branch 2 Empty Empty) Empty) =>
[1] ++ internals (Branch 2 Empty Empty) ++ internals Empty =>
[1] ++ [] ++ internals Empty =>
[1] ++ internals Empty =>
[1] ++ ???
The proper fix ensures that this can't happen: it converts internals from a partial function (undefined for some input values) into a total function (defined for all inputs). Total functions are much nicer than partial ones, especially in Haskell, where the type system gives the programmer the ability to mark "partial" functions as such in their types (e.g. via Maybe or Either).
We can think about the recursion from the bottom-up, i.e. work out the base cases:
the empty tree has no internal nodes
a tree that is a single node has no internal nodes
We recur on any tree that doesn't satisfy either of these; in which case, the current node is an internal node (so add that to the list), and there might be internal nodes in the children, so check them too.
We can express this in Haskell:
internals :: Tree a -> [a]
internals Empty = []
internals (Branch a Empty Empty) = []
internals (Branch a b c) = [a] ++ internals b ++ internals c
This has the added bonus of making the code neater and shorter: we don't have to worry about the details of the children in the recursion, there is a base case that handles any Emptys.
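As a sanity check, here is the total version applied to tree4 from the question, as a self-contained sketch. The only change from the definition above is writing a : ... instead of [a] ++ ..., which is equivalent.

```haskell
data Tree a = Empty | Branch a (Tree a) (Tree a)
  deriving (Show, Eq)

-- Total: every constructor of Tree has a matching equation.
internals :: Tree a -> [a]
internals Empty = []
internals (Branch _ Empty Empty) = []
internals (Branch a b c) = a : internals b ++ internals c

-- The tree from the question.
tree4 :: Tree Int
tree4 = Branch 1 (Branch 2 Empty (Branch 4 Empty Empty)) (Branch 2 Empty Empty)

-- internals tree4 == [1,2]
-- internals Empty == []
```

The call internals Empty, which crashed the original code, now simply returns the empty list.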
The order of patterns matters. Since Branch a b c matches everything that isn't just Empty, including something like Branch a b Empty, your third and fourth cases never get hit.
This should fix it:
internals :: Tree a -> [a]
internals (Branch a Empty Empty) = []
internals (Branch a b Empty) = [a] ++ internals b
internals (Branch a Empty c) = [a] ++ internals c
internals (Branch a b c) = [a] ++ internals b ++ internals c

Writing a function to compute the left spine of a tree

I need some help figuring out how to create a leftSpine function in Haskell.
Basically, it is supposed to take all the leftmost leaves and put them into a list, but whenever I run my code I get an empty list. Any help would be appreciated.
Here is my code.
data Tree x = Leaf | Node (Tree x) x (Tree x)
  deriving Show
leftSpine :: Tree x -> [x]
leftSpine Leaf = []
leftSpine (Node lt x rt) = (leftSpine lt)
And here is my code to test it.
leftSpine (Node (Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf))
4
(Node (Node Leaf 5 Leaf) 6 (Node Leaf 7 Leaf)))
It should equal [4,2,1] but it just comes up as [].
leftSpine :: Tree x -> [x]
leftSpine Leaf = []
leftSpine (Node lt x rt) = x:leftSpine lt
You weren't actually putting anything into the list. The difference is the x:leftSpine lt instead of leftSpine lt.
Your code plainly says that the result is always the empty list.
The first case says that the left spine of a leaf is the empty list. Ok so far.
The second case says that the left spine of a node is exactly the left spine of the node's left child.
So if we want to find the left spine of a tree, we'll just keep chasing down the left children of the nodes we reach, knowing that the answer is exactly equal to the left spine of the next left child. Either we eventually find a leaf, and our result is the empty list, or the tree is infinite and we never return a result at all. There is nothing in your code that could ever return anything else.
The key to writing a recursive function of this kind is to figure out what the answer is for the base case (Leaf, here), and then for the non-base cases (Node, here) you need to figure out how to combine the sub-solutions with the local information here to generate a full solution.
In this case the left spine of a Leaf is easy, as there's no data at all. So then how would you combine the information from Node lt x rt with leftSpine lt to get the left spine of the whole tree?
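One way to answer that question, consistent with the fix shown in the other answer, is to put the node's own value in front of the left child's spine. The sketch below runs it on the test tree from the question; rightSpine is a made-up mirror image included only for comparison.

```haskell
data Tree x = Leaf | Node (Tree x) x (Tree x)
  deriving Show

-- Local information (x) combined with the sub-solution (leftSpine lt).
leftSpine :: Tree x -> [x]
leftSpine Leaf = []
leftSpine (Node lt x _) = x : leftSpine lt

-- Hypothetical mirror image: follow right children instead.
rightSpine :: Tree x -> [x]
rightSpine Leaf = []
rightSpine (Node _ x rt) = x : rightSpine rt

-- The test tree from the question.
testTree :: Tree Int
testTree = Node (Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf))
                4
                (Node (Node Leaf 5 Leaf) 6 (Node Leaf 7 Leaf))

-- leftSpine testTree == [4,2,1]
-- rightSpine testTree == [4,6,7]
```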
