BST: how to define `insert` in terms of catamorphic fold? - haskell

I have a typical binary search tree data type:
data Tree a
= Empty
| Branch a (Tree a) (Tree a) deriving Show
and a catamorphism
foldt :: b -> (a -> b -> b -> b) -> Tree a -> b
foldt empty _ Empty = empty
foldt empty branch (Branch a l r) = branch a (foldt empty branch l) (foldt empty branch r)
I tried to define an insert function using foldt and got some interesting results:
insert :: (Ord a) => a -> Tree a -> Tree a
insert x = foldt (single x) insertb
where insertb a left right
| x == a = Branch x left right
| x < a = Branch a (insert x left) right
| x > a = Branch a left (insert x right)
ghci> mytree = insert 2 (Branch 3 Empty Empty)
ghci> mytree
Branch 3 (Branch 2 (Branch 2 Empty Empty) (Branch 2 Empty Empty)) (Branch 2 Empty Empty)
ghci>
Of course, a traditional insert method behaves as expected:
insert' :: (Ord a) => a -> Tree a -> Tree a
insert' x Empty = single x
insert' x (Branch a left right)
| x == a = Branch x left right
| x < a = Branch a (insert' x left) right
| x > a = Branch a left (insert' x right)
ghci> mytree2 = insert' 2 (Branch 3 Empty Empty)
ghci> mytree2
Branch 3 (Branch 2 Empty Empty) Empty
ghci>
Is there a way to define insert in terms of foldt, or am I barking up the wrong tree (ha) here?

Let's define a function
insertMaybe :: Ord a => Tree a -> Maybe a -> Tree a
This function takes a tree, and maybe an element. In the Just case, the element is inserted. In the Nothing case, the tree is returned unchanged. So then we can define
insert a t = insertMaybe t (Just a)
Now:
insertMaybe :: Ord a => Tree a -> Maybe a -> Tree a
insertMaybe = foldt leaf branch
where
leaf (Just new) = ?
leaf Nothing = ?
branch a l r Nothing = ?
branch a l r (Just new)
| ... = ?
...
Alternatively:
data Ins a = Ins
{ inserted :: Tree a
, notInserted :: Tree a }
insert a t = inserted (insertAndNot a t)
-- Return the tree with the
-- element inserted, and also unchanged.
insertAndNot :: Ord a => a -> Tree a -> Ins a
insertAndNot new = foldt leaf branch
where
leaf = Ins ? ?
branch a ~(Ins li lni) ~(Ins ri rni)
| ... = Ins ? ?
...
Paramorphism
The above solutions have a major efficiency problem: they completely rebuild the tree structure just to insert an element. As amalloy suggested, we can fix that by replacing foldt (a catamorphism) by parat (a paramorphism). parat gives the branch function access to both the recursively modified and the unmodified subtrees.
parat :: b -> (a -> (Tree a, b) -> (Tree a, b) -> b) -> Tree a -> b
parat leaf _branch Empty = leaf
parat leaf branch (Branch a l r) =
branch a
(l, parat leaf branch l)
(r, parat leaf branch r)
Conveniently, it's also slightly easier to define insert using parat. Can you see how? This ends up being an efficient version of the "alternative" way I suggested for using foldt.

Thanks to dfeuer and amalloy for the tips on paramorphisms, TIL!
Given a paramorphism for the Tree data type:
parat :: b -> (a -> (Tree a, b) -> (Tree a, b) -> b) -> Tree a -> b
parat empty _ Empty = empty
parat empty branch (Branch a l r) =
branch a
(l, parat leaf branch l)
(r, parat leaf branch r)
we can write an insert function as:
insert :: Ord a => a -> Tree a -> Tree a
insert x = parat (single x) branch
where branch a (l, l') (r, r')
| x == a = Branch x l r
| x < a = Branch a l' r
| x > a = Branch a l r'
ghci> mytree = insert 2 (Branch 3 Empty Empty)
ghci> mytree
Branch 3 (Branch 2 Empty Empty) Empty
ghci>
testing a bigger tree...
import Data.Function
mytree :: Tree Integer
mytree = (Branch 3 Empty Empty) & insert 2 & insert 4 & insert 6 & insert 5 & insert 10
inorder :: Tree a -> [a]
inorder = foldt [] (\a l r -> l ++ [a] ++ r)
ghci> mytree
Branch 3 (Branch 2 Empty Empty) (Branch 4 Empty (Branch 6 (Branch 5 Empty Empty) (Branch 10 Empty Empty)))
ghci> inorder mytree
[2,3,4,5,6,10]
ghci>

Related

Return the Branch in a Data Tree Haskell

I am trying to return the branches in a Data Tree. Currently my code only returns the first branch.
data DTree a = Leaf a | Branch a (DTree a) (DTree a) deriving (Show)
get_b :: DTree Int -> [Int]
get_branches (Branch x _ _) = [x]
get_branches (Branch _ l r) = get_branches r ++ get_branches l
Example output
ghci > get_branches (Branch 2 (Leaf 3) (Branch 1 (Leaf 9) (Leaf 5)))
[2]
Perhaps this is something youre looking for?
get_branches :: DTree Int -> [Int]
get_branches (Branch x l r) = [x] ++ get_branches r ++ get_branches l
get_branches (Leaf x) = [x]
There is no base case for your function. So use the pattern 'Leaf'.
Additionally, the patterns (Branch x _ _) (Branch _ l r) are the same pattern, except with (Branch _ l r) you get the left and right branch.
Haskell will perform pattern matching from top to bottom. Since a Branch &hellip … … is matched by the first clause it will never trigger the second clause. Furthermore your program does not cover the Leaf case.
If you want to construct a list of values in the inodes (the Branches), then we can work with:
get_branches :: DTree a -> [a]
get_branches (Leaf _) = []
get_branches (Branch x l r) = x : get_branches r ++ get_branches l
here we first return the value of the branch, and then the values of the branches of the subtrees (first the left subtree and then the right subtree). For example with:
get_branches :: DTree a -> [a]
get_branches (Leaf _) = []
get_branches (Branch x l r) = get_branches r ++ x : get_branches l
this will result in a lot of appending, which is linear in the size of the first list. We can prevent this by making calls where we pass a recursive call as tail, so:
get_branches :: DTree a -> [a]
get_branches = go []
where go tl (Leaf _) = tl
go tl (Branch x l r) = go (x : go tl r) l

Haskell binary trees traversal order with fold

I defined my own Data Type BinTree, which describes my binary trees:
data BinTree a = Empty | Node a (BinTree a) (BinTree a) deriving (Show,Eq)
After that I implemented three sort-functions for the binary trees: preorder, inorder and postorder:
preorder :: BinTree a -> [a]
preorder Empty = []
preorder (Node x lt rt) = [x] ++ preorder lt ++ preorder rt
inorder :: BinTree a -> [a]
inorder Empty = []
inorder (Node x lt rt) = inorder lt ++ [x] ++ inorder rt
postorder :: BinTree a -> [a]
postorder Empty = []
postorder (Node x lt rt) = postorder lt ++ postorder rt ++ [x]
To improve my order-functions, I implemented the foldTree function (which works as a normal foldr function, but with binary trees):
foldTree :: (a -> b -> b -> b) -> b -> BinTree -> b
foldTree f e Empty = e
foldTree f e (Node x lt rt) = f x (foldTree f e lt) (foldTree f e rt)
And now I got stuck, because I cant't figure out how to combine the order-functions with the foldTree.
Can someone give me a hint please?
If by "combine" you mean implement each of the three functions using the last one, then this my recent answer seems to be of use, e.g.
preorder t = foldTree (\a l r -> (a :) . l . r) id t []
inorder t = foldTree (\a l r -> l . (a :) . r) id t []
postorder t = foldTree (\a l r -> l . r . (a :)) id t []
Trying it out:
> t = Node 1 (Node 2 Empty (Node 3 Empty Empty)) (Node 4 (Node 5 Empty Empty) Empty)
{-
1
2 4
. 3 5 .
. . . .
-}
> inorder t
[2,3,1,5,4]
> preorder t
[1,2,3,4,5]
> postorder t
[3,2,5,4,1]
The correct type of your function is, of course,
foldTree :: (a -> b -> b -> b) -> b -> BinTree a -> b
-- ^^^

Trying a version of takeWhile on trees in Haskell

Given a tree that looks like this:
data Tree a = Leaf | Node (Tree a) a (Tree a)
And a fold function that looks like this:
foldTree :: (b -> a -> b -> b) -> b -> Tree a -> b
foldTree _ b Leaf = b
foldTree fn b (Node lt x rt) = f (foldTree fn b lt) x (foldTree fn b rt)
I want to be able to write a takeWhileTree function that looks like this:
treeTakeWhile :: (a -> Bool) -> Tree a -> Tree a
I want it to mimic the 'normal' list takeWhile function so that it returns the largest possible tree whose elements satisfy the given condition.
So, if a tree t = Node (Node Leaf 10 Leaf) 4 (Node Leaf 5 Leaf), then:
treeTakeWhile (> 5) T = Leaf
treeTakeWhile (>= 4) T = T
treeTakeWhile (< 5) T = Node Leaf 4 Leaf
treeTakeWHile (< 8) T = Node Leaf 4 (Node Leaf 5 Leaf)
So far I seem to not be able to formulate what to pass into foldTree.
In the definition of foldtree, the function can be broken down as: b probably being the left subtree, a probably being the value in the current node and b probably being the right subtree.
Therefore, the function passed to treeTakeWhile has to be applied to all these parts of the node in some manner while being able to stop when the operation no longer applies.
treeTakeWhile fn = foldTree (\xs x ys -> if y then g else Leaf) Node()
where g = (lt, x, rt)
y = fn (Node lt x rt)
The above is clearly wrong but I am not sure how to express the act of applying the function to the current node's value followed by the left tree and the right tree here. Could somebody point me in the right direction? And how will the fold be able to produce the required tree?
Edit 1:
Okay, based on your feedback, I have gotten to a place where I think I am pretty close to the answer but cannot figure out why the compiler still complains:
treeTakeWhile :: (a -> Bool) -> Tree a -> Tree a
treeTakeWhile c = foldTree f acc
where acc = Leaf
f l x r = if c x then Node (f lt) x (f rt) else Leaf
As far as I can tell, foldTree is being passed the right arguments now. And the predicate is also being evaluated as required at each level of the tree. The return value is always of type Tree as well.
Instead of using foldTree immediately, let's first aim to define the function itself.
Basically there are three options here:
the tree is a Leaf, regardless what the condition is, the result is a Leaf as well;
the tree is a Node and the condition is satsified, then we yield the element, and recurse on the subtrees;
the tree is a Node and the condition is not satisfied, then the result is a Leaf.
We can encode these rules as:
treeTakeWhile :: (a -> Bool) -> Tree a -> Tree a
treeTakeWhile c = go
where go Leaf = Leaf -- (1)
go (Node l x r) | c x = Node (go l) x (go r) -- (2)
| otherwise = Leaf -- (3)
this then yields the expected results:
Prelude> treeTakeWhile (>5) t
Leaf
Prelude> treeTakeWhile (>=4) t
Node (Node Leaf 10 Leaf) 4 (Node Leaf 5 Leaf)
Prelude> treeTakeWhile (<5) t
Node Leaf 4 Leaf
Prelude> treeTakeWhile (<8) t
Node Leaf 4 (Node Leaf 5 Leaf)
Moving this to a foldTree
Now we aim to move the logic to a foldTree, we can thus write the function as:
treeTakeWhile :: (a -> Bool) -> Tree a -> Tree a
treeTakeWhile c = foldTree f x0
where f tl x tr = -- ...
x0 = -- ...
The x0 is the value we should fill in for Leafs, but we already know what that is: it is the first rule (1) and we should thus return a Leaf as well.
For f we need a function Tree a -> a -> Tree a -> Tree a. The first operand tl is the treeTakeWhile of the left subtree (this would be equivalent to go l in the original function implementation), the second parameter x is the value encoded in the Node, and the last parameter tr is the result of treeTakeWhile on the second subtree (so equivalent to go r), so:
treeTakeWhile :: (a -> Bool) -> Tree a -> Tree a
treeTakeWhile c = foldTree f x0
where f tl x tr = -- ...
x0 = -- ...
(leave this as an exercise).

How to lift a fold|map|both into the Either monad?

A "map" of type
mapIsh :: Traversable t => (a -> Either b c) -> t a -> Either b (t c)
would be a start. (Hayoo doesn't find one.) Or a "fold" of type
foldIsh :: (b -> a -> Either l b) -> b -> t a -> Either l b
Best of all (for my case) would be this:
mapAccumIsh :: (a -> b -> Either l (a, c)) -> a -> t b -> Either l (a, t c)
That might be all you need to read. In case you want more details, though, here's a concrete example:
Imagine a treelike structure to mapAccum over. Each branch, after evaluating its children, gets transformed by some function of its children and the accumulator.
Here's some working code that adds each Tree's value to the accumulator, and also adds to each Branch's label the product of its childrens' labels:
module Temp where
import Data.List
data Tree = Leaf Float | Branch Float [Tree] deriving (Show)
label :: Tree -> Float
label (Leaf i) = i
label (Branch i _) = i
f :: Float -> Tree -> (Float, Tree)
f i (Leaf j) = (i+j, Leaf j)
f i (Branch j ts) = (i + tf, Branch tf ts2) where
(i2, ts2) = mapAccumL f i ts
tf = j + (foldl (*) 1 $ map label ts2)
-- the problem: what if instead of (*) in the line above, we used this?
evenMult :: Float -> Float -> Either String Float
evenMult a b = case even $ round b of True -> Right $ a * b
False -> Left "that's odd"
go = f 0 $ Branch 2 [Branch 2 [Leaf 2]
,Branch 2 [Leaf 2, Leaf (-2)]]
Here's what that returns:
(-6.0,Branch (-6.0) [Branch 4.0 [Leaf 2.0]
,Branch (-2.0) [Leaf 2.0,Leaf (-2.0)]])
But what if, instead of using (*) in the foldl, we used evenMult?

How do you represent nested types using the Scott Encoding?

An ADT can be represented using the Scott Encoding by replacing products by tuples and sums by matchers. For example:
data List a = Cons a (List a) | Nil
Can be encoded using the Scott Encoding as:
cons = (λ h t c n . c h t)
nil = (λ c n . n)
But I couldn't find how nested types can be encoded using SE:
data Tree a = Node (List (Tree a)) | Leaf a
How can it be done?
If the Wikipedia article is correct, then
data Tree a = Node (List (Tree a)) | Leaf a
has Scott encoding
node = λ a . λ node leaf . node a
leaf = λ a . λ node leaf . leaf a
It looks like the Scott encoding is indifferent to (nested) types. All it's concerned with is delivering the correct number of parameters to the constructors.
Scott encodings are basically representing a T by the type of its case expression. So for lists, we would define a case expression like so:
listCase :: List a -> r -> (a -> List a -> r) -> r
listCase [] n c = n
listCase (x:xs) n c = c x xs
this gives us an analogy like so:
case xs of { [] -> n ; (x:xs) -> c }
=
listCase xs n (\x xs -> c)
This gives a type
newtype List a = List { listCase :: r -> (a -> List a -> r) -> r }
The constructors are just the values that pick the appropriate branches:
nil :: List a
nil = List $ \n c -> n
cons :: a -> List a -> List a
cons x xs = List $ \n c -> c x xs
We can work backwards then, from a boring case expression, to the case function, to the type, for your trees:
case t of { Leaf x -> l ; Node xs -> n }
which should be roughly like
treeCase t (\x -> l) (\xs -> n)
So we get
treeCase :: Tree a -> (a -> r) -> (List (Tree a) -> r) -> r
treeCase (Leaf x) l n = l x
treeCase (Node xs) l n = n xs
newtype Tree a = Tree { treeCase :: (a -> r) -> (List (Tree a) -> r) -> r }
leaf :: a -> Tree a
leaf x = Tree $ \l n -> l x
node :: List (Tree a) -> Tree a
node xs = Tree $ \l n -> n xs
Scott encodings are very easy tho, because they're only case. Church encodings are folds, which are notoriously hard for nested types.

Resources