Searching a Value in Binary tree haskell - haskell

I have just started learning Haskell and I am trying to write a code for searching for a particular value in a binary tree and if present return true else false
This is how my tree structure looks like
data Tree = Leaf Int | Node Tree Int Tree
I am not sure how to proceed with the function to traverse through the tree and return the value. I did try BFS and DFS but I am not sure on how to return once I have got my value.
An example of how my function should look
Search 5 (Node (Node (Leaf 1) 3 (Leaf 4)) 5 (Node (Leaf 6) 7 (Leaf 9)))

A binary search could be written as follows. The type can be more generic, as we only need the items to be orderable to store / search in a binary tree.
We visit each node and either return true, or search in 1 of the child nodes.
example Node
5
/ \
3 7
lets search for 7.
We first visit the root. since 5 != 7, we test a child node. Since 7 > 5, we search in the right node, since 7 cannot appear in the left child (all values guaranteed to be lower than 5 on the left child)
If we reach a leaf, we just check if it contains the search term.
search :: Ord a => a -> BinaryTree a -> Bool
search a (Leaf b) = compare a b == EQ
search a (Node left b right)
case compare a b of
EQ -> True
LT -> search a left
GT -> search a right

I am not sure how to proceed with the function to traverse through the tree and return the value.
From that sentence, I understand you would have no problem writing a traversal yourself, but that there is a mental leap you need to take to understand how Haskell works.
You see, you never return anything in Haskell. Returning is fundamentally an imperative statement. Haskell is a declarative language, which means that writing programs is done by stating facts. That nuance can be discomforting, especially if you've been introduced to programming through learning imperative languages like C, Java, JavaScript, etc. But once you truly understand it, you will see how much more expressive and easy declarative programming is.
Because of its strong mathematical roots, in Haskell facts are stated in the form of equations, i.e. expressions where the = sign literally means the left- and right-hand side are equal (whereas in an imperative language, it would probably mean that you assign a value to a variable -- that does not exist in Haskell).
The program #Haleemur Ali wrote is in 1:1 correspondence with how you would write search using math notation:
search(x, t) = { x == y if t = Leaf y
, true if t = Node l y r and x == y
, search(x, l) if t = Node l y r and x < y
, search(x, r) if t = Node l y r and x > y
}
Indeed many times, at least as a beginner, writing Haskell is just a matter of translation, from math notation to Haskell notation. Another interpretation of Haskell programs is as proofs of theorems. Your search is a theorem saying that "if you have a tree and an integer, you can always tell if the integer is somewhere inside the tree". That's what you are telling the compiler when you write a function signature:
search :: Int -> Tree -> Bool
The compiler will only be happy if you write a proof for that theorem ... you probably guessed that the algorithm above is the proof.
An interesting observation is that the algorithm is almost dictated by the shape of the data type. Imagine you wanted to sum all the values in a tree instead:
sum(t) = { x if t = Leaf x
, x + sum(l) + sum(r) if t = Node l x r
}
Every time you want to write an algorithm over a binary tree, you will write something like the above. That is fairly mechanical and repetitive. What if later on you expand your program to deal with rose trees? Tries? You don't want to write the same algorithms and take the risk of making a mistake. One would try to come up with a function that walks down a tree and combines its values (using Haskell notation from now on):
walk :: (Int -> b) -> (b -> b -> b) -> Tree -> b
walk f g (Leaf x) = f x
walk f g (Node l x r) =
let a = walk f g l
b = walk f g r
in g (g (f x) a) b
With this function alone, you can write all manners of traversals on trees:
sum t = walk id (+) t
search x t = walk (== x) (||) t
walk is such a recurring pattern that it has been abstracted. All the data structures that expose the same pattern of recursion are said to be foldable, and the implementation is often so obvious that you can ask the compiler to write it for you, like so:
{-# LANGUAGE DeriveFoldable #-}
data Tree a = Leaf a | Node (Tree a) a (Tree a) deriving (Foldable)
There's even a definition of sum for any foldable data structure.

Related

Attempting to construct trees in Haskell

I am trying to use an unfold function to build trees.
Tree t = Leaf | Node (Tree t) t (Tree t)
unfoldT :: (b -> Maybe (b, a, b)) -> b -> Tree a
unfoldT f b =
case f b of
Nothing -> Leaf
Just (lt, x, rt) -> Node (unfoldT f lt) x (unfoldT f rt)
The build function needs to create a tree that has a height equal to the number provided, as well as be numbered in an in-order fashion. The base case being build 0 = Leaf and the next being build 1 = Node (Leaf 0 Leaf).
build :: Integer -> Tree Integer
My attempt at solving it:
build n = unfoldT (\x -> Just x) [0..2^n-2]
I am not entirely sure how to go about constructing the tree here.
Would love it if somebody could point me in the right direction.
Edit 1:
If I was to use a 2-tuple, what would I combine? I need to be able to refer to the current node, its left subtree and its right subtree somehow right?
If I was to use a 2-tuple, what would I combine?
I would recommend to pass the remaining depth as well as the offset from the left:
build = unfoldT level . (0,)
where
level (_, 0) = Nothing
level (o, n) = let mid = 2^(n-1)
in ((o, n-1), o+mid-1, (o+mid, n-1))
If I was to use a 2-tuple, what would I combine?
That's the key question behind the state-passing paradigm in functional programming, expressed also with the State Monad. We won't be dealing with the latter here, but maybe use the former.
But before that, do we really need to generate all the numbers in a list, and then work off that list? Don't we know in advance what are the numbers we'll be working with?
Of course we do, because the tree we're building is totally balanced and fully populated.
So if we have a function like
-- build2 (depth, startNum)
build2 :: (Int, Int) -> Tree Int
we can use it just the same to construct both halves of e.g. the build [0..14] tree:
build [0..14] == build2 (4,0) == Node (build2 (3,0)) 7 (build2 (3,8))
Right?
But if we didn't want to mess with the direct calculations of all the numbers involved, we could arrange for the aforementioned state-passing, with the twist to build2's interface:
-- depth, startNum tree, nextNum
build3 :: (Int, Int) -> (Tree Int, Int)
and use it like
build :: Int -> Tree Int -- correct!
build depth = build3 (depth, 0) -- intentionally incorrect
build3 :: (Int, Int) -> (Tree Int, Int) -- correct!
build3 (depth, start) = Node lt n rt -- intentionally incorrect
where
(lt, n) = build3 (depth-1, start) -- n is returned
(rt, m) = build3 (depth-1, n+1) -- and used, next
You will need to tweak the above to make all the pieces fit together (follow the types!), implementing the missing pieces of course and taking care of the corner / base cases.
Formulating this as an unfold would be the next step.

Haskell binary tree max int?

I'm trying to write a haskell function that will return the max int inside a binary tree of integers. My binary tree is defined as follows:
data Tree = Node Int Tree Tree | Leaf Int
deriving (Eq,Show)
The way I understand it this declaration is saying that for the 'Tree' data type, it can either be a single leaf int, or be a subtree containing two more trees.
So my maxInt function will look something like this ( I think )
maxInt :: Tree -> Int --maxInt function receives Tree, returns int
maxInt --something to detect if the Tree received is empty
--if only one node, return that int
--look through all nodes, find largest
and so when my function is given something like
maxInt (Node 5 (Leaf 7) (Leaf 2)) , the correct value for maxInt to return would be 7.
I'm new to haskell and don't really know where to start with this problem, I would really appreciate some guidance. Thank you
Let me start it for you:
maxInt :: Tree -> Int
maxInt (Leaf x) = ?
maxInt (Node x l r) = ?
You may find it helpful to use the standard function max, which takes two arguments and returns their maximum:
max 3 17 = 17
To begin with, we have this datatype:
data Tree = Node Int Tree Tree | Leaf Int
deriving (Eq,Show)
That means, we have two constructors for things of type Tree: either we have a Leaf with a single Int value, or we have a Node which allows us to represent bigger trees in a recursive fashion.
So, for example we can have these trees:
Leaf 0
And more complex ones:
Node 3 (Leaf 0) (Leaf 4)
Recall that this tree representation have information both in the leaves and in the nodes, so for our function we will need to take that into account.
You guessed correctly the type of the function maxInt, so you are halfway through!
In order to define this function, given we have a custom defined datatype, we can be confident in using pattern-matching.
Pattern-matching is, putting it simple, a way to define our functions by equations described by, on the left side, one element of our datatype (either Leaf or Node, in our case) and on the right side, the result value. I'd recommend you to learn more about pattern-matching here: pattern matching in Haskell
Hence, we start our function by its type, as you correctly guessed:
maxInt :: Tree -> Int
As we have seen earlier, we will use pattern-matching for this. What would be the first equation, that is, the first pattern-matching case for our function? The simplest tree we have given our datatype is Leaf value. So we start with:
maxInt (Leaf n) = n
Why n as a result? Because we don't have any other value than n in the tree and therefore it's the maximum.
What happens in a more complex case?
maxInt (Node n leftTree rightTree) = ...
Well... we can think that the maximum value for the tree (Node n leftTree rightTree) would be the maximum among n, the maximum value of leftTree and rightTree.
Would you be encouraged to write the second equation? I strongly recommend you to first read the chapter of the book I just linked above. Also, you might want to read about recursion in Haskell.

Generate n-ary binary tree in haskell

I try to create random trees from this structure:
data Tree a = Leaf a
| Funktion (Tree a) (Tree a)
| Lambda (Tree a)
deriving (Show)
The problem I have is that I don't even know how to generate a tree with the depth of (for example) 2 which only has "Lambda" as nodes. If someone could help me with generating this simple tree (depth 2) I could generate them randomly.
If I implement a function like this:
build (Tree a) 0 = Leaf "A"
build (Tree a) n = build (Lambda a) (n-1)
It won't work since the function build itself expects a Tree as input. Actually I need trees which have the nodes Lambda or Funktion, but first of all I need to understand how to generate a simple version of this structure.
It sounds like you want something like
build :: Natural -- ^ Number, n, of Lambda nodes
-> a -- ^ Value, v, to store in the Leaf
-> Tree a -- ^ Lambda^n (Leaf v)
build 0 a = Leaf a
build n a = Lambda (build (n - 1) a)
So that build 4 "A" will produce
Lambda (Lambda (Lambda (Lambda (Leaf "A"))))
Unfortunately, the rest of your question (about generating random trees) really requires substantially more context to answer.
You're close - your build function at the moment will always return a Leaf, as the base case of the recursion doesn't do anything with its argument. You don't actually need the argument:
build :: Integer -> Tree String
build 0 = Leaf "A"
build n = Lambda $ build (n-1)
This would produce what your build function seems to be intending, that is a simple "tree" of given depth n composed of a Leaf node and Lambda nodes. Eg:
λ> build 2
Lambda (Lambda (Leaf "A"))
To get your randomly generated tree, you need to look at the System.Random module.
For example building on the previous, a tree of random height between a given upper and lower bound:
buildRandom :: (Integer, Integer) -> IO (Tree String)
buildRandom bounds = randomRIO bounds >>= return . build
This can be extended and modified to produce the behaviour you want (a fully random generated tree), but requires knowledge of monads etc., which might take some extra reading.

Why in algebraic data types, if I can define a special `from` and `to` function for two types, the two types can be considered equality?

I'm reading this blog: http://chris-taylor.github.io/blog/2013/02/10/the-algebra-of-algebraic-data-types/
It says:
However, when I talk about equality, I don’t mean Haskell equality, in the sense of the (==) function. Instead, I mean that the two types are in one-to-one correspondence – that is, when I say that two types a and b are equal, I mean that you could write two functions
from :: a -> b
to :: b -> a
that pair up values of a with values of b, so that the following equations always hold (here the == is genuine, Haskell-flavored equality):
to (from a) == a
from (to b) == b
And later, there are many laws based on this definition:
Add Void a === a
Add a b === Add b a
Mul Void a === Void
Mul () a === a
Mul a b === Mul b a
I can't understand why we can safely get these laws based on the definition of "equality"? Can use use other definitions? What can we do with this definition? Does it make sense for Haskell type systems?
The term that the author is skating around, so as not to "mention category theory or advanced math", is cardinality. He defines two types to be ===-equal to each other if they have equal cardinality -- that is, if there are as many possible values of one as there are of the other.
Because if two types have equal cardinality, there exists an isomorphism between them. Mul () Bool may be a different type than Bool, but there are exactly as many members of one as the other, and one can trivially define a function to go from one to the other, or the other to the one. (Not that there is only one such isomorphism -- the point is, you could choose one.)
It's not a great approach. It works fine for finite sets, basically, but it introduces unfortunate side effects for infinite sets, like Add Int Int === Int. Still, for the basic description of addition and multiplication of types, it seems to serve.
Informally speaking, when two mathematical structures A,B have two "nice" functions from,to satifying from . to == id and to . from == id, the structures A,B are said to be isomorphic.
The actual definition of "nice" function varies with the kind of structure at hand (and sometimes, different definitions of "nice" give rise to distinct notions of isomorphism).
The idea behind isomorphic structures is that, roughly, they "work" in exactly the same way. For instance, consider a structure A made by the booleans True,False with &&,|| as operations. Let then structure B made of the two naturals1,0 with min,max as operations. These are different structures, yet they share the same rules. For instance True && x == x and 1 `min` x == x for all x. A and B are isomorphic: function to will map True to 1, and False to 0, while from will perform the opposite mapping.
(Note that while we could map True to 0 and False to 1, and we would still get from . to == id and its dual, this mapping would not be considered "nice" since it would not preserve the structure: e.g., to (True && x) == to x yet to (True && x) == to True `min` to x == 0 `min` to x == 0 .)
Another example in a different setting: consider A to be a circle in a plane, while B is a square in such plane. One can then define continuous mappings to,from between them. This can be done with any two "closed loops", loosely speaking, which can be said to be isomorphic. Instead a circle and an "eight" shape do not admit such continuous mappings: the self-intersecting point in the "eight" can not be mapped to any point in the circle in a continuous way (roughly, four "ways" depart from it, while points in the circle have only two such "ways").
In Haskell, types are similarly said to be isomorphic when two Haskell-definable functions from,to exist between them satisfying the rules above. Here being a "nice" function just means being definable in Haskell. The linked web blog shows a few such isomorphic types. Here's another example, using recursive types:
List1 a = Add Unit (Mul a (List1 a))
List2 a = Add Unit (Add a (Mul a (Mul a (List2 a))))
Intuitively, the first reads as: "a list is either the empty list, or a pair made of an element and a list". The second reads as: "a list is either the empty list, or a single element, or a triple make of an element, another element, and a list". One can convert between the two by handling the elements two at a time.
Another example:
Tree a = Add Unit (Mul a (Mul (Tree a) (Tree a)))
You can prove that the type Tree Unit is isomorphic to List1 (Tree Unit) by exploiting the algebraic laws fond in the blog. Below, = stands for isomorphism.
List1 (Tree Unit)
-- definition of List1 a
= Add Unit (Mul (Tree Unit) (List1 (Tree Unit)))
-- by inductive hypothesis, the inner `List1 (Tree Unit)` is isomorphic to `Tree Unit`
= Add Unit (Mul (Tree Unit) (Tree Unit))
-- definition of Tree a
= Tree Unit
The above proof sketch induces the function to as follows.
data Add a b = InL a | InR b
data Mul a b = P a b
type Unit = ()
newtype List1 a = List1 (Add Unit (Mul a (List1 a)))
newtype Tree a = Tree (Add Unit (Mul a (Mul (Tree a) (Tree a))))
to :: List1 (Tree Unit) -> Tree Unit
to (List1 (InL ())) = Tree (InL ())
to (List1 (InR (P t ts))) = Tree (InR (P () (P t (to ts))))
Note how recursive call plays the role the inductive hypothesis has in the proof.
Writing from is left as an exercise :-P
Why in algebraic data types, if I can define a special from and to function for two types, the two types can be considered equal?
Well, the better term to use here isn't "equal," but rather isomorphic. The thing is that when two types are isomorphic, they are basically interchangeable with each other; any program written in terms of A could, in principle, be written in terms of B instead, without changing the meaning of the program. Suppose you have:
from :: A -> B
to :: B -> A
and these two functions constitute an isomorphism, that is:
to (from a) == a
from (to b) == b
Now, if you have any function that takes A as an argument, you can for example write a counterpart that takes B as an argument instead:
foo :: B -> Something
foo = originalFoo . from
where originalFoo :: A -> Something
originalFoo a = ...
And for any function that produces an A, you can likewise do this:
bar :: Something -> B
bar = to . originalBar
where originalBar :: Something -> A
originalBar something = ...
Now you've hidden all uses of A inside the where subdefinitions. You could continue down this path and mechanically eliminate all uses of A entirely, and you're guaranteed the program will work the same as when you started.

Catamorphism and tree-traversing in Haskell

I am impatient, looking forward to understanding catamorphism related to this SO question :)
I have only practiced the beginning of Real World Haskell tutorial. So, Maybe I'm gonna ask for way too much right now, if it was the case, just tell me the concepts I should learn.
Below, I quote the wikipedia code sample for catamorphism.
I would like to know your opinion about foldTree below, a way of traversing a Tree, compared to this other SO question and answer, also dealing with traversing a Tree n-ary tree traversal. (independantly from being binary or not, I think the catamorphism below can be written so as to manage n-ary tree)
I put in comment what I understand, and be glad if you could correct me, and clarify some things.
{-this is a binary tree definition-}
data Tree a = Leaf a
| Branch (Tree a) (Tree a)
{-I dont understand the structure between{}
however it defines two morphisms, leaf and branch
leaf take an a and returns an r, branch takes two r and returns an r-}
data TreeAlgebra a r = TreeAlgebra { leaf :: a -> r
, branch :: r -> r -> r }
{- foldTree is a morphism that takes: a TreeAlgebra for Tree a with result r, a Tree a
and returns an r -}
foldTree :: TreeAlgebra a r -> Tree a -> r
foldTree a#(TreeAlgebra {leaf = f}) (Leaf x ) = f x
foldTree a#(TreeAlgebra {branch = g}) (Branch l r) = g (foldTree a l) (foldTree a r)
at this point I am having many difficulties, I seem to guess that the morphism leaf
will be applied to any Leaf
But so as to use this code for real, foldTree needs to be fed a defined TreeAlgebra,
a TreeAlgebra that has a defined morphism leaf so as to do something ?
but in this case in the foldTree code I would expect {f = leaf} and not the contrary
Any clarification from you would be really welcome.
Not exactly sure what you're asking. But yeah, you feed a TreeAlgebra to foldTree corresponding to the computation you want to perform on the tree. For example, to sum all the elements in a tree of Ints you would use this algebra:
sumAlgebra :: TreeAlgebra Int Int
sumAlgebra = TreeAlgebra { leaf = id
, branch = (+) }
Which means, to get the sum of a leaf, apply id (do nothing) to the value in the leaf. To get the sum of a branch, add together the sums of each of the children.
The fact that we can say (+) for branch instead of, say, \x y -> sumTree x + sumTree y is the essential property of the catamorphism. It says that to compute some function f on some recursive data structure it suffices to have the values of f for its immediate children.
Haskell is a pretty unique language in that we can formalize the idea of catamorphism abstractly. Let's make a data type for a single node in your tree, parameterized over its children:
data TreeNode a child
= Leaf a
| Branch child child
See what we did there? We just replaced the recursive children with a type of our choosing. This is so that we can put the subtrees' sums there when we are folding.
Now for the really magical thing. I'm going to write this in pseudohaskell -- writing it in real Haskell is possible, but we have to add some annotations to help the typechecker which can be kind of confusing. We take the "fixed point" of a parameterized data type -- that is, constructing a data type T such that T = TreeNode a T. They call this operator Mu.
type Mu f = f (Mu f)
Look carefully here. The argument to Mu isn't a type, like Int or Foo -> Bar. It's a type constructor like Maybe or TreeNode Int -- the argument to Mu itself takes an argument. (The possibility of abstracting over type constructors is one of the things that makes Haskell's type system really stand out in its expressive power).
So the type Mu f is defined as taking f and filling in its type parameter with Mu f itself. I'm going to define a synonym to reduce some of the noise:
type IntNode = TreeNode Int
Expanding Mu IntNode, we get:
Mu IntNode = IntNode (Mu IntNode)
= Leaf Int | Branch (Mu IntNode) (Mu IntNode)
Do you see how Mu IntNode is equivalent to your Tree Int? We have just torn the recursive structure apart and then used Mu to put it back together again. This gives us the advantage that we can talk about all Mu types at once. This gives us what we need to define a catamorphism.
Let's define:
type IntTree = Mu IntNode
I said the essential property of the catamorphism is that to compute some function f, it suffices to have the values of f for its immediate children. Let's call the type of the thing we are trying to compute r, and the data structure node (IntNode would be a possible instantiation of this). So to compute r on a particular node, we need the node with its children replaced with their rs. This computation has type node r -> r. So a catamorphism says that if we have one of these computations, then we can compute r for the entire recursive structure (remember recursion is denoted explicitly here with Mu):
cata :: (node r -> r) -> Mu node -> r
Making this concrete for our example, this looks like:
cata :: (IntNode r -> r) -> IntTree -> r
Restating, if we can take a node with rs for its children and compute an r, then we can compute an r for an entire tree.
In order to actually compute this, we need node to be a Functor -- that is we need to be able to map an arbitrary function over the children of a node.
fmap :: (a -> b) -> node a -> node b
This can be done straightforwardly for IntNode.
fmap f (Leaf x) = Leaf x -- has no children, so stays the same
fmap f (Branch l r) = Branch (f l) (f r) -- apply function to each child
Now, finally, we can give a definition for cata (the Functor node constraint just says that node has a suitable fmap):
cata :: (Functor node) => (node r -> r) -> Mu node -> r
cata f t = f (fmap (cata f) t)
I used the parameter name t for the mnemonic value of "tree". This is an abstract, dense definition, but it is really very simple. It says: recursively perform cata f -- the computation we are doing over the tree -- on each of t's children (which are themselves Mu nodes) to get a node r, and then pass that result to f compute the result for t itself.
Tying this back to the beginning, the algebra you are defining is essentially a way of defining that node r -> r function. Indeed, given a TreeAlgebra, we can easily get the fold function:
foldFunction :: TreeAlgebra a r -> (TreeNode a r -> r)
foldFunction alg (Leaf a) = leaf alg a
foldFunction alg (Branch l r) = branch alg l r
Thus the tree catamorphism can be defined in terms of our generic one as follows:
type Tree a = Mu (TreeNode a)
treeCata :: TreeAlgebra a r -> (Tree a -> r)
treeCata alg = cata (foldFunction alg)
I'm out of time. I know that got really abstract really fast, but I hope it at least gave you a new viewpoint to help your learning. Good luck!
I think you were were asking a question about the {}'s. There is an earlier question with a good discussion of {}'s. Those are called Haskell's record syntax. The other question is why construct the algebra. This is a typical function paradigm where you generalize data as functions.
The most famous example is Church's construction of the Naturals, where f = + 1 and z = 0,
0 = z,
1 = f z,
2 = f (f z),
3 = f (f (f z)),
etc...
What you are seeing is essentially the same idea being applied to a tree. Work the church example and the tree will click.

Resources