Pattern matching warning with binary tree - haskell

I'm a beginner at Haskell and am having some trouble with understanding the warnings I get. I have implemented a binary tree,
data Tree a = Nil | Node a (Tree a) (Tree a) deriving (Eq, Show, Read)
and it works fine, but I get an incomplete-patterns warning on this code:
get :: Ord a => a -> Tree a -> Maybe a
get _ Nil = Nothing
get x (Node v lt rt)
    | x == v = Just x
    | x < v = get x lt
    | x > v = get x rt
The pattern it wants me to match is _ (Node _ _ _ ). I'm not sure what this pattern means?

There are two problems here. First of all, the datatype:
data Tree a = Nil | Node a (Tree left) (Tree right) deriving (Eq, Show, Read)
--                               ^ left?     ^ right?
In your data definition, you make use of left and right, but those are not defined in the head of the data definition, therefore these are not type parameters. You probably wanted to say:
data Tree a = Nil
            | Node { value :: a, left :: Tree a, right :: Tree a }
            deriving (Eq, Show, Read)
But now we still get a warning:
hs.hs:5:1: Warning:
Pattern match(es) are non-exhaustive
In an equation for ‘get’: Patterns not matched: _ (Node _ _ _)
Ok, modules loaded: Main.
The problem here is that Haskell does not know that two values can only be related by <, == or >.
If you write an instance of Ord, then you accept a "contract" that you will define a total ordering. In other words, for any two values x and y, it holds that x < y, x > y or x == y. The problem, however, is that Haskell does not know that. For Haskell, each of the functions (<), (==) and (>) can return True or False. Therefore, since a compiler is always conservative, it considers the case where there are two values such that x < y, x == y and x > y all fail (if you had hypothetically written foo x y, bar x y and qux x y instead, this could certainly happen, since those are three black-box functions). You can resolve it by writing otherwise in the last case:
get :: Ord a => a -> Tree a -> Maybe a
get _ Nil = Nothing
get x (Node v lt rt)
    | x == v = Just x
    | x < v = get x lt
    | otherwise = get x rt
otherwise is an alias for True, so there is no possibility of not taking that branch. So now the conservative compiler understands that, regardless of what the values of x and v are, it will always take some branch, because if it does not take the first two, it will certainly take the last one.
You may think that this is weird, but since the contracts are usually not specified in a formal language (only in the documentation, that is, in natural language), the compiler has no means of knowing them: as a programmer you could decide not to respect the contracts (but note that this is a very bad idea). Even if you write a formal contract, as a programmer you can usually still decide not to respect it, and furthermore a compiler cannot always do the required logical reasoning about formal contracts.
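To make the "contract can be broken" point concrete, here is a small sketch using a value that really does break it: the floating-point NaN, for which all three comparisons are False. The names nan, allThreeFail and lookupNaN are made up for this illustration.

```haskell
-- Tree and get as in the answer, with the `otherwise` guard.
data Tree a = Nil | Node a (Tree a) (Tree a) deriving (Eq, Show, Read)

get :: Ord a => a -> Tree a -> Maybe a
get _ Nil = Nothing
get x (Node v lt rt)
    | x == v    = Just x
    | x < v     = get x lt
    | otherwise = get x rt

-- NaN: a Double for which (<), (==) and (>) all return False.
nan :: Double
nan = 0 / 0

-- None of the three comparisons holds for NaN against 1.
allThreeFail :: Bool
allThreeFail = not (nan < 1) && not (nan == 1) && not (nan > 1)

-- With `otherwise`, the lookup falls through to the right subtree and
-- ends in Nothing. With a final `x > v` guard, no guard would match and
-- the call would fail at run time with a non-exhaustive guards error.
lookupNaN :: Maybe Double
lookupNaN = get nan (Node 1 Nil Nil)
```

So the warning is not purely theoretical: an Ord instance from the standard library already contains values where no guard of the original version would fire.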

Willem Van Onsem has already explained the issue well. I only want to add that it is possible to perform a comparison between x and v in a very similar way to the posted code, whose branches are however found exhaustive by the compiler.
Instead of
get :: Ord a => a -> Tree a -> Maybe a
get _ Nil = Nothing
get x (Node v lt rt)
    | x == v = Just x
    | x < v = get x lt
    | x > v = get x rt
simply use
get :: Ord a => a -> Tree a -> Maybe a
get _ Nil = Nothing
get x (Node v lt rt) = case compare x v of
    EQ -> Just x
    LT -> get x lt
    GT -> get x rt
Indeed, compare is a function taking two arguments and returning a value of the enumerated type Ordering, which can only be EQ (equal), LT (less than) or GT (greater than). Since this is an algebraic type, GHC can see that all its constructors are handled by the case.
Further, depending on the actual type a, using compare can be more efficient. E.g., when comparing two potentially long strings, it's suboptimal to traverse them twice (if not three times, as in the original code): compare makes only a single pass over both strings and determines which order relation holds.
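A small usage sketch of the compare-based version, assuming a hand-built sample tree of strings (sampleTree, found and missing are made-up names). Each step of the search calls compare once on the two keys.

```haskell
data Tree a = Nil | Node a (Tree a) (Tree a) deriving (Eq, Show, Read)

get :: Ord a => a -> Tree a -> Maybe a
get _ Nil = Nothing
get x (Node v lt rt) = case compare x v of
    EQ -> Just x
    LT -> get x lt
    GT -> get x rt

-- Hypothetical sample data: a three-node search tree of strings.
sampleTree :: Tree String
sampleTree = Node "m" (Node "c" Nil Nil) (Node "x" Nil Nil)

found, missing :: Maybe String
found   = get "c" sampleTree   -- LT at "m", then EQ at "c"
missing = get "q" sampleTree   -- GT at "m", LT at "x", then Nil
```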

Related

Searching for a value in a TriTree in Haskell

I recently just started learning Haskell and I am trying to implement a function for searching for a specific value in a tri tree which returns true if the value is present and false otherwise. This is how my type looks:
data TriTree a
= Empty
| NodeOne a (TriTree a) (TriTree a) (TriTree a)
| NodeTwo a a (TriTree a) (TriTree a) (TriTree a)
deriving (Show)
The tree is either empty or contains at least one internal node. Each internal node stores one or two data values and has at most three child nodes (left, middle, right).
It's not clear to me how to proceed with the search function to traverse through the tree and return the value.
Since you defined a recursive data structure, it makes sense to have a recursive function to traverse it. To do that, I'd start by anchoring the recursion in the trivial case. Since an empty tree doesn't contain anything, the check will always be false:
elem' _ Empty = False
Now to the recursive part: in the case of a NodeOne, we need to check whether the value is inside that node or in any of the subtrees of that node:
elem' x (NodeOne v a b c) = x == v || x `elem'` a || x `elem'` b || x `elem'` c
The remaining case is for NodeTwo and I leave that for you to figure out. It shouldn't be difficult, as it is just a generalization of the line above:
elem' x _ = undefined -- remaining case
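For readers who want to check their own attempt, one possible completion of the remaining case is sketched below. It is only one way to write it; the Eq constraint in the signature is an assumption, since the answer does not give a type signature.

```haskell
data TriTree a
    = Empty
    | NodeOne a (TriTree a) (TriTree a) (TriTree a)
    | NodeTwo a a (TriTree a) (TriTree a) (TriTree a)
    deriving (Show)

elem' :: Eq a => a -> TriTree a -> Bool
elem' _ Empty = False
elem' x (NodeOne v a b c) =
    x == v || x `elem'` a || x `elem'` b || x `elem'` c
-- NodeTwo: same idea, but there are two stored values to compare against.
elem' x (NodeTwo v w a b c) =
    x == v || x == w || x `elem'` a || x `elem'` b || x `elem'` c
```

Because (||) is lazy in its second argument, the search stops descending as soon as one of the comparisons succeeds.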

Determine if binary tree is BST haskell

I'm trying to write a bool function to return True if a binary tree is a bst using recursion, and I need a little guidance on haskell syntax.
I understand that for a binary tree to be a bst, the left subtree must always contain only nodes less than the head. and the right subtree must always contain only nodes greater than the head. I was structuring my function as such:
isBST :: Tree -> Bool --receive Tree, return bool
isBST (Leaf i) = True --return true if its only one leaf in tree
isBST (Node h l r) = if (((isBST l) < h) && ((isBST r) > h)) then True else False
--return true if left subtree < head AND right subtree > head
But this code results in the error:
Couldn't match expected type ‘Bool’ with actual type ‘Int’
Referring to the < h and > h parts specifically. Is it something wrong with my haskell formatting? Thanks in advance
Is it something wrong with my haskell formatting?
No, it is a semantic error. You write:
(isBST l) < h
So this means you ask Haskell to determine whether l is a binary search tree, which is True or False, but you cannot compare True or False with h. Even if you could (some languages see True as 1 and False as 0), it would still be incorrect, since we want to know whether all nodes in the left subtree are less than h.
So we will somehow need to define bounds. A way to do this is to pass parameters through the recursion and perform checks. A problem with this is that the root of the tree, for example, has no bounds. We can fix this by using a Maybe Int as a boundary: if it is Nothing, the boundary is "inactive", so to speak; if it is Just b, then the boundary is "active" with value b.
To make this check more convenient, we can first write a helper that tests a value against such an optional bound:
checkBound :: (a -> a -> Bool) -> Maybe a -> a -> Bool
checkBound _ Nothing _ = True
checkBound f (Just b) x = f b x
So now we can make a "sandwich check" with:
sandwich :: Ord a => Maybe a -> Maybe a -> a -> Bool
sandwich low upp x = checkBound (<) low x && checkBound (>) upp x
So sandwich is given a lowerbound and an upperbound (both Maybe as), and a value, and checks the lower and upper bounds.
So we can write a function isBST' with:
isBST' :: Maybe Int -> Maybe Int -> Tree -> Bool
isBST' low upp ... = ....
There are two cases we need to take into account: the Leaf x case, in which the "sandwich constraint" should be satisfied, and the Node h l r case, in which h should satisfy the "sandwich constraint" and furthermore l and r should satisfy different sandwich constraints. For the Leaf x case it thus looks like:
isBST' low upp (Leaf x) = sandwich low upp x
For the node case, we first check the same constraint, and then enforce a sandwich between low and h for the left part l, and a sandwich between h and upp for the right part r, so:
isBST' low upp (Node h l r) = sandwich low upp h &&
                              isBST' low jh l &&
                              isBST' jh upp r
    where jh = Just h
Now the only problem we still have is to call isBST' with the root element: here we use Nothing as initial bounds, so:
isBST :: Tree -> Bool
isBST = isBST' Nothing Nothing
There are of course other ways to enforce constraints, like passing and updating functions, or implementing four variants of the isBST' function that each check a subset of the constraints.
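Putting the pieces of this answer together gives the following self-contained sketch. The Tree type is an assumption (the question does not show its definition, only the constructors Leaf and Node), and good and bad are made-up test trees.

```haskell
-- Assumed tree type: the question does not show its definition.
data Tree = Leaf Int | Node Int Tree Tree

-- A missing bound (Nothing) accepts every value.
checkBound :: (a -> a -> Bool) -> Maybe a -> a -> Bool
checkBound _ Nothing  _ = True
checkBound f (Just b) x = f b x

-- low < x and upp > x, for whichever bounds are present.
sandwich :: Ord a => Maybe a -> Maybe a -> a -> Bool
sandwich low upp x = checkBound (<) low x && checkBound (>) upp x

isBST' :: Maybe Int -> Maybe Int -> Tree -> Bool
isBST' low upp (Leaf x)     = sandwich low upp x
isBST' low upp (Node h l r) =
    sandwich low upp h && isBST' low jh l && isBST' jh upp r
  where jh = Just h

isBST :: Tree -> Bool
isBST = isBST' Nothing Nothing

-- Hypothetical examples.
good, bad :: Tree
good = Node 5 (Node 3 (Leaf 1) (Leaf 4)) (Leaf 8)
bad  = Node 5 (Node 3 (Leaf 1) (Leaf 6)) (Leaf 8)  -- 6 sits to the left of 5
```

The tree bad shows why comparing a node only with its direct children is not enough: 6 is larger than its parent 3, as it should be, but it violates the upper bound 5 inherited from the root.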
Martin, I'd recommend you to look at Willem's answer.
Another thing: you could also use the maxInt function that you asked about in a previous question to define this function:
isBST (Node h l r) = ... (maxInt l) ... -- at some point we will need to use this
Taking your definition of BSTs:
I understand that for a binary tree to be a bst, the left subtree must
always contain only nodes less than the head. and the right subtree
must always contain only nodes greater than the head.
I'll add that also the subtrees of a node should be BSTs as well.
So we can define this requirement with:
isBST (Node h l r) =
    ((maxInt l) < h) -- the left subtree must contain nodes less than the head
    && ((minInt r) > h) -- the right must contain nodes greater than the head
    && (...) -- the left subtree should be a BST
    && (...) -- the right subtree should be a BST
Note that you might also need to define minInt :: Tree -> Int, which you probably know how to do.
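A sketch of how this outline might be filled in. The Tree type and the bodies of maxInt and minInt are assumptions, since neither the type nor the earlier maxInt question is shown here.

```haskell
-- Assumed tree type; the question does not show its definition.
data Tree = Leaf Int | Node Int Tree Tree

-- Largest and smallest value stored anywhere in a tree (assumed definitions).
maxInt, minInt :: Tree -> Int
maxInt (Leaf x)     = x
maxInt (Node h l r) = maximum [h, maxInt l, maxInt r]
minInt (Leaf x)     = x
minInt (Node h l r) = minimum [h, minInt l, minInt r]

isBST :: Tree -> Bool
isBST (Leaf _)     = True
isBST (Node h l r) =
    maxInt l < h      -- everything on the left is below the head
    && minInt r > h   -- everything on the right is above the head
    && isBST l        -- the left subtree is itself a BST
    && isBST r        -- the right subtree is itself a BST
```

This version walks each subtree again at every level, so it does more work than passing bounds down the recursion, but it follows the stated definition almost word for word.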
I like Willem Van Onsem's pedagogical approach in his answer.
I was going to delete my answer, but am going to post a "correction" instead, at the risk of being wrong again:
data Tree = Empty | Node Int Tree Tree deriving Show
isBST :: Tree -> Bool
isBST Empty = True
isBST (Node h l r) = f (<=h) l && f (>=h) r && isBST l && isBST r
    where
        f _ Empty = True
        f c (Node h l r) = c h && f c l && f c r
Note that I'm using Wikipedia's definition of BST, that
the key in each node must be greater than or equal to any key stored
in the left sub-tree, and less than or equal to any key stored in the
right sub-tree.

Check if the list is ascending (Haskell)

I am writing a function to check if a tree is a BST. All I've tried so far is to flatten the tree into a list with an in-order traversal and then check if the list is increasing. However, I am getting this error:
Couldn't match expected type `a' against inferred type `[t]'
`a' is a rigid type variable bound by
the type signature for `checkList' at BST.hs:24:18
In the pattern: x : y : xs
In the pattern: [x : y : xs]
In the definition of `checkList':
checkList [x : y : xs] = x <= y && checkList (y : xs)
Here is what I have so far (only a checkList function).
checkList :: (Ord a) => [a] -> Bool
checkList [] = True
checkList [x] = True
checkList [x:y:xs] = x <= y && checkList (y:xs)
You want:
checkList :: (Ord a) => [a] -> Bool
checkList [] = True
checkList [x] = True
checkList (x:y:xs) = x <= y && checkList (y:xs)
When you tried to use [ ] in the final pattern, you were saying "match against a list that contains x:y:xs (also a list!) as its sole element". Which doesn't match the type [a].
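For context, the overall approach from the question (flatten in order, then check the list) could look like the sketch below. The Tree type and the names inorder and isBST are assumptions; the question only shows checkList.

```haskell
-- Hypothetical tree type for illustration; the question does not show one.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- In-order traversal: left subtree, then the node, then the right subtree.
inorder :: Tree a -> [a]
inorder Leaf         = []
inorder (Node l v r) = inorder l ++ [v] ++ inorder r

checkList :: Ord a => [a] -> Bool
checkList []       = True
checkList [_]      = True
checkList (x:y:xs) = x <= y && checkList (y:xs)

isBST :: Ord a => Tree a -> Bool
isBST = checkList . inorder
```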
A somewhat ugly one using foldl':
import Data.List (foldl')
checkList :: Ord a => [a] -> Bool
checkList xs = fst $ foldl' (\(b,x1) x2 -> (b && x1 <= x2,x2)) (True,head xs) xs
Note: Using head xs is OK here because of lazy evaluation.
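For comparison, a commonly used alternative that needs neither foldl' nor head is to pair each element with its successor. This is a different technique from the fold above, shown only as a sketch.

```haskell
-- Compare every element with its successor. `drop 1` is safe on the
-- empty list, and zipWith stops at the end of the shorter list.
checkList :: Ord a => [a] -> Bool
checkList xs = and (zipWith (<=) xs (drop 1 xs))
```

For an empty or one-element list, zipWith produces no comparisons, and the and of an empty list is True.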
The usual way to do this is to make your tree foldable:
data BST a = Node (BST a) a (BST a) | Leaf
-- Use `deriving Foldable` or this instance
instance Foldable BST where
    foldMap _ Leaf = mempty
    foldMap f (Node l v r) =
        foldMap f l <> (f v <> foldMap f r)
Then you can skip conversion to a list like this. This is similar to bmk's answer, but avoids head.
-- Is this increasing? If so, what is the maximum?
data Result a = Empty | NotInc | Inc a
finalInc :: Result a -> Bool
finalInc NotInc = False
finalInc _ = True
increasing :: (Foldable f, Ord a) => f a -> Bool
increasing = finalInc . foldl' go Empty where
    go Empty y = Inc y
    go NotInc _ = NotInc
    go (Inc x) y
        | x <= y = Inc y
        | otherwise = NotInc
Warning! Warning!
The property this checks is weaker than the traditional binary search tree property, and weaker than the commonly accepted ways to weaken that property. In particular, you generally want to ensure, at least, that the root of each subtree is strictly greater than all elements of its left subtree, or that the root of each subtree is strictly less than all elements of its right subtree. These weak properties cannot be expressed in terms of the Foldable instance or conversion to a list; they must be checked directly. You can, however, use these techniques to verify the classical BST property by simply replacing <= with <.
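A sketch of the strict variant mentioned above, with <= replaced by <. The name strictlyIncreasing and the local helper ok are made up here; the BST type, the Foldable instance and Result follow the answer.

```haskell
import Data.Foldable (foldl')

data BST a = Node (BST a) a (BST a) | Leaf

instance Foldable BST where
    foldMap _ Leaf = mempty
    foldMap f (Node l v r) = foldMap f l <> (f v <> foldMap f r)

-- Is this increasing? If so, what is the maximum?
data Result a = Empty | NotInc | Inc a

-- Strict variant: an equal neighbour counts as a failure.
strictlyIncreasing :: (Foldable f, Ord a) => f a -> Bool
strictlyIncreasing = ok . foldl' go Empty
  where
    ok NotInc = False
    ok _      = True
    go Empty   y = Inc y
    go NotInc  _ = NotInc
    go (Inc x) y
        | x < y     = Inc y
        | otherwise = NotInc
```

Since the function is written against Foldable, it can be applied to a BST directly or to an ordinary list.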
A remark on space
All of the answers, including this one, have a somewhat unfortunate property: given a very left-heavy tree (e.g., Node (Node (...) 2 Leaf) 1 Leaf) they will use O(n) additional space to verify the search tree property. Is there some way to write this so it won't have any such bad cases? Unfortunately, the answer seems to be no. The classical BST property can be stated thus:
Each node must be greater than all elements of its left subtree and less than all elements of its right subtree.
The trouble is that "and". If we decide to check the left subtree first, we have to remember to check the right subtree afterwards, and vice versa.
Thus the only way to make verification efficient is to ensure that the tree is balanced.

Working with Trees in Haskell

I have this data definition for a tree:
data Tree = Leaf Int | Node Tree Int Tree
and I have to make a function, nSatisfy, to count how many items of the tree satisfy some predicate.
Here's what I've done:
nSatisfy :: (Int->Bool) -> Tree -> Int
nSatisfy _ Leaf = 0
nSatisfy y (Node left x right)
    | y x = 1 + nSatisfy y (Node left x right)
    | otherwise = nSatisfy y (Node left x right)
Is this the right way to solve this problem?
In your nSatisfy function, you should add the number of nodes satisfying the condition in both subtrees with two recursive calls. The last two lines should be like this:
    | y x = 1 + (nSatisfy y left) + (nSatisfy y right)
    | otherwise = (nSatisfy y left) + (nSatisfy y right)
This way, it will not call itself again on the same node, but only on the subtrees.
Also, if a leaf contains an integer, as is implied in the data declaration, you should make it evaluate the condition for a leaf and return 1 if it is true, instead of always returning 0.
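Combining both remarks, a complete version might look like the sketch below. The predicate is renamed to p and the shared sum is pulled into a local binding called rest; both are just choices made for this illustration.

```haskell
data Tree = Leaf Int | Node Tree Int Tree

nSatisfy :: (Int -> Bool) -> Tree -> Int
-- A leaf holds one value: count it if the predicate accepts it.
nSatisfy p (Leaf x)
    | p x       = 1
    | otherwise = 0
-- A node counts its own value plus whatever the two subtrees contribute.
nSatisfy p (Node left x right)
    | p x       = 1 + rest
    | otherwise = rest
  where
    rest = nSatisfy p left + nSatisfy p right
```

Every recursive call is made on a strictly smaller tree, so the function terminates, unlike the version in the question, which recurses on the same node.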
In addition to the main answer, I'd like to offer a slightly different approach: generalizing your problem and solving it using existing libraries.
The operation you're seeking is common to many data structures: going through all elements and performing some operation on them. Haskell defines the Foldable type class, which can be implemented by structures like yours.
First let's import some modules we'll need:
import Data.Foldable
import Data.Monoid
In order to use Foldable, we need to generalize the structure a bit, in particular parametrize its content:
data Tree a = Leaf a | Node (Tree a) a (Tree a)
In many cases this is a good idea as it separates the structure from its content and allows it to be easily reused.
Now let's define its Foldable instance. For tree-like structures it's easier to define it using foldMap, which maps each element into a monoid and then combines all values:
instance Foldable Tree where
    foldMap f (Leaf x) = f x
    foldMap f (Node lt x rt) = foldMap f lt <> f x <> foldMap f rt
This immediately gives us the whole library of functions in the Data.Foldable module, such as searching for an element, different kinds of folds, etc. While a function counting the number of values satisfying some predicate isn't defined there, we can easily define it for any Foldable. The idea is that we'll use the Sum monoid:
nSatisfy :: (Foldable f) => (a -> Bool) -> f a -> Int
nSatisfy p = getSum . foldMap (\x -> Sum $ if p x then 1 else 0)
The idea behind this function is simple: Map each value to 1 if it satisfies the predicate, otherwise to 0. And then folding with the Sum monoid just adds all values up.
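A short usage sketch of the Foldable-based version. The sample tree and the names sample, evens and shortWords are made up for illustration.

```haskell
import Data.Monoid (Sum (..))

data Tree a = Leaf a | Node (Tree a) a (Tree a)

instance Foldable Tree where
    foldMap f (Leaf x)       = f x
    foldMap f (Node lt x rt) = foldMap f lt <> f x <> foldMap f rt

nSatisfy :: Foldable f => (a -> Bool) -> f a -> Int
nSatisfy p = getSum . foldMap (\x -> Sum $ if p x then 1 else 0)

-- Hypothetical sample tree holding 2, 3, 4, 6 and 7.
sample :: Tree Int
sample = Node (Leaf 2) 3 (Node (Leaf 4) 6 (Leaf 7))

-- Counts the even values in the tree.
evens :: Int
evens = nSatisfy even sample

-- The same function works on any other Foldable, such as a list.
shortWords :: Int
shortWords = nSatisfy ((< 4) . length) ["ab", "abcd", "xyz"]
```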

Haskell 2-3-4 Tree

We've been asked to create a 2-3-4 tree in Haskell, as in write the data type, the insert function, and a display function.
I'm finding it very difficult to get information on this kind of tree, even in a language I'm comfortable with (Java, C++).
What I have so far -
data Tree t = Empty
    | Two t (Tree t)(Tree t)
    | Three t t (Tree t)(Tree t)(Tree t)
    | Four t t t (Tree t)(Tree t)(Tree t)(Tree t) deriving (Eq, Ord, Show)
leaf2 a = Two a Empty Empty
leaf3 a b = Three a b Empty Empty Empty
leaf4 a b c = Four a b c Empty Empty Empty Empty
addNode::(Ord t) => t -> Tree t -> Tree t
addNode t Empty = leaf2 t
addNode x (Two t left right)
    | x < t = Two t (addNode x left) right
    | otherwise = Two t left (addNode x right)
This compiles, but I'm not sure if it's correct, and I'm not sure how to start writing the insert into a three node or four node.
The assignment also says that "deriving show" for the display function is not enough, that it should print out the tree in the format normally seen in diagrams. Again, unsure on the way to go with this.
Any help or direction appreciated.
I know nothing about 2-3-4 trees, but for the Three node, you would start with something like this:
addNode t (Three x y left mid right)
    | cond1 = expr1
    | cond2 = expr2
    (etc)
What cond1, cond2, expr1, and expr2 are, exactly, is dependent on the definition of what a 2-3-4 tree is.
As for a show method, the general outline would be this:
instance (Show t) => Show (Tree t) where
    show Empty = ...
    show (Two x l r) = ...show x...show l...show r...
    show (Three x y l m r) = ...
    show (Four x y z l m n r) = ...
The implementation depends on how you want it to look, but for the non-Empty cases, you will probably invoke show on all of the components of the tree being shown. If you want to indent the nested parts of the tree, then perhaps you should create a separate method:
instance (Show t) => Show (Tree t) where
    show = showTree 0
showTree :: Show t => Int -> Tree t -> String
showTree n = indent . go
    where indent = (replicate n ' ' ++)
          go Empty = "Empty"
          go (Two x l r) = (...show x...showTree (n+1) l...showTree (n+1) r...)
          (etc)
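As one possible way to flesh out this outline, the sketch below prints each node's keys on one line and its children on the following lines, indented by two spaces per level. The layout and the names renderTree, go and node are assumptions; the assignment's required format is not shown, and Empty subtrees are simply left out of the output here.

```haskell
data Tree t = Empty
    | Two t (Tree t) (Tree t)
    | Three t t (Tree t) (Tree t) (Tree t)
    | Four t t t (Tree t) (Tree t) (Tree t) (Tree t)

-- Render the whole tree, one node per line.
renderTree :: Show t => Tree t -> String
renderTree = unlines . go 0

-- Lines for a subtree at the given depth; Empty contributes nothing.
go :: Show t => Int -> Tree t -> [String]
go _ Empty                = []
go n (Two x l r)          = node n [x] [l, r]
go n (Three x y l m r)    = node n [x, y] [l, m, r]
go n (Four x y z a b c d) = node n [x, y, z] [a, b, c, d]

-- One line with the node's keys, followed by the lines of its children.
node :: Show t => Int -> [t] -> [Tree t] -> [String]
node n keys kids =
    (replicate (2 * n) ' ' ++ unwords (map show keys))
        : concatMap (go (n + 1)) kids
```

Collecting the keys and children into lists before rendering keeps the three node shapes down to a single formatting function, which is close in spirit to the list-based representation suggested in the next answer.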
We've been asked to create a 2-3-4 tree
My condolences. I myself once had to implement one for homework. A 2-3-4 tree is a B-tree with all the disadvantages of the B-tree and none of the advantages, because writing the cases separately for each number of children as you do is as cumbersome as having a list of only 2-4 elements.
Point being: B-tree insertion algorithms should work, just fix the size. Cormen et al. have pseudocode for one in their book Introduction to algorithms (heavy imperativeness warning!).
It might still be better to have lists of data elements and children instead of the four-case algebraic data type, even if the type wouldn't enforce the size of the nodes then. At least it would make it easier to expand the node size.
