Nil Value for Tree a -> a in Haskell - haskell

So I have a tree defined as
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving Show
I know I can define Leaf to be Leaf a. But I really just want my nodes to have values. My problem is that when I do a search I have a return value function of type
Tree a -> a
Since leafs have no value I am confused how to say if you encounter a leaf do nothing. I tried nil, " ", ' ', [] nothing seems to work.
Edit Code
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving Show
breadthFirst :: Tree a -> [a]
breadthFirst x = _breadthFirst [x]
_breadthFirst :: [Tree a] -> [a]
_breadthFirst [] = []
_breadthFirst xs = map treeValue xs ++
_breadthFirst (concat (map immediateChildren xs))
immediateChildren :: Tree a -> [Tree a]
immediateChildren (Leaf) = []
immediateChildren (Node n left right) = [left, right]
treeValue :: Tree a -> a
treeValue (Leaf) = //this is where i need nil
treeValue (Node n left right) = n
test = breadthFirst (Node 1 (Node 2 (Node 4 Leaf Leaf) Leaf) (Node 3 Leaf (Node 5 Leaf Leaf)))
main =
do putStrLn $ show $ test

In Haskell, types do not have an "empty" or nil value by default. When you have something of type Integer, for example, you always have an actual number and never anything like nil, null or None.
Most of the time, this behavior is good. You can never run into null pointer exceptions when you don't expect them, because you can never have nulls when you don't expect them. However, sometimes we really need to have a Nothing value of some sort; your tree function is a perfect example: if we don't find a result in the tree, we have to signify that somehow.
The most obvious way to add a "null" value like this is to just wrap it in a type:
data Nullable a = Null | NotNull a
So if you want an Integer which could also be Null, you just use a Nullable Integer. You could easily add this type yourself; there's nothing special about it.
Happily, the Haskell standard library has a type like this already, just with a different name:
data Maybe a = Nothing | Just a
you can use this type in your tree function as follows:
treeValue :: Tree a -> Maybe a
treeValue (Node value _ _) = Just value
treeValue Leaf = Nothing
You can use a value wrapped in a Maybe by pattern-matching. So if you have a list of [Maybe a] and you want to get a [String] out, you could do this:
showMaybe (Just a) = show a
showMaybe Nothing = ""
myList = map showMaybe listOfMaybes
Finally, there are a bunch of useful functions defined in the Data.Maybe module. For example, there is mapMaybe which maps a list and throws out all the Nothing values. This is probably what you would want to use for your _breadthFirst function, for example.

So my solution in this case would be to use Maybe and mapMaybe. To put it simply, you'd change treeValue to
treeValue :: Tree a -> Maybe a
treeValue (Leaf) = Nothing
treeValue (Node n left right) = Just n
Then instead of using map to combine this, use mapMaybe (from Data.Maybe) which will automatically strip away the Just and ignore it if it's Nothing.
mapMaybe treeValue xs
Voila!
Maybe is Haskell's way of saying "Something might not have a value" and is just defined like this:
data Maybe a = Just a | Nothing
It's the moral equivalent of having a Nullable type. Haskell just makes you acknowledge the fact that you'll have to handle the case where it is "null". When you need them, Data.Maybe has tons of useful functions, like mapMaybe available.

In this case, you can simply use a list comprehension instead of map and get rid of treeValue:
_breadthFirst xs = [n | Node n _ _ <- xs] ++ ...
This works because using a pattern on the left hand side of <- in a list comprehension skips items that don't match the pattern.

This function treeValue :: Tree a -> a can't be a total function, because not all Tree a values actually contain an a for you to return! A Leaf is analogous to the empty list [], which is still of type [a] but doesn't actually contain an a.
The head function from the standard library has the type [a] -> a, and it also can't work all of the time:
*Main> head []
*** Exception: Prelude.head: empty list
You could write treeValue to behave similarly:
treeValue :: Tree a -> a
treeValue Leaf = error "empty tree"
treeValue (Node n _ _) = n
But this won't actually help you, because now map treeValue xs will throw an error if any of the xs are Leaf values.
Experienced Haskellers usually try to avoid using head and functions like it for this very reason. Sure, there's no way to get an a from any given [a] or Tree a, but maybe you don't need to. In your case, you're really trying to get a list of a from a list of Tree, and you're happy for Leaf to simply contribute nothing to the list rather than throw an error. But treeValue :: Tree a -> a doesn't help you build that. It's the wrong function to help you solve your problem.
A function that helps you do whatever you need would be Tree a -> Maybe a, as explained very well in some other answers. That allows you to later decide what to do about the "missing" value. When you go to use the Maybe a, if there's really nothing else to do, you can call error then (or use fromJust which does exactly that), or you can decide what else to do. But a treeValue that claims to be able to return an a from any Tree a and then calls error when it can't denies any caller the ability to decide what to do if there's no a.

Related

Accessing values in haskell custom data type

I'm very new to haskell and need to use a specific data type for a problem I am working on.
data Tree a = Leaf a | Node [Tree a]
deriving (Show, Eq)
So when I make an instance of this e.g Node[Leaf 1, Leaf2, Leaf 3] how do I access these? It won't let me use head or tail or indexing with !! .
You perform pattern matching. For example if you want the first child, you can use:
firstChild :: Tree a -> Maybe (Tree a)
firstChild (Node (h:_)) = Just h
firstChild _ = Nothing
Here we wrap the answer in a Maybe type, since it is possible that we process a Leaf x or a Node [], such that there is no first child.
Or we can for instance obtain the i-th item with:
iThChild :: Int -> Tree a -> Tree a
iThChild i (Node cs) = cs !! i
So here we unwrap the Node constructor, obtain the list of children cs, and then perform cs !! i to obtain the i-th child. Note however that (!!) :: [a] -> Int -> a is usually a bit of an anti-pattern: it is unsafe, since we have no guarantees that the list contains enough elements, and using length is an anti-pattern as well, since the list can have infinite length, so we can no do such bound check.
Usually if one writes algorithms in Haskell, one tends to make use of linear access, and write total functions: functions that always return something.

Haskell: Implementation of a Complete Binary-Leaf Tree

I'm attempting to generate a complete binary-leaf tree using Haskell.
However, I'm not sure whether I am along the right lines.
I have the following code at the moment (not sure if it's right):
data Tree a = Leaf a | Branch (Tree a) (Tree a) deriving (Show,Eq)
listToTree :: [a] -> Tree a
listToTree [] = error "The list cannot be empty"
listToTree [x] = Leaf x
I need the function to take in an input list of any basic types and construct a tree using level first order from the list provided as input. I'm not sure what the best way to do this is.
Any suggestions?

How to find the node that holds the minimum element in a binary tree in Haskell?

I'm trying to solve a similar problem (find the shortest list in a tree of lists) and I think that solving this problem would be a good start.
Given a data type like
data (Ord a, Eq a) => Tree a = Nil | Node (Tree a) a (Tree a)
How to find the node that holds the minimum element in the binary tree above?
Please not that this is not a binary search tree.
I'm trying to think recursively: The minimum is the minimum between the left, right subtrees and the current value. However, I'm struggling to translate this to Haskell code. One of the problems that I am facing is that I want to return the tree and not just the value.
Here's a hint:
You can start by defining, as an auxiliary function, a minimum between only two trees. Nodes are compared according to ther value, ignoring the subtrees, and comparing Nil to any tree t makes t the minimum (in a sense, we think of Nil as the "largest" tree). Coding this can be done by cases:
binaryMin :: Ord a => Tree a -> Tree a -> Tree a
binaryMin Nil t = t
binaryMin t Nil = t
binaryMin (Node l1 v1 r1) (Node l2 v2 r2) = ???
Then, the minimum subtree follows by recursion, exploiting binaryMin:
treeMin :: Ord a => Tree a -> Tree a
treeMin Nil = Nil
treeMin (Node left value right) = ???
where l = treeMin left
r = treeMin right
Note: class constraints on datatype declarations are no longer supported in Haskell 2010, and generally not recommended. So do this instead:
data Tree a = Nil
| Node (Tree a) a (Tree a)
Think recursively:
getMinTree :: Ord a => Tree a -> Maybe (Tree a)
getMinTree = snd <=< getMinTree'
getMinTree' :: Ord a => Tree a -> Maybe (a, Tree a)
getMinTree' Nil = ???
getMinTree' (Node Nil value Nil) = ???
getMinTree' (Node Nil value right) = ???
getMinTree' (Node left value Nil) = ???
getMin' (Node left value right) = ???
Note also: there is no need for an Eq constraint. Since Eq is a superclass of Ord, an Ord constraint implies an Eq constraint. I don't think you're even likely to use == for this.
You have the correct understanding. I think you should be fine when you can prove the following:
Can the tree with min be Nil
The tree with min probably has the min value at the root
So instead of just comparing the values, you might need to pattern-match the subtrees to get the value of the root node.
You didn't mention what the type of the function is. So, let's suppose it looks like so:
findmin :: Tree a -> Tree a
Now, suppose you already have a function that finds min of the tree. Something like:
findmin Nil = Nil -- no tree - no min
findmin (Node l x r) = case (findmin ..., findmin ...) of
-- like you said, find min of the left, and find min of the right
-- there will be a few cases of what the min is like on the left and right
-- so you can compare them to x
some case -> how to find the min in this case
some other case -> how to find min in that other case
Perhaps, you will need to know the answers to the first two questions.
It is so hard to give an answer short of giving away the actual code, since your thinking is already correct.

Finding a linear path in a traversable structure

Given a list of steps:
>>> let path = ["item1", "item2", "item3", "item4", "item5"]
And a labeled Tree:
>>> import Data.Tree
>>> let tree = Node "item1" [Node "itemA" [], Node "item2" [Node "item3" []]]
I'd like a function that goes through the steps in path matching the labels in tree until it can't go any further because there are no more labels matching the steps. Concretely, here it falls when stepping into "item4" (for my use case I still need to specify the last matched step):
>>> trav path tree
["item3", "item4", "item5"]
If I allow [String] -> Tree String -> [String] as the type of trav I could write a recursive function that steps in both structures at the same time until there are no labels to match the step. But I was wondering if a more general type could be used, specifically for Tree. For example: Foldable t => [String] -> t String -> [String]. If this is possible, how trav could be implemented?
I suspect there could be a way to do it using lens.
First, please let's use type Label = String. String is not exactly descriptive and might not be ideal in the end...
Now. To use Traversable, you need to pick a suitable Applicative that can contain the information you need for deciding what to do in its "structure". You only need to pass back information after a match has failed. That sounds like some Either!
A guess would thus be Either [Label] (t Label) as the pre-result. That would mean, we use the instantiation
traverse :: Traversable t
=> (Label -> Either [Label] Label) -> t Label -> Either [Label] (t Label)
So what can we pass as the argument function?
travPt0 :: [Label] -> Label -> Either [Label] Label
travPt0 ls#(l0 : _) label
| l0 /= label = Left ls
| otherwise = Right label ?
The problem is, traverse will then fail immediately and completely if any node has a non-matching label. Traversable doesn't actually have a notion of "selectively" diving down into a data structure, it just passes through everything, always. Actually, we only want to match on the topmost node at first, only that one is mandatory to match at first.
One way to circumvent immediate deep-traversal is to first split up the tree into a tree of sub-trees. Ok, so... we need to extract the topmost label. We need to split the tree in subtrees. Reminds you of anything?
trav' :: (Traversable t, Comonad t) => [Label] -> t Label -> [Label]
trav' (l0 : ls) tree
| top <- extract tree
= if top /= l0 then l0 : ls
else let subtrees = duplicate tree
in ... ?
Now amongst those subtrees, we're basically interested only in the one that matches. This can be determined from the result of trav': if the second element is passed right back again, we have a failure. Unlike normal nomenclature with Either, this means we wish to go on, but not use that branch! So we need to return Either [Label] ().
else case ls of
[] -> [l0]
l1:ls' -> let subtrees = duplicate tree
in case traverse (trav' ls >>> \case
(l1':_)
| l1'==l1 -> Right ()
ls'' -> Left ls''
) subtrees of
Left ls'' -> ls''
Right _ -> l0 : ls -- no matches further down.
I have not tested this code!
We'll take as reference the following recursive model
import Data.List (minimumBy)
import Data.Ord (comparing)
import Data.Tree
-- | Follows a path into a 'Tree' returning steps in the path which
-- are not contained in the 'Tree'
treeTail :: Eq a => [a] -> Tree a -> [a]
treeTail [] _ = []
treeTail (a:as) (Node a' trees)
| a == a' = minimumBy (comparing length)
$ (a:as) : map (treeTail as) trees
| otherwise = as
which suggests that the mechanism here is less that we're traversing through the tree accumulating (which is what a Traversable instance might do) but more that we're stepping through the tree according to some state and searching for the deepest path.
We can characterize this "step" by a Prism if we like.
import Control.Lens
step :: Eq a => a -> Prism' (Tree a) (Forest a)
step a =
prism' (Node a)
(\n -> if rootLabel n == a
then Just (subForest n)
else Nothing)
This would allow us to write the algorithm as
treeTail :: Eq a => [a] -> Tree a -> [a]
treeTail [] _ = []
treeTail pth#(a:as) t =
maybe (a:as)
(minimumBy (comparing length) . (pth:) . map (treeTail as))
(t ^? step a)
but I'm not sure that's significantly more clear.

QuickCheck giving up investigating a recursive data structure (rose tree.)

Given an arbitrary tree, I can construct a subtype relation over that tree, using Schubert numbering:
constructH :: Tree a -> Tree (Type a)
where Type nests the original label, and additionally provides the data needed to perform child/parent (or subtype) checks. With Schubert Numbering, the two Int parameters are sufficient for that.
data Type a where !Int -> !Int -> a -> Type a
This leads to the binary predicate
subtypeOf :: Type a -> Type a -> Bool
I now want to test with QuickCheck that this does indeed do what I want it to do. The following property, however, does not work, because QuickCheck just gives up:
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> conjoin
(forAll (elements subtypes) (\x → x `subtypeOf` t):(map subtypeSanity f))
If I leave out the recursive call to subtypeSanity, i.e. the tail of the list I'm passing to conjoin, the property runs fine, but tests just the root node of the tree! How can I descend into my data structure recursively without QuickCheck giving up on generating new test cases?
If needed, I could provide the code to construct the Schubert Hierarchy, and the Arbitrary instance for Tree (Type a), to provide a complete runnable example, but that would be quite a bit of code. I'm convinced that I'm just not "getting" QuickCheck, and using it in the wrong way here.
EDIT: unfortunately, the sized function does not seem to eliminate the problem here. It ends up with the same result (see comment to J. Abrahamson's answer.)
EDIT II: I ended up "fixing" my problem by avoiding the recursive step, and avoiding conjoin. We just make a list of all nodes in the tree, then test the single-node property (which worked fine from the beginning) on those.
allNodes ∷ Tree a → [Tree a]
allNodes n#(Node { subForest = f }) = n:(concatMap allNodes f)
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity tree = forAll (elements $ allNodes tree)
(\(Node { rootLabel = t, subForest = f }) →
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> forAll (elements subtypes) (\x → x `subtypeOf` t))
Tweaking the Arbitrary instance for trees did not work. Here is the arbitrary instance I'm still using:
instance (Arbitrary a, Eq a) ⇒ Arbitrary (Tree (Type a)) where
arbitrary = liftM (constructH) $ sized arbTree
arbTree ∷ Arbitrary a ⇒ Int → Gen (Tree a)
arbTree n = do
m ← choose (0,n)
if m == 0
then Node <$> arbitrary <*> (return [])
else do part ← randomPartition n m
Node <$> arbitrary <*> mapM arbTree part
-- this is a crude way to find a sufficiently random x1,..,xm,
-- such that x1 + .. + xm = n, for any n, m, with 0 < m.
randomPartition ∷ Int → Int → Gen [Int]
randomPartition n m' = do
let m = m' - 1
seed ← liftM ((++[n]) . sort) $ replicateM m (choose (0,n))
return $ zipWith (-) seed (0:seed)
I consider the problem "solved for now," but if someone could explain to me why the recursive step and/or conjoin made QuickCheck give up (after passing "only" 0 tests,) I would be more than grateful.
When generating Arbitrary recursive structures, QuickCheck is often a bit too eager and generates sprawling, enormous random examples. These are undesirable as they usually don't better check the properties of interest and can be very slow. Two solutions are
Use things like the size parameter (sized function) and frequency function to bias the generator toward small trees.
Use a small-type oriented generator like those in smallcheck. These try to exhaustively generate all "small" examples and thus help to keep the size of the tree down.
To clarify the sized and frequency method of controlling generation size, here's an example RoseTree
data Rose a = It a | Rose [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
arbitrary = frequency
[ (3, It <$> arbitrary) -- The 3-to-1 ratio is chosen, ah,
-- arbitrarily...
-- you'll want to tune it
, (1, Rose <$> children)
]
where children = sized $ \n -> vectorOf n arbitrary
It can be done even more simply with a different Rose formation by very carefully controlling the size of the child list
data Rose a = Rose a [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
arbitrary = Rose <$> arbitrary <*> sized (\n -> vectorOf (tuneUp n) arbitrary)
where tuneUp n = round $ fromIntegral n / 4.0
You could do this without referencing sized, but that gives the user of your Arbitrary instance a knob to ask for larger trees if needed.
In case it's useful for those stumbling across this issue: when QuickCheck "gives up", it's a sign that your pre-condition (using ==>) is too hard to satisfy.
QuickCheck uses a simple rejection sampling technique: pre-conditions have no effect on the generation of values. QuickCheck generates a bunch of random values like normal. After these are generated, they're sent through the pre-condition: if the result is True, the property is tested with that value; if it's False, that value is discarded. If your pre-condition rejects most of the values QuickCheck has generated, then QuickCheck will "give up" (better to give up completely, than to make statistically dubious pass/fail claims).
In particular, QuickCheck will not attempt to produce values which satisfy a given pre-condition. It's up to you to make sure that the generator you're using (arbitrary or otherwise) produces lots of values which pass your pre-condition.
Let's see how this is manifesting in your example:
subtypeSanity :: Tree (Type ()) -> Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> conjoin
(forAll (elements subtypes) (`subtypeOf` t):(map subtypeSanity f))
There is only one occurance of ==>, so its precondition (not $ null subtypes) must be too hard to satisfy. This is due to the recursive call map subtypeSanity f: not only are you rejecting any Tree which has an empty subForest, you're also (due to the recursion) rejecting any Tree where the subForest contains Trees with empty subForests, and rejecting any Tree where the subForest contains Trees with subForests containing Trees with empty subForests, and so on.
According to your arbitrary instance, Trees are only nested to finite depth: eventually we will always reach an empty subForest, hence your recursive precondition will always fail, and QuickCheck will give up.

Resources