Generate n-ary binary tree in haskell - haskell

I try to create random trees from this structure:
data Tree a = Leaf a
| Funktion (Tree a) (Tree a)
| Lambda (Tree a)
deriving (Show)
The problem I have is that I don't even know how to generate a tree with the depth of (for example) 2 which only has "Lambda" as nodes. If someone could help me with generating this simple tree (depth 2) I could generate them randomly.
If I implement a function like this:
build (Tree a) 0 = Leaf "A"
build (Tree a) n = build (Lambda a) (n-1)
It won't work since the function build itself expects a Tree as input. Actually I need trees which have the nodes Lambda or Funktion, but first of all I need to understand how to generate a simple version of this structure.

It sounds like you want something like
build :: Natural -- ^ Number, n, of Lambda nodes
-> a -- ^ Value, v, to store in the Leaf
-> Tree a -- ^ Lambda^n (Leaf v)
build 0 a = Leaf a
build n a = Lambda (build (n - 1) a)
So that build 4 "A" will produce
Lambda (Lambda (Lambda (Lambda (Leaf "A"))))
Unfortunately, the rest of your question (about generating random trees) really requires substantially more context to answer.

You're close - your build function at the moment will always return a Leaf, as the base case of the recursion doesn't do anything with its argument. You don't actually need the argument:
build :: Integer -> Tree String
build 0 = Leaf "A"
build n = Lambda $ build (n-1)
This would produce what your build function seems to be intending, that is a simple "tree" of given depth n composed of a Leaf node and Lambda nodes. Eg:
λ> build 2
Lambda (Lambda (Leaf "A"))
To get your randomly generated tree, you need to look at the System.Random module.
For example building on the previous, a tree of random height between a given upper and lower bound:
buildRandom :: (Integer, Integer) -> IO (Tree String)
buildRandom bounds = randomRIO bounds >>= return . build
This can be extended and modified to produce the behaviour you want (a fully random generated tree), but requires knowledge of monads etc., which might take some extra reading.

Related

Attempting to construct trees in Haskell

I am trying to use an unfold function to build trees.
Tree t = Leaf | Node (Tree t) t (Tree t)
unfoldT :: (b -> Maybe (b, a, b)) -> b -> Tree a
unfoldT f b =
case f b of
Nothing -> Leaf
Just (lt, x, rt) -> Node (unfoldT f lt) x (unfoldT f rt)
The build function needs to create a tree that has a height equal to the number provided, as well as be numbered in an in-order fashion. The base case being build 0 = Leaf and the next being build 1 = Node (Leaf 0 Leaf).
build :: Integer -> Tree Integer
My attempt at solving it:
build n = unfoldT (\x -> Just x) [0..2^n-2]
I am not entirely sure how to go about constructing the tree here.
Would love it if somebody could point me in the right direction.
Edit 1:
If I was to use a 2-tuple, what would I combine? I need to be able to refer to the current node, its left subtree and its right subtree somehow right?
If I was to use a 2-tuple, what would I combine?
I would recommend to pass the remaining depth as well as the offset from the left:
build = unfoldT level . (0,)
where
level (_, 0) = Nothing
level (o, n) = let mid = 2^(n-1)
in ((o, n-1), o+mid-1, (o+mid, n-1))
If I was to use a 2-tuple, what would I combine?
That's the key question behind the state-passing paradigm in functional programming, expressed also with the State Monad. We won't be dealing with the latter here, but maybe use the former.
But before that, do we really need to generate all the numbers in a list, and then work off that list? Don't we know in advance what are the numbers we'll be working with?
Of course we do, because the tree we're building is totally balanced and fully populated.
So if we have a function like
-- build2 (depth, startNum)
build2 :: (Int, Int) -> Tree Int
we can use it just the same to construct both halves of e.g. the build [0..14] tree:
build [0..14] == build2 (4,0) == Node (build2 (3,0)) 7 (build2 (3,8))
Right?
But if we didn't want to mess with the direct calculations of all the numbers involved, we could arrange for the aforementioned state-passing, with the twist to build2's interface:
-- depth, startNum tree, nextNum
build3 :: (Int, Int) -> (Tree Int, Int)
and use it like
build :: Int -> Tree Int -- correct!
build depth = build3 (depth, 0) -- intentionally incorrect
build3 :: (Int, Int) -> (Tree Int, Int) -- correct!
build3 (depth, start) = Node lt n rt -- intentionally incorrect
where
(lt, n) = build3 (depth-1, start) -- n is returned
(rt, m) = build3 (depth-1, n+1) -- and used, next
You will need to tweak the above to make all the pieces fit together (follow the types!), implementing the missing pieces of course and taking care of the corner / base cases.
Formulating this as an unfold would be the next step.

Searching a Value in Binary tree haskell

I have just started learning Haskell and I am trying to write a code for searching for a particular value in a binary tree and if present return true else false
This is how my tree structure looks like
data Tree = Leaf Int | Node Tree Int Tree
I am not sure how to proceed with the function to traverse through the tree and return the value. I did try BFS and DFS but I am not sure on how to return once I have got my value.
An example of how my function should look
Search 5 (Node (Node (Leaf 1) 3 (Leaf 4)) 5 (Node (Leaf 6) 7 (Leaf 9)))
A binary search could be written as follows. The type can be more generic, as we only need the items to be orderable to store / search in a binary tree.
We visit each node and either return true, or search in 1 of the child nodes.
example Node
5
/ \
3 7
lets search for 7.
We first visit the root. since 5 != 7, we test a child node. Since 7 > 5, we search in the right node, since 7 cannot appear in the left child (all values guaranteed to be lower than 5 on the left child)
If we reach a leaf, we just check if it contains the search term.
search :: Ord a => a -> BinaryTree a -> Bool
search a (Leaf b) = compare a b == EQ
search a (Node left b right)
case compare a b of
EQ -> True
LT -> search a left
GT -> search a right
I am not sure how to proceed with the function to traverse through the tree and return the value.
From that sentence, I understand you would have no problem writing a traversal yourself, but that there is a mental leap you need to take to understand how Haskell works.
You see, you never return anything in Haskell. Returning is fundamentally an imperative statement. Haskell is a declarative language, which means that writing programs is done by stating facts. That nuance can be discomforting, especially if you've been introduced to programming through learning imperative languages like C, Java, JavaScript, etc. But once you truly understand it, you will see how much more expressive and easy declarative programming is.
Because of its strong mathematical roots, in Haskell facts are stated in the form of equations, i.e. expressions where the = sign literally means the left- and right-hand side are equal (whereas in an imperative language, it would probably mean that you assign a value to a variable -- that does not exist in Haskell).
The program #Haleemur Ali wrote is in 1:1 correspondence with how you would write search using math notation:
search(x, t) = { x == y if t = Leaf y
, true if t = Node l y r and x == y
, search(x, l) if t = Node l y r and x < y
, search(x, r) if t = Node l y r and x > y
}
Indeed many times, at least as a beginner, writing Haskell is just a matter of translation, from math notation to Haskell notation. Another interpretation of Haskell programs is as proofs of theorems. Your search is a theorem saying that "if you have a tree and an integer, you can always tell if the integer is somewhere inside the tree". That's what you are telling the compiler when you write a function signature:
search :: Int -> Tree -> Bool
The compiler will only be happy if you write a proof for that theorem ... you probably guessed that the algorithm above is the proof.
An interesting observation is that the algorithm is almost dictated by the shape of the data type. Imagine you wanted to sum all the values in a tree instead:
sum(t) = { x if t = Leaf x
, x + sum(l) + sum(r) if t = Node l x r
}
Every time you want to write an algorithm over a binary tree, you will write something like the above. That is fairly mechanical and repetitive. What if later on you expand your program to deal with rose trees? Tries? You don't want to write the same algorithms and take the risk of making a mistake. One would try to come up with a function that walks down a tree and combines its values (using Haskell notation from now on):
walk :: (Int -> b) -> (b -> b -> b) -> Tree -> b
walk f g (Leaf x) = f x
walk f g (Node l x r) =
let a = walk f g l
b = walk f g r
in g (g (f x) a) b
With this function alone, you can write all manners of traversals on trees:
sum t = walk id (+) t
search x t = walk (== x) (||) t
walk is such a recurring pattern that it has been abstracted. All the data structures that expose the same pattern of recursion are said to be foldable, and the implementation is often so obvious that you can ask the compiler to write it for you, like so:
{-# LANGUAGE DeriveFoldable #-}
data Tree a = Leaf a | Node (Tree a) a (Tree a) deriving (Foldable)
There's even a definition of sum for any foldable data structure.

Accessing values in haskell custom data type

I'm very new to haskell and need to use a specific data type for a problem I am working on.
data Tree a = Leaf a | Node [Tree a]
deriving (Show, Eq)
So when I make an instance of this e.g Node[Leaf 1, Leaf2, Leaf 3] how do I access these? It won't let me use head or tail or indexing with !! .
You perform pattern matching. For example if you want the first child, you can use:
firstChild :: Tree a -> Maybe (Tree a)
firstChild (Node (h:_)) = Just h
firstChild _ = Nothing
Here we wrap the answer in a Maybe type, since it is possible that we process a Leaf x or a Node [], such that there is no first child.
Or we can for instance obtain the i-th item with:
iThChild :: Int -> Tree a -> Tree a
iThChild i (Node cs) = cs !! i
So here we unwrap the Node constructor, obtain the list of children cs, and then perform cs !! i to obtain the i-th child. Note however that (!!) :: [a] -> Int -> a is usually a bit of an anti-pattern: it is unsafe, since we have no guarantees that the list contains enough elements, and using length is an anti-pattern as well, since the list can have infinite length, so we can no do such bound check.
Usually if one writes algorithms in Haskell, one tends to make use of linear access, and write total functions: functions that always return something.

Haskell Defining a Binary Tree

I want to define an infinite tree in Haskell using infinitree :: Tree, but want to set a pattern up for each node, defining what each node should be. The pattern is 1 more then then its parent. I am struggling on how to set up a tree to begin with, and how and where to define the pattern of each node?
Thank you
Infinite data structures can generally be defined by functions which call themselves but have no base case. Usually these functions don't need to pattern match on their arguments. For example, a list equal to [1..] can be written as
infiniteList :: [Int]
infiniteList = go 1 where
go n = n : go (n+1)
You can use the exact same technique for a tree:
data Tree a = Node (Tree a) a (Tree a) | Nil deriving (Show)
infiniteTree :: Tree Int
infiniteTree = go 1 where
go n = Node (go (2*n)) n (go (2*n+1))
This defines the infinite tree
1
/ \
2 3
/ \ / \
4 5 6 7
...
A type for infinite binary trees with no leaves:
data Tree a = Tree (Tree a) a (Tree a)
One general pattern for doing this sort of thing is called unfold. For this particular type:
unfold :: (a -> (a,b,a)) -> a -> Tree b
Can you see how to define this function and use it for your purpose?

QuickCheck giving up investigating a recursive data structure (rose tree.)

Given an arbitrary tree, I can construct a subtype relation over that tree, using Schubert numbering:
constructH :: Tree a -> Tree (Type a)
where Type nests the original label, and additionally provides the data needed to perform child/parent (or subtype) checks. With Schubert Numbering, the two Int parameters are sufficient for that.
data Type a where !Int -> !Int -> a -> Type a
This leads to the binary predicate
subtypeOf :: Type a -> Type a -> Bool
I now want to test with QuickCheck that this does indeed do what I want it to do. The following property, however, does not work, because QuickCheck just gives up:
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> conjoin
(forAll (elements subtypes) (\x → x `subtypeOf` t):(map subtypeSanity f))
If I leave out the recursive call to subtypeSanity, i.e. the tail of the list I'm passing to conjoin, the property runs fine, but tests just the root node of the tree! How can I descend into my data structure recursively without QuickCheck giving up on generating new test cases?
If needed, I could provide the code to construct the Schubert Hierarchy, and the Arbitrary instance for Tree (Type a), to provide a complete runnable example, but that would be quite a bit of code. I'm convinced that I'm just not "getting" QuickCheck, and using it in the wrong way here.
EDIT: unfortunately, the sized function does not seem to eliminate the problem here. It ends up with the same result (see comment to J. Abrahamson's answer.)
EDIT II: I ended up "fixing" my problem by avoiding the recursive step, and avoiding conjoin. We just make a list of all nodes in the tree, then test the single-node property (which worked fine from the beginning) on those.
allNodes ∷ Tree a → [Tree a]
allNodes n#(Node { subForest = f }) = n:(concatMap allNodes f)
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity tree = forAll (elements $ allNodes tree)
(\(Node { rootLabel = t, subForest = f }) →
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> forAll (elements subtypes) (\x → x `subtypeOf` t))
Tweaking the Arbitrary instance for trees did not work. Here is the arbitrary instance I'm still using:
instance (Arbitrary a, Eq a) ⇒ Arbitrary (Tree (Type a)) where
arbitrary = liftM (constructH) $ sized arbTree
arbTree ∷ Arbitrary a ⇒ Int → Gen (Tree a)
arbTree n = do
m ← choose (0,n)
if m == 0
then Node <$> arbitrary <*> (return [])
else do part ← randomPartition n m
Node <$> arbitrary <*> mapM arbTree part
-- this is a crude way to find a sufficiently random x1,..,xm,
-- such that x1 + .. + xm = n, for any n, m, with 0 < m.
randomPartition ∷ Int → Int → Gen [Int]
randomPartition n m' = do
let m = m' - 1
seed ← liftM ((++[n]) . sort) $ replicateM m (choose (0,n))
return $ zipWith (-) seed (0:seed)
I consider the problem "solved for now," but if someone could explain to me why the recursive step and/or conjoin made QuickCheck give up (after passing "only" 0 tests,) I would be more than grateful.
When generating Arbitrary recursive structures, QuickCheck is often a bit too eager and generates sprawling, enormous random examples. These are undesirable as they usually don't better check the properties of interest and can be very slow. Two solutions are
Use things like the size parameter (sized function) and frequency function to bias the generator toward small trees.
Use a small-type oriented generator like those in smallcheck. These try to exhaustively generate all "small" examples and thus help to keep the size of the tree down.
To clarify the sized and frequency method of controlling generation size, here's an example RoseTree
data Rose a = It a | Rose [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
arbitrary = frequency
[ (3, It <$> arbitrary) -- The 3-to-1 ratio is chosen, ah,
-- arbitrarily...
-- you'll want to tune it
, (1, Rose <$> children)
]
where children = sized $ \n -> vectorOf n arbitrary
It can be done even more simply with a different Rose formation by very carefully controlling the size of the child list
data Rose a = Rose a [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
arbitrary = Rose <$> arbitrary <*> sized (\n -> vectorOf (tuneUp n) arbitrary)
where tuneUp n = round $ fromIntegral n / 4.0
You could do this without referencing sized, but that gives the user of your Arbitrary instance a knob to ask for larger trees if needed.
In case it's useful for those stumbling across this issue: when QuickCheck "gives up", it's a sign that your pre-condition (using ==>) is too hard to satisfy.
QuickCheck uses a simple rejection sampling technique: pre-conditions have no effect on the generation of values. QuickCheck generates a bunch of random values like normal. After these are generated, they're sent through the pre-condition: if the result is True, the property is tested with that value; if it's False, that value is discarded. If your pre-condition rejects most of the values QuickCheck has generated, then QuickCheck will "give up" (better to give up completely, than to make statistically dubious pass/fail claims).
In particular, QuickCheck will not attempt to produce values which satisfy a given pre-condition. It's up to you to make sure that the generator you're using (arbitrary or otherwise) produces lots of values which pass your pre-condition.
Let's see how this is manifesting in your example:
subtypeSanity :: Tree (Type ()) -> Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> conjoin
(forAll (elements subtypes) (`subtypeOf` t):(map subtypeSanity f))
There is only one occurance of ==>, so its precondition (not $ null subtypes) must be too hard to satisfy. This is due to the recursive call map subtypeSanity f: not only are you rejecting any Tree which has an empty subForest, you're also (due to the recursion) rejecting any Tree where the subForest contains Trees with empty subForests, and rejecting any Tree where the subForest contains Trees with subForests containing Trees with empty subForests, and so on.
According to your arbitrary instance, Trees are only nested to finite depth: eventually we will always reach an empty subForest, hence your recursive precondition will always fail, and QuickCheck will give up.

Resources