Should recursive function call terminate or is its implementation faulty? - haskell

I'm working my way through Haskell Programming From First Principles and am stuck proving that I've correctly answered question 1 of the Finally something other than a list exercise subsection. The following function invocation doesn't terminate and I can't tell if that's expected or if my implementation is faulty.
data BinaryTree a = Leaf
| Node (BinaryTree a) a (BinaryTree a)
deriving (Eq, Ord, Show)
unfold :: (a -> Maybe (a, b, a)) -> a -> BinaryTree b
unfold func seed =
case (func seed) of
Just (l, m, r) -> Node (unfold func l) m (unfold func r)
Nothing -> Leaf
I would think that each a/x would eventually be decremented/incremented until it causes the if to take the True branch and return a Nothing/Leaf.
func = (\x -> if (x < 0 || x > 10)
then Nothing
else Just (x - 1, x, x + 1))
unfold func 5
-- Node (Node (Node (Node (Node (Node Leaf 0 ... ad infinitum

The reason the program never terminates can be understood if you try to invoke func outside unfold. Here, I renamed func to unfolder, because the Haskell linter didn't like that it had the same name as the func argument.
Close to the boundaries, unfolder returns these values:
*So> unfolder 1
Just (0,1,2)
*So> unfolder 0
Just (-1,0,1)
*So> unfolder (-1)
Nothing
For unfolder 0, the left branch clearly evaluates to Nothing the next time around, but the right branch is unfolder 1, which evaluates to Just (0,1,2), and so on ad infinitum.

Your implementation generates infinitely many calls to unfold.
Consider the first recursive call, where your initial seed value of 5 is split into 4 and 6 for the left and right subtree. There you split this new seed into 3 and 5 on the left and 5 and 7 on the right, hence you define an infinitely large tree.

Related

Building a Binary Tree (not BST) in Haskell Breadth-First

I recently started using Haskell and it will probably be for a short while. Just being asked to use it to better understand functional programming for a class I am taking at Uni.
Now I have a slight problem I am currently facing with what I am trying to do. I want to build it breadth-first but I think I got my conditions messed up or my conditions are also just wrong.
So essentially if I give it
[“A1-Gate”, “North-Region”, “South-Region”, “Convention Center”, “Rectorate”, “Academic Building1”, “Academic Building2”] and [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2], my tree should come out like
But my test run results are haha not what I expected. So an extra sharp expert in Haskell could possibly help me spot what I am doing wrong.
Output:
*Main> l1 = ["A1-Gate", "North-Region", "South-Region", "Convention Center",
"Rectorate", "Academic Building1", "Academic Building2"]
*Main> l3 = [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2]
*Main> parkingtree = createBinaryParkingTree l1 l3
*Main> parkingtree
Node "North-Region" 0.5
(Node "A1-Gate" 0.0 EmptyTree EmptyTree)
(Node "Convention Center" 0.3
(Node "South-Region" 0.7 EmptyTree EmptyTree)
(Node "Academic Building2" 1.4
(Node "Academic Building1" 1.2 EmptyTree EmptyTree)
(Node "Rectorate" 0.6 EmptyTree EmptyTree)))
A-1 Gate should be the root but it ends up being a child with no children so pretty messed up conditions.
If I could get some guidance it would help. Below is what I've written so far::
data Tree = EmptyTree | Node [Char] Float Tree Tree deriving (Show,Eq,Ord)
insertElement location cost EmptyTree =
Node location cost EmptyTree EmptyTree
insertElement newlocation newcost (Node location cost left right) =
if (left == EmptyTree && right == EmptyTree)
then Node location cost (insertElement newlocation newcost EmptyTree)
right
else if (left == EmptyTree && right /= EmptyTree)
then Node location cost (insertElement newlocation newcost EmptyTree)
right
else if (left /= EmptyTree && right == EmptyTree)
then Node location cost left
(insertElement newlocation newcost EmptyTree)
else Node newlocation newcost EmptyTree
(Node location cost left right)
buildBPT [] = EmptyTree
--buildBPT (xs:[]) = insertElement (fst xs) (snd xs) (buildBPT [])
buildBPT (x:xs) = insertElement (fst x) (snd x) (buildBPT xs)
createBinaryParkingTree a b = buildBPT (zip a b)
Thank you for any guidance that might be provided. Yes I have looked at some of the similar questions I do think my problem is different but if you think a certain post has a clear answer that will help I am willing to go and take a look at it.
Here's a corecursive solution.
{-# bft(Xs,T) :- bft( Xs, [T|Q], Q). % if you don't read Prolog, see (*)
bft( [], Nodes , []) :- maplist( =(empty), Nodes).
bft( [X|Xs], [N|Nodes], [L,R|Q]) :- N = node(X,L,R),
bft( Xs, Nodes, Q).
#-}
data Tree a = Empty | Node a (Tree a) (Tree a) deriving Show
bft :: [a] -> Tree a
bft xs = head nodes -- Breadth First Tree
where
nodes = zipWith g (map Just xs ++ repeat Nothing) -- values and
-- Empty leaves...
(pairs $ tail nodes) -- branches...
g (Just x) (lt,rt) = Node x lt rt
g Nothing _ = Empty
pairs ~(a: ~(b:c)) = (a,b) : pairs c
{-
nodes!!0 = g (Just (xs!!0)) (nodes!!1, nodes!!2) .
nodes!!1 = g (Just (xs!!1)) (nodes!!3, nodes!!4) . .
nodes!!2 = g (Just (xs!!2)) (nodes!!5, nodes!!6) . . . .
................ .................
-}
nodes is the breadth-first enumeration of all the subtrees of the result tree. The tree itself is the top subtree, i.e., the first in this list. We create Nodes from each x in the input xs, and when the input
is exhausted we create Emptys by using an indefinite number of Nothings instead (the Empty leaves' true length is length xs + 1 but we don't need to care about that).
And we didn't have to count at all.
Testing:
> bft [1..4]
Node 1 (Node 2 (Node 4 Empty Empty) Empty) (Node 3 Empty Empty)
> bft [1..10]
Node 1
(Node 2
(Node 4
(Node 8 Empty Empty)
(Node 9 Empty Empty))
(Node 5
(Node 10 Empty Empty)
Empty))
(Node 3
(Node 6 Empty Empty)
(Node 7 Empty Empty))
How does it work: the key is g's laziness, that it doesn't force lt's nor rt's value, while the tuple structure is readily served by -- very lazy in its own right -- pairs. So both are just like the not-yet-set variables in that Prolog pseudocode(*), when served as 2nd and 3rd arguments to g. But then, for the next x in xs, the node referred to by this lt becomes the next invocation of g's result.
And then it's rt's turn, etc. And when xs end, and we hit the Nothings, g stops pulling the values from pairs's output altogether. So pairs stops advancing on the nodes too, which is thus never finished though it's defined as an unending stream of Emptys past that point, just to be on the safe side.
(*) Prolog's variables are explicitly set-once: they are allowed to be in a not-yet-assigned state. Haskell's (x:xs) is Prolog's [X | Xs].
The pseudocode: maintain a queue; enqueue "unassigned pointer"; for each x in xs: { set pointer in current head of the queue to Node(x, lt, rt) where lt, rt are unassigned pointers; enqueue lt; enqueue rt; pop queue }; set all pointers remaining in queue to Empty; find resulting tree in the original head of the queue, i.e. the original first "unassigned pointer" (or "empty box" instead of "unassigned pointer" is another option).
This Prolog's "queue" is of course fully persistent: "popping" does not mutate any data structure and doesn't change any outstanding references to the queue's former head -- it just advances the current pointer into the queue. So what's left in the wake of all this queuing, is the bfs-enumeration of the built tree's nodes, with the tree itself its head element -- the tree is its top node, with the two children fully instantiated to the bottom leaves by the time the enumeration is done.
Update: #dfeuer came up with much simplified version of it which is much closer to the Prolog original (that one in the comment at the top of the post), that can be much clearer. Look for more efficient code and discussion and stuff in his post. Using the simple [] instead of dfeuer's use of the more efficient infinite stream type data IS a = a :+ IS a for the sub-trees queue, it becomes
bftree :: [a] -> Tree a
bftree xs = t
where
t : q = go xs q
go [] _ = repeat Empty
go (x:ys) ~(l : ~(r : q)) = Node x l r : go ys q
---READ-- ----READ---- ---WRITE---
{-
xs = [ x x2 x3 x4 x5 x6 x7 x8 … ]
(t:q) = [ t l r ll lr rl rr llr … Empty Empty … … ]
-}
For comparison, the opposite operation of breadth-first enumeration of a tree is
bflist :: Tree a -> [a]
bflist t = [x | Node x _ _ <- q]
where
q = t : go 1 q
go 0 _ = []
go i (Empty : q) = go (i-1) q
go i (Node _ l r : q) = l : r : go (i+1) q
-----READ------ --WRITE--
How does bftree work: t : q is the list of the tree's sub-trees in breadth-first order. A particular invocation of go (x:ys) uses l and r before they are defined by subsequent invocations of go, either with another x further down the ys, or by go [] which always returns Empty. The result t is the very first in this list, the topmost node of the tree, i.e. the tree itself.
This list of tree nodes is created by the recursive invocations of go at the same speed with which the input list of values xs is consumed, but is consumed as the input to go at twice that speed, because each node has two child nodes.
These extra nodes thus must also be defined, as Empty leaves. We don't care how many are needed and simply create an infinite list of them to fulfill any need, although the actual number of empty leaves will be one more than there were xs.
This is actually the same scheme as used in computer science for decades for array-backed trees where tree nodes are placed in breadth-first order in a linear array. Curiously, in such setting both conversions are a no-op -- only our interpretation of the same data is what's changing, our handling of it, how are we interacting with / using it.
Update: the below solution is big-O optimal and (I think) pretty easy to understand, so I'm leaving it here in case anyone's interested. However, Will Ness's solution is much more beautiful and, especially when optimized a bit, can be expected to perform better in practice. It is much more worthy of study!
I'm going to ignore the fake edge labels for now and just focus on the core of what's happening.
A common pattern in algorithm design is that it's sometimes easier to solve a more general problem. So instead of trying to build a tree, I'm going to look at how to build a forest (a list of trees) with a given number of trees. I'll make the node labels polymorphic to avoid having to think about what they look like; you can of course use the same building technique with your original tree type.
data Tree a = Empty | Node a (Tree a) (Tree a)
-- Built a tree from a breadth-first list
bft :: [a] -> Tree a
bft xs = case dff 1 xs of
[] -> Empty
[t] -> t
_ -> error "something went wrong"
-- Build a forest of nonempty trees.
-- The given number indicates the (maximum)
-- number of trees to build.
bff :: Int -> [a] -> [Tree a]
bff _ [] = []
bff n xs = case splitAt n xs of
(front, rear) -> combine front (bff (2 * n) rear)
where
combine :: [a] -> [Tree a] -> [Tree a]
-- you write this
Here's a full, industrial-strength, maximally lazy implementation. This is the most efficient version I've been able to come up with that's as lazy as possible. A slight variant is less lazy but still works for fully-defined infinite inputs; I haven't tried to test which would be faster in practice.
bft' :: [a] -> Tree a
bft' xs = case bff 1 xs of
[] -> Empty
[t] -> t
_ -> error "whoops"
bff' :: Int -> [a] -> [Tree a]
bff' !_ [] = []
bff' n xs = combine n xs (bff (2 * n) (drop n xs))
where
-- The "take" portion of the splitAt in the original
-- bff is integrated into this version of combine. That
-- lets us avoid allocating an intermediate list we don't
-- really need.
combine :: Int -> [a] -> [Tree a] -> [Tree a]
combine 0 !_ ~[] = [] -- These two lazy patterns are just documentation
combine _k [] ~[] = []
combine k (y : ys) ts = Node y l r : combine (k - 1) ys dropped
where
(l, ~(r, dropped)) = case ts of -- This lazy pattern matters.
[] -> (Empty, (Empty, []))
t1 : ts' -> (t1, case ts' of
[] -> (Empty, [])
t2 : ts'' -> (t2, ts''))
For the less-lazy variant, replace (!l, ~(!r, dropped)) with (!l, !r, dropped) and adjust the RHS accordingly.
For true industrial strength, forests should be represented using lists strict in their elements:
data SL a = Cons !a (SL a) | Nil
And the pairs in the above (l, ~(r, dropped)) should both be represented using a type like
data LSP a b = LSP !a b
This should avoid some (pretty cheap) run-time checks. More importantly, it makes it easier to see where things are and aren't getting forced.
The method that you appear to have chosen is to build the tree up backwards: from bottom-to-top, right-to-left; starting from the last element of your list. This makes your buildBPT function look nice, but requires your insertElement to be overly complex. To construct a binary tree in a breadth-first fashion this way would require some difficult pivots at every step past the first three.
Adding 8 nodes to the tree would require the following steps (see how the nodes are inserted from last to first):
. 4
6 6
8 7 8 . .
. .
3
7 4 5
8 . 6 7 8 .
6 2
7 8 3 4
5 6 7 8
5
6 7 1
8 . . . 2 3
4 5 6 7
8 . . . . . . .
If, instead, you insert the nodes left-to-right, top-to-bottom, you end up with a much simpler solution, requiring no pivoting, but instead some tree structure introspection. See the insertion order; at all times, the existing values remain where they were:
. 1
2 3
1 4 5 . .
. .
1
1 2 3
2 . 4 5 6 .
1 1
2 3 2 3
4 5 6 7
1
2 3 1
4 . . . 2 3
4 5 6 7
8 . . . . . . .
The insertion step has an asymptotic time complexity on the order of O(n^2) where n is the number of nodes to insert, as you are inserting the nodes one-by-one, and then iterating the nodes already present in the tree.
As we insert left-to-right, the trick is to check whether the left sub-tree is complete:
if it is, and the right sub-tree is not complete, then recurse to the right.
if it is, and the right sub-tree is also complete, then recurse to the left (starting a new row).
if it is not, then recurse to the left.
Here is my (more generic) solution:
data Tree a = Leaf | Node a (Tree a) (Tree a)
deriving (Eq, Show)
main = do
let l1 = ["A1-Gate", "North-Region", "South-Region", "Convention Center",
"Rectorate", "Academic Building1", "Academic Building2"]
let l2 = [0.0, 0.5, 0.7, 0.3, 0.6, 1.2, 1.4, 1.2]
print $ treeFromList $ zip l1 l2
mkNode :: a -> Tree a
mkNode x = Node x Leaf Leaf
insertValue :: Tree a -> a -> Tree a
insertValue Leaf y = mkNode y
insertValue (Node x left right) y
| isComplete left && nodeCount left /= nodeCount right = Node x left (insertValue right y)
| otherwise = Node x (insertValue left y) right
where nodeCount Leaf = 0
nodeCount (Node _ left right) = 1 + nodeCount left + nodeCount right
depth Leaf = 0
depth (Node _ left right) = 1 + max (depth left) (depth right)
isComplete n = nodeCount n == 2 ^ (depth n) - 1
treeFromList :: (Show a) => [a] -> Tree a
treeFromList = foldl insertValue Leaf
EDIT: more detailed explanation:
The idea is to remember in what order you insert nodes: left-to-right first, then top-to-bottom. I compressed the different cases in the actual function, but you can expand them into three:
Is the left side complete? If not, then insert to the left side.
Is the right side as complete as the left side, which is complete? If not, then insert to the right side.
Both sides are full, so we start a new level by inserting to the left side.
Because the function fills the nodes up from left-to-right and top-to-bottom, then we always know (it's an invariant) that the left side must fill up before the right side, and that the left side can never be more than one level deeper than the right side (nor can it be shallower than the right side).
By following the growth of the second set of example trees, you can see how the values are inserted following this invariant. This is enough to describe the process recursively, so it extrapolates to a list of any size (the recursion is the magic).
Now, how do we determine whether a tree is 'complete'? Well, it is complete if it is perfectly balanced, or if – visually – its values form a triangle. As we are working with binary trees, then the base of the triangle (when filled) must have a number of values equal to a power of two. More specifically, it must have 2^(depth-1) values. Count for yourself in the examples:
depth = 1 -> base = 1: 2^(1-1) = 1
depth = 2 -> base = 2: 2^(2-1) = 2
depth = 3 -> base = 4: 2^(3-1) = 4
depth = 4 -> base = 8: 2^(4-1) = 8
The total number of nodes above the base is one less than the width of the base: 2^(n-1) - 1. The total number of nodes in the complete tree is therefore the number of nodes above the base, plus those of the base, so:
num nodes in complete tree = 2^(depth-1) - 1 + 2^(depth-1)
= 2 × 2^(depth-1) - 1
= 2^depth - 1
So now we can say that a tree is complete if it has exactly 2^depth - 1 non-empty nodes in it.
Because we go left-to-right, top-to-bottom, when the left side is complete, we move to the right, and when the right side is just as complete as the left side (meaning that it has the same number of nodes, which is means that it is also complete because of the invariant), then we know that the whole tree is complete, and therefore a new row must be added.
I originally had three special cases in there: when both nodes are empty, when the left node is empty (and therefore so was the right) and when the right node is empty (and therefore the left could not be). These three special cases are superseded by the final case with the guards:
If both sides are empty, then countNodes left == countNodes right, so therefore we add another row (to the left).
If the left side is empty, then both sides are empty (see previous point).
If the right side is empty, then the left side must have depth 1 and node count 1, meaning that it is complete, and 1 /= 0, so we add to the right side.

Determine if binary tree is BST haskell

I'm trying to write a bool function to return True if a binary tree is a bst using recursion, and I need a little guidance on haskell syntax.
I understand that for a binary tree to be a bst, the left subtree must always contain only nodes less than the head. and the right subtree must always contain only nodes greater than the head. I was structuring my function as such:
isBST :: Tree -> Bool --recieve Tree, return bool
isBST (Lead i) = True --return true if its only one leaf in tree
isBST (Node h l r) = if (((isBST l) < h) && ((isBST r) > h)) then True else False
--return true if left subtree < head AND right subtree > head
But this code results in the error:
Couldn't match expected type ‘Bool’ with actual type ‘Int’
Referring to the < h and > h parts specifically. Is it something wrong with my haskell formatting? Thanks in advance
Is it something wrong with my haskell formatting?
No, it is a semantical error. You write:
(isBST l) < h
So this means you ask Haskell to determine whether l is a binary search tree, which is True or False, but you can not compare True or False with h. Even if you could (some languages see True as 1 and False as 0), then it would still be incorrect, since we want to know whether all nodes in the left subtree are less than h.
So we will somehow need to define bounds. A way to do this is to pass parameters through the recursion and perform checks. A problem with this is that the root of the tree for example, has no bounds. We can fix this by using a Maybe Int is a boundary: if it is Nothing, the boundary is "inactive" so to speak, if it is Just b, then the boundary is "active" with value b.
In order to make this check more convenient, we can first write a way to check this:
checkBound :: (a -> a -> Bool) -> Maybe a -> a -> Bool
checkBound _ Nothing _ = True
checkBound f (Just b) x = f b x
So now we can make a "sandwich check" with:
sandwich :: Ord a => Maybe a -> Maybe a -> a -> Bool
sandwich low upp x = checkBound (<) low x && checkBound (>) upp x
So sandwich is given a lowerbound and an upperbound (both Maybe as), and a value, and checks the lower and upper bounds.
So we can write a function isBST' with:
isBST' :: Maybe Int -> Maybe Int -> Tree -> Bool
isBST' low upp ... = ....
There are two cases we need to take into account: the Leaf x case, in which the "sandwich constraint" should be satisfied, and the Node h l r case in which h should satisfy the "sandwich constraint" and furthermore l and r should satsify different sandwhich constraints. For the Leaf x it is thus like:
isBST' low upp (Leaf x) = sandwich low upp x
For the node case, we first check the same constraint, and then enforce a sandwich between low and h for the left part l, and a sandwich between h and upp for the right part r, so:
isBST' low upp (Node h l r) = sandwich low upp h &&
isBST' low jh l &&
isBST' jh upp r
where jh = Just h
Now the only problem we still have is to call isBST' with the root element: here we use Nothing as intial bounds, so:
isBST :: Tree -> Bool
isBST = isBST' Nothing Nothing
There are of course other ways to enforce constraints, like passing and updating functions, or by implement four variants of the isBST' function that check a subset of the constraints.
Martin, I'd recommend you to look at Willem's answer.
Another thing, you could also use your maxInt function that you asked in a previous question to define this function:
isBST (Node h l r) = ... (maxInt l) ... -- at some point we will need to use this
Taking your definition of BSTs:
I understand that for a binary tree to be a bst, the left subtree must
always contain only nodes less than the head. and the right subtree
must always contain only nodes greater than the head.
I'll add that also the subtrees of a node should be BSTs as well.
So we can define this requirement with:
isBST (Node h l r) =
((maxInt l) < h) -- the left subtree must contain nodes less than the head
&& ((minInt r) > h) -- the right must contain nodes greater than the head
&& (...) -- the left subtree should be a BST
&& (...) -- the right subtree should be a BST
Recall that you might need to define minInt :: Tree -> Int, as you probably know how to do that.
I like Willem Van Onsem's pedagogical approach in his answer.
I was going to delete my answer, but am going to post a "correction" instead, at the risk of being wrong again:
data Tree = Empty | Node Int Tree Tree deriving show
isBST :: Tree -> Bool
isBST Empty = True
isBST (Node h l r) = f (<=h) l && f (>=h) r && isBST l && isBST r
where
f _ Empty = True
f c (Node h l r) = c h && f c l && f c r
Note that I'm using Wikipedia's definition of BST, that
the key in each node must be greater than or equal to any key stored
in the left sub-tree, and less than or equal to any key stored in the
right sub-tree.

Reconstructing Huffman tree from (preorder) bitstring in Haskell

I have the following Haskell polymorphic data type:
data Tree a = Leaf Int a | Node Int (Tree a) (Tree a)
The tree will be compressed in a bitstring of 0s and 1s. A '0' signifies a Node and it is followed by the encoding of the left subtree, then the encoding of the right subtree. A '1' signifies a Leaf and is followed by 7 bits of information (for example it might be a char). Each node/leaf is supposed to also contain the frequency of the information stored, but this is not important for this problem (so we can put anything there).
For example, starting from this encoded tree
[0,0,0,1,1,1,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,
1,0,0,1,1,1,1,1,1,1,0,0,1,0,0,1,1,1,1,0,1,1,1,1,1,1,0,0,0,0,1]
it is supposed to give back something like this
Node 0 (Node 0 (Node 0 (Leaf 0 'k') (Leaf 0 't'))
(Node 0 (Node 0 (Leaf 0 'q') (Leaf 0 'g')) (Leaf 0 'r')))
(Node 0 (Leaf 0 'w') (Leaf 0 'a'))
(spacing is not important, but it did not fit on one line).
I have little experience working with trees, especially when implementing code. I have a vague idea about how I'd solve this on paper (using something similar to a stack to deal with the depth/levels) but I am still a bit lost.
Any help or ideas are appreciated!
Well, you're trying to parse a tree of bytes from a bit-stream. Parsing's one of those cases where it pays to set up some structure: we're going to write a miniature parser combinator library in the style of How to Replace Failure by a List of Successes, which will allow us to write our code in an idiomatic functional style and delegate a lot of the work to the machine.
Translating the old rhyme into the language of monad transformers, and reading "string" as "bit-string", we have
newtype Parser a = Parser (StateT [Bool] [] a)
deriving (Functor, Applicative, Monad, Alternative)
runParser :: Parser a -> [Bool] -> [(a, [Bool])]
runParser (Parser m) = runStateT m
A parser is a monadic computation which operates statefully on a stream of Booleans, yielding a collection of successfully-parsed as. GHC's GeneralizedNewtypeDeriving superpowers allow me to elide the boilerplate instances of Monad et al.
The goal, then, is to write a Parser (Tree SevenBits) - a parser which returns a tree of septuples of Booleans. (You can turn the 7 bits into a Word8 at your leisure by deriving a Functor instance for Tree and using fmap.) I'm going to use the following definition of Tree because it's simpler - I'm sure you can figure out how to adapt this code to your own ends.
data Tree a = Leaf a | Node (Tree a) (Tree a) deriving Show
type SevenBits = (Bool, Bool, Bool, Bool, Bool, Bool, Bool)
Here's a parser that attempts to consume a single bit from the input stream, failing if it's empty:
one :: Parser Bool
one = Parser $ do
stream <- get
case stream of
[] -> empty
(x:xs) -> put xs *> return x
Here's one which attempts to consume a particular bit from the input stream, failing if it doesn't match:
bit :: Bool -> Parser ()
bit b = do
i <- one
guard (i == b)
Here I'm pulling a sequence of seven Booleans from the input stream using replicateM and packing them into a tuple. We'll be using this to populate Leaf nodes' contents.
sevenBits :: Parser SevenBits
sevenBits = pack7 <$> replicateM 7 one
where pack7 [a,b,c,d,e,f,g] = (a, b, c, d, e, f, g)
Now we can finally write the code which parses the tree structure itself. We'll be choosing between the Node and Leaf alternatives using <|>.
tree :: Parser (Tree SevenBits)
tree = node <|> leaf
where node = bit False *> liftA2 Node tree tree
leaf = bit True *> fmap Leaf sevenBits
If node succeeds in parsing a low bit from the head of the stream, it continues to recursively parse the encoding of the left subtree followed by the right subtree, sequencing the applicative actions with liftA2. The trick is that node fails if it doesn't encounter a low bit at the head of the input stream, which tells <|> to give up on node and try leaf instead.
Note how the structure of tree reflects the structure of the Tree type itself. This is applicative parsing at work. We could alternately have structured this parser monadically, first using one to parse an arbitrary bit and then using a case analysis on the bit to determine whether we should continue to parse a pair of trees or a leaf. In my opinion this version is simpler, more declarative, and less verbose.
Also compare the clarity of this code to the low-level style of #behzad.nouri's foldr-based solution. Rather than building an explicit finite-state machine which switches between parsing nodes and leaves - an imperative-flavoured idea - my design allows you to declaratively describe the grammar to the machine using standard functions like liftA2 and <|> and trust that the abstractions will do the right thing.
Anyway, here I'm parsing a simple tree consisting of a pair of Leafs containing the (binary-encoded) numbers 0 and 1. As you can see, it returns the single successful parse and an empty stream of remaining bits.
ghci> runParser tree $ map (>0) [0, 1, 0,0,0,0,0,0,0, 1, 0,0,0,0,0,0,1]
[(Node (Leaf (False, False, False, False, False, False, False)) (Leaf (False, False, False, False, False, False, True)),[])]
Ok, here's a simple (ad-hoc, but easier to understand) way.
We need to buid a function parse, with the following type:
parse :: [Int] -> Tree Char
The approach you mentioned, with stacks, is the imperative one. Here we just lay on the recursive calls. The stack will be built by the compiler and it will just have each recursive call stored in it (At least you can imagine it that way, if you want, or just ignore all this paragraph).
So, the idea is the following: whenever you find a 0, you need to make two recursive calls to the algorithm. The first recursive call will read one branch (the left one) of the tree. The second one needs to be called with the rest of the list as argument. The rest left by the first recursive call. So, we need a auxiliar function parse' with the following type (now we return a pair, being the second value the rest of list):
parse' :: [Int] -> (Tree Char, [Int])
Next, you can see a piece of code where the 0 case is just as described before.
For the 1 case, we just need to take the next 7 numbers and make them into a char somehow (I leave the definition of toChar for you), then, just return a Leaf and the rest of the list.
parse' (0:xs) = let (l, xs') = parse' xs
(r, xs'') = parse' xs' in (Node 0 l r, xs'') --xs'' should be []
parse' (1:xs) = let w = toChar (take 7 xs) in (Leaf 0 w , drop 7 xs)
Finally, our parse function just calls the auxiliary parse one and returns the first element of the pair.
parse xs = fst $ parse' xs
do a right fold:
import Data.Char (chr)
data Tree a = Leaf a | Node (Tree a) (Tree a)
deriving Show
build :: [Int] -> [Tree Char]
build xs = foldr go (\_ _ -> []) xs 0 0
where
nil = Leaf '?'
go 0 run 0 0 = case run 0 0 of
[] -> Node nil nil:[]
x:[] -> Node x nil:[]
x:y:zs -> Node x y :zs
go 1 run 0 0 = run 0 1
go _ _ _ 0 = error "this should not happen!"
go x run v 7 = (Leaf $ chr (v * 2 + x)): run 0 0
go x run v k = run (v * 2 + x) (k + 1)
then:
\> head $ build [0,0,0,1,1,1,0, ...] -- the list of 01s as in the question
Node (Node (Node (Leaf 'k') (Leaf 't'))
(Node (Node (Leaf 'q') (Leaf 'g')) (Leaf 'r')))
(Node (Leaf 'w') (Leaf 'a'))

Splitting a BinTree with tail recursion in Haskell

So this week we learned about union types, tail recursion and binary trees in Haskell. We defined our tree data type like so:
data BinTree a = Empty
| Node (BinTree a) a (BinTree a)
deriving (Eq, Show)
leaf :: a -> BinTree a
leaf x = Node Empty x Empty
Now we were asked to write a function to find the most left node, return it, cut it out and also return the remaining tree without the node we just cut.
We did something like this, which worked quite well:
splitleftmost :: BinTree a -> Maybe (a, BinTree a)
splitleftmost Empty = Nothing
splitleftmost (Node l a r) = case splitleftmost l of
Nothing -> Just (a, r)
Just (a',l') -> Just (a', Node l' a r)
Now I need to make this function tail recursive. I think I understood what tail recursion is about, but found it hard to apply it to this problem. I was told to write a function which calls the main function with the fitting arguments, but was still not able to solve this.
Since nodes do not have a parent link, one approach would be to maintain root-to-leaf path within a list. At the end the modified tree can be constructed using a left fold:
slm :: BinTree a -> Maybe (a, BinTree a)
slm = run []
where
run _ Empty = Nothing
run t (Node Empty x r) = Just (x, foldl go r t)
where go l (Node _ x r) = Node l x r
run t n#(Node l _ _) = run (n:t) l
As others have hinted, there is no reason, in Haskell, to make this function tail-recursive. In fact, a tail-recursive solution will almost certainly be slower than the one you have devised! The main potential inefficiencies in the code you've provided involve allocation of pair and Just constructors. I believe GHC (with optimization enabled) will be able to figure out how to avoid these. My guess is that its ultimate code will probably look something like this:
splitleftmost :: BinTree a -> Maybe (a, BinTree a)
splitleftmost Empty = Nothing
splitleftmost (Node l a r) =
case slm l a r of
(# hd, tl #) -> Just (hd, tl)
slm :: BinTree a -> a -> BinTree a
-> (# a, BinTree a #)
slm Empty a r = (# a, r #)
slm (Node ll la lr) a r =
case slm ll la lr of
(# hd, tl' #) -> (# hd, Node tl' a r #)
Those funny-looking (# ..., ... #) things are unboxed pairs, which are handled pretty much like multiple return values. In particular, no actual tuple constructor is allocated until the end. By recognizing that every invocation of splitleftmost with a non-empty tree will produce a Just result, we (and thus almost certainly GHC) can separate the empty case from the rest to avoid allocating intermediate Just constructors. So this final code only allocates stack frames to handle the recursive results. Since some representation of such a stack is inherently necessary to solve this problem, using GHC's built-in one seems pretty likely to give the best results.
Here, not to spoil anything, are some "tail recursive" definitions of functions for summing along the left and right branches, at least as I understand "tail recursion":
sumLeftBranch tree = loop 0 tree where
loop n Empty = n
loop n (Node l a r) = loop (n+a) l
sumRightBranch tree = loop 0 tree where
loop n Empty = n
loop n (Node l a r) = loop (n+a) r
You can see that all the recursive uses of loop will have the same answer as the first call loop 0 tree - the arguments just keep getting put into better and better shape, til they are in the ideal shape, loop n Empty, which is n, the desired sum.
If this is the kind of thing that is wanted, the setup for splitleftmost would be
splitLeftMost tree = loop Nothing tree
where
loop m Empty = m
loop Nothing (Node l a r) = loop ? ?
loop (Just (a',r')) (Node l a r) = loop ? ?
Here, the first use of loop is in the form of loop Nothing tree, but that's the same as loop result Empty - when we come to it, namely result. It took me a couple of tries to get the missing arguments to loop ? ? right, but, as usual, they were obvious once I got them.

Working with Trees in Haskell

I have this data definition for a tree:
data Tree = Leaf Int | Node Tree Int Tree
and I have to make a function, nSatisfy, to check how many items of the tree check some predicate.
Here's what I've done:
nSatisfy :: (Int->Bool) -> Tree -> Int
nSatisfy _ Leaf = 0
nSatisfy y (Node left x right)
|y x = 1 + nSatisfy y (Node left x right)
| otherwise = nSatisfy y (Node left x right)
Is this the right way to solve this problem?
In your nSatisfy function, you should add the number of nodes satisfying the condition in both subtrees with two recursive calls. The last two lines should be like this:
|x y=1+(nSatisfy y left)+(nSatisfy y right)
|otherwise=(nSatisfy y left)+(nSatisfy y right)
This way, it will call itself again on the same node but only on the subtrees.
Also, if a leaf contains an integer, as is implied in the data declaration, you should make it evaluate the condition for a leaf and return 1 if it is true, instead of always returning 0.
In addition to the main answer, I'd like to offer a slightly different way how to generalize your problem and solving it using existing libraries.
The operation you're seeking is common to many data structures - to go through all elements and perform some operation on them. Haskell defines Foldable type-class, which can be implemented by structures like yours.
First let's import some modules we'll need:
import Data.Foldable
import Data.Monoid
In order to use Foldable, we need to generalize the structure a bit, in particular parametrize its content:
data Tree a = Leaf a | Node (Tree a) a (Tree a)
In many cases this is a good idea as it separates the structure from its content and allows it to be easily reused.
Now let's define its Foldable instance. For tree-like structures it's easier to define it using foldMap, which maps each element into a monoid and then combines all values:
instance Foldable Tree where
foldMap f (Leaf x) = f x
foldMap f (Node lt x rt) = foldMap f lt <> f x <> foldMap f rt
This immediately gives us the whole library of functions in the Data.Foldable module, such as searching for an element, different kinds of folds, etc. While a function counting the number of values satisfying some predicate isn't defined there, we can easily define it for any Foldable. The idea is that we'll use the Sum:
nSatisfy :: (Foldable f) => (a -> Bool) -> f a -> Int
nSatisfy p = getSum . foldMap (\x -> Sum $ if p x then 1 else 0)
The idea behind this function is simple: Map each value to 1 if it satisfies the predicate, otherwise to 0. And then folding with the Sum monoid just adds all values up.

Resources