How do you implement "show" tail-recursively? - haskell

Haskell's show is usually implemented recursively as:
data Tree = Node Tree Tree | Leaf
show' (Node left right) = "(Node " ++ show' left ++ " " ++ show' right ++ ")"
show' Leaf = "Leaf"
main = putStrLn $ show' (Node (Node Leaf Leaf) Leaf)
How can you implement show using tail recursion?

Using CPS, as mentioned by M. Shaw.
show' :: Tree -> String
show' = \tree -> go tree id
  where
    go (Node left right) c =
      go left (\l -> go right (\r -> c ("(Node " ++ l ++ " " ++ r ++ ")")))
    go Leaf c = c "Leaf"
It's important to keep in mind that Haskell's laziness obviates the need for tail recursion in many cases. In this case, the tail recursive version has to traverse the entire tree before any part of the output can be returned. Try showing an infinite tree with each version. When returning a lazy structure such as String in a lazy language, it is better to be corecursive (which your original implementation is) than tail recursive.
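A minimal sketch of the infinite-tree experiment, assuming the Tree type from the question. The names showRec, infinite and prefix are hypothetical, introduced only for this illustration. The directly recursive version can hand back a prefix of the output; the CPS version would never return on the same input, so it is not called here.

```haskell
data Tree = Node Tree Tree | Leaf

-- The original, directly recursive (corecursive) definition.
showRec :: Tree -> String
showRec (Node left right) = "(Node " ++ showRec left ++ " " ++ showRec right ++ ")"
showRec Leaf = "Leaf"

-- A tree whose left spine never ends (hypothetical example value).
infinite :: Tree
infinite = Node infinite Leaf

-- Laziness lets us take a finite prefix of the infinite output.
prefix :: String
prefix = take 18 (showRec infinite)
```

In GHCi, prefix evaluates to "(Node (Node (Node ".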
Most of the time of this function will be eaten up by left-nested (++) calls, each of which is O(n) in the length of its left argument. The solution is to use a difference list, which is a form of continuation-passing style itself.
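One way the difference-list idea might look, sketched with the Prelude's ShowS type (String -> String), which is a difference list of characters. The helper name showsTree is an assumption made for this example; it is not from the original answer.

```haskell
data Tree = Node Tree Tree | Leaf

-- Build the output as a ShowS so that each concatenation is a function
-- composition rather than a left-nested (++).
showsTree :: Tree -> ShowS
showsTree Leaf = showString "Leaf"
showsTree (Node left right) =
  showString "(Node "
    . showsTree left
    . showChar ' '
    . showsTree right
    . showChar ')'

-- Apply the difference list to the empty string to get the final String.
show' :: Tree -> String
show' tree = showsTree tree ""
```

This version is still not tail recursive, but it stays lazy and avoids re-copying the left operand at each level.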
See also Continuation-Based Program Transformation Strategies, which talks about how to arrive at efficient algorithms by converting to CPS, then transforming the structure of the continuations into something more concrete to arrive at a tail-recursive solution.

You can make this tail-recursive via continuation-passing style. Have a look at the examples on the wiki page.

Related

Tree traversal inorder tail recursion

Did I implement inorder level-order tree traversal using tail-recursion correctly?
inorder :: Tree a -> [a] -> [a]
inorder (Leaf n) temp = n:temp
inorder (Node (n, left, right)) temp = inorder left (n:inorder right temp)
Tree is declared as
data Tree a = Leaf a | Node (a, Tree a, Tree a) deriving Show
and returns
[2,1,3] on call inorder three [] where three = Node (1, Leaf 2, Leaf 3)
This technically isn't tail recursive because you have a recursive call inorder right temp in a nontail position. One way to fix this would be with continuations. You write a function which takes an accumulator like before, but rather than the accumulator being just a list it's actually a function representing the work left to do in the computation. This means that instead of making a non-tail call and just returning, we can always tail call because the context we need is saved to the continuation.
inorder :: Tree a -> [a]
inorder = go id
  where
    go :: ([a] -> r) -> Tree a -> r
    go k (Leaf n) = k [n]
    go k (Node (n, l, r)) = go (\ls -> go (\rs -> k (ls ++ n : rs)) r) l
Here every call is a tail call as required, but it's quite inefficient because it requires a ++ operation at every level, pushing us into quadratic costs. A more efficient algorithm would avoid building up an explicit list and instead build up a difference list, delaying the construction of the concrete structure:
type Diff a = [a] -> [a] -- A difference list is just a function
nil :: Diff a
nil xs = xs
cons :: a -> Diff a -> Diff a
cons a d = (:) a . d
append :: Diff a -> Diff a -> Diff a
append xs ys = xs . ys
toList :: Diff a -> [a]
toList xs = xs []
Note that all of these operations are O(1) except for toList which is O(n) in the number of entries. The important point here is that diff lists are cheap and easy to append so we'll construct these in our algorithm and construct the concrete list at the very end
inorder :: Tree a -> [a]
inorder = go toList
  where
    go :: (Diff a -> r) -> Tree a -> r
    go k (Leaf n) = k (cons n nil)
    go k (Node (n, l, r)) =
      go (\ls -> go (\rs -> k (ls `append` cons n rs)) r) l
And now, through gratuitous application of functions, we've gotten a completely unidiomatic Haskell program. You see, in Haskell we don't really care about tail calls because we generally want to handle infinite structures correctly, and that's not really possible if we demand everything be tail recursive. In fact, I would say that while not tail recursive, the code you originally had is the most idiomatic; that's even how it's implemented in Data.Set! It has the property that we can lazily consume the result of that toList and it will work with us and lazily process the tree. So in your implementation, something like
min :: Tree a -> Maybe a
min = listToMaybe . toList
is going to be pretty darn close to how you would implement it by hand, efficiency-wise! It will not traverse the whole tree first, as my version has to. These sorts of compositional effects of laziness pay more dividends in real Haskell code than syntactically making our code use only tail calls (which does nothing to actually guarantee space usage anyway).
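To make the laziness point concrete, here is a small sketch that assumes the asker's Tree type and original inorder. The names firstElem and partial are hypothetical. The right subtree is deliberately left undefined to show that asking for the first element never touches it.

```haskell
import Data.Maybe (listToMaybe)

data Tree a = Leaf a | Node (a, Tree a, Tree a)

-- The asker's original, non-tail-recursive traversal.
inorder :: Tree a -> [a] -> [a]
inorder (Leaf n) temp = n : temp
inorder (Node (n, left, right)) temp = inorder left (n : inorder right temp)

-- First element of the in-order traversal, if any.
firstElem :: Tree a -> Maybe a
firstElem t = listToMaybe (inorder t [])

-- The right subtree is never forced when only the first element is needed.
partial :: Tree Int
partial = Node (1, Leaf 2, undefined)
```

Here firstElem partial evaluates to Just 2. A continuation-based traversal would have to walk into the undefined subtree before producing anything and would fail.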

TreeToList in Haskell function

I have this function that I wrote to convert a tree to a list in Haskell:
treeToList :: Tree -> [Int]
treeToList (Leaf x) = [x]
treeToList (Node left x right) = treeToList left ++ [x] ++ treeToList right
This works just fine, however I have a doubt:
With the input:
treeToList (Node (Leaf 1) 2 (Node (Leaf 3) 4 (Leaf 5)))
the function produces this list:
[1,2,3,4,5]
which I think is wrong, because if I want to go the other way around and rebuild the tree from the list, I'm going to build it wrong.
How can I fix this?
This is a correct way to transform a binary tree of Int into a list by "in-order" traversal. This is the most common way to turn a search tree into a list. There are, however, other valid ways to traverse trees which will produce lists in different orders for various purposes, as you can see on Wikipedia. As András Kovács and chi have indicated, there is generally no way to go backwards, unless you have specific information about the shape of the tree you wish to construct.
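A short sketch of the three usual traversal orders, and of why the list alone cannot be turned back into a unique tree. The question does not show its Tree declaration, so the one below is an assumption inferred from treeToList; the function and value names are likewise made up for illustration.

```haskell
-- Assumed declaration, inferred from the patterns used in treeToList.
data Tree = Leaf Int | Node Tree Int Tree

inOrder, preOrder, postOrder :: Tree -> [Int]
inOrder (Leaf x) = [x]
inOrder (Node l x r) = inOrder l ++ [x] ++ inOrder r

preOrder (Leaf x) = [x]
preOrder (Node l x r) = [x] ++ preOrder l ++ preOrder r

postOrder (Leaf x) = [x]
postOrder (Node l x r) = postOrder l ++ postOrder r ++ [x]

-- The tree from the question.
t1 :: Tree
t1 = Node (Leaf 1) 2 (Node (Leaf 3) 4 (Leaf 5))

-- A differently shaped tree with the same in-order list.
t2 :: Tree
t2 = Node (Node (Leaf 1) 2 (Leaf 3)) 4 (Leaf 5)
```

Both inOrder t1 and inOrder t2 give [1,2,3,4,5], while preOrder t1 is [2,1,4,3,5] and postOrder t1 is [1,3,5,4,2].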

Haskell convert "case of" statement to tail recursive code

I need help.
I have a function of that form
myFunction = case myFunction of
(Nothing) -> (Just)
(Just) -> (Just)
I want to make it tail recursive.
How would one do it ?
I understand that the fact we have a different statement according to the return of the recursive call makes it difficult (reason I need help ^^).
I can give the original function, but I'm rather looking for a more general solution.
Thanks in advance
Edit: Actual code:
myFunction :: MyTree x -> (x, Maybe(MyTree x))
myFunction = (x, Nothing)
myFunction (MyNode left right) = case myFunction left of
(x, Nothing) -> (x, Just right)
(x, Just left2) -> (x, Just (Node left2 right))
I'll assume you defined
data MyTree x = MyLeaf x | MyNode (MyTree x) (MyTree x)
and meant
myFunction :: MyTree x -> (x, Maybe(MyTree x))
myFunction (MyLeaf x) = (x, Nothing)
myFunction (MyNode left right) = case myFunction left of
(x, Nothing) -> (x, Just right)
(x, Just left2) -> (x, Just (MyNode left2 right))
Which is a function that pulls out the leftmost leaf and sews the corresponding right branch in where it was.
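As a usage sketch, applying myFunction repeatedly peels off the leaves from left to right. The helper leaves and the value example are hypothetical names added for this illustration; unfoldr comes from Data.List.

```haskell
import Data.List (unfoldr)

data MyTree x = MyLeaf x | MyNode (MyTree x) (MyTree x)

-- Pull out the leftmost leaf and return what remains of the tree, if anything.
myFunction :: MyTree x -> (x, Maybe (MyTree x))
myFunction (MyLeaf x) = (x, Nothing)
myFunction (MyNode left right) = case myFunction left of
  (x, Nothing)    -> (x, Just right)
  (x, Just left2) -> (x, Just (MyNode left2 right))

-- Keep removing the leftmost leaf until no tree is left.
leaves :: MyTree x -> [x]
leaves t = unfoldr (fmap myFunction) (Just t)

example :: MyTree Int
example = MyNode (MyNode (MyLeaf 1) (MyLeaf 2)) (MyLeaf 3)
```

With these definitions, leaves example evaluates to [1,2,3].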
You ask how to make this tail recursive. Why is that? In some (strict) languages, tail recursive code is more efficient, but Haskell uses lazy evaluation, which means that what matters is not how late the recursive calls happen but how early they produce output. In this case, the head-recursive case myFunction left of zooms right down the tree until it finds that leftmost leaf; you can't get to it any quicker. On the way back up it does pass the x around a bit rather than returning immediately, but it also sews all the right branches back on at the appropriate place without any bookkeeping, which is the joy of using recursion on a recursive data structure.
See this question about why tail recursion isn't the most important thing for efficiency in Haskell.
Three classic things to do to a binary tree with data at the nodes are:
1. pre-order traversal (visit the current node first then the left subtree then right) - doubly tail recursive
2. in-order traversal (visit left subtree, then current node, then right) - head and tail recursive
3. post-order traversal (visit left and right subtrees before the current node) - doubly head recursive.
Post order sounds worryingly head recursive to someone not used to lazy evaluation, but it's an efficient way to sum the values in your tree, for example, particularly if you make sure the compiler knows it's strict.
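A sketch of a post-order sum with explicit strictness. The BinTree type and the name sumPost are assumptions made for this example (the answer only speaks of "a binary tree with data at the nodes"), and bang patterns are just one way to tell the compiler that the partial sums should be evaluated eagerly.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical tree with data at the internal nodes.
data BinTree a = Tip | Bin (BinTree a) a (BinTree a)

-- Post-order: both subtrees are summed before the node's own value is added.
sumPost :: BinTree Int -> Int
sumPost Tip = 0
sumPost (Bin l x r) =
  let !sl = sumPost l  -- forced before the result is built
      !sr = sumPost r
  in sl + sr + x
```

For example, sumPost (Bin (Bin Tip 1 Tip) 2 (Bin Tip 3 Tip)) evaluates to 6.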
As always, the best algorithms give the fastest results, and you should compile with -O2 if you want optimisations turned on.
These have to match.
myFunction = ...
myFunction (MyNode left right) = ...
They don't match. You can't use them together. Why? One of them takes zero arguments, the other one takes one argument. They must take the same number of arguments. If you need to ignore an argument, use _. Note that the version that uses _ has to be after the version that doesn't use _.
myFunction :: MyTree x -> Maybe (MyTree x)
myFunction (MyNode left right) =
  case myFunction left of
    Nothing -> Just right
    Just left2 -> Just (MyNode left2 right)
myFunction _ = Nothing
I don't know what x is supposed to be in the body of your function. It's not bound to anything.
This isn't tail recursive. Not all functions can be made into tail recursive functions.
Hint
Maybe if you described what the function was supposed to do, we could help you do that.

Non-exhaustive patterns in function internals

I am currently working on Problem 62
I have tried the following code to solve it:
data Tree a = Empty | Branch a (Tree a) (Tree a)
  deriving (Show, Eq)
internals :: Tree a -> [a]
internals (Branch a Empty Empty) = []
internals (Branch a b c) = [a]++(internals b)++(internals c)
internals (Branch a b Empty) = [a]++(internals b)
internals (Branch a Empty c) = [a]++(internals c)
Which basically says:
If both the children are empty don't include that list element in the list of internals.
If both children are non-empty, that node (a) is internal, so include it and keep checking whether any of a's children are also internal.
If one of the children is non-empty, that node is internal, and recursively keep checking if the child is also an internal node.
In GHCi I ran the following:
> let tree4 = Branch 1 (Branch 2 Empty (Branch 4 Empty Empty)) (Branch 2 Empty Empty)
> internals tree4
and get the following runtime error:
[1,2*** Exception: Untitled.hs:(6,1)-(12,49): Non-exhaustive patterns in function internals
I don't understand why this is non-exhaustive. I thought it would go to branch 1, notice its children are non-empty, then go down both branch 2s and find that one branch is empty and one is not, stop at the one that is, and keep going down the one that isn't until branch "4", and end there. It sort of does, since I do get 1, 2 in the list, but why is it not exhaustive?
Thanks in advance.
Thank you for the help, Tikhon. I changed my function to this:
data Tree a = Empty | Branch a (Tree a) (Tree a)
  deriving (Show, Eq)
internals :: Tree a -> [a]
internals (Branch a Empty Empty) = []
internals (Branch a b Empty) = [a]++(internals b)
internals (Branch a Empty c) = [a]++(internals c)
internals (Branch a b c) = [a]++(internals b)++(internals c)
The other answer doesn't actually solve the reason for the error message, it does resolve one problem though (the fact that the order of patterns is significant).
The error message is Non-exhaustive patterns, which means that internals is being called with a value that doesn't match any of the patterns (this value is Empty). As Tikhon said, it is because Branch a b c matches all Branches, so the later patterns are never used and an Empty can slip through. We can see what happens if we trace the execution of internals (Branch 1 (Branch 2 Empty Empty) Empty) (assume strict-ish evaluation, it makes the exposition simpler):
internals (Branch 1 (Branch 2 Empty Empty) Empty) =>
[1] ++ internals (Branch 2 Empty Empty) ++ internals Empty =>
[1] ++ [] ++ internals Empty =>
[1] ++ internals Empty =>
[1] ++ ???
The proper fix makes sure that can't happen, i.e. it converts internals from a partial function (undefined for some input values) into a total function (defined for all inputs). Total functions are much, much nicer than partial ones, especially in Haskell, where the type system gives the programmer the ability to mark "partial" functions as such at compile time (e.g. via Maybe or Either).
We can think about the recursion from the bottom-up, i.e. work out the base cases:
the empty tree has no internal nodes
a tree that is a single node has no internal nodes
We recur on any tree that doesn't satisfy either of these; in which case, the current node is an internal node (so add that to the list), and there might be internal nodes in the children, so check them too.
We can express this in Haskell:
internals :: Tree a -> [a]
internals Empty = []
internals (Branch a Empty Empty) = []
internals (Branch a b c) = [a] ++ internals b ++ internals c
This has the added bonus of making the code neater and shorter: we don't have to worry about the details of the children in the recursion, there is a base case that handles any Emptys.
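For reference, a self-contained version that can be loaded into GHCi and checked against the tree from the question. The unused label in the second clause is replaced by _ here, which is only a stylistic choice. Compiling the original four-clause definition with GHC's -Wall (which enables -Wincomplete-patterns) should also have reported the missing Empty case at compile time.

```haskell
data Tree a = Empty | Branch a (Tree a) (Tree a)
  deriving (Show, Eq)

-- Total: every constructor of Tree is covered.
internals :: Tree a -> [a]
internals Empty = []
internals (Branch _ Empty Empty) = []  -- a single node is a leaf, not internal
internals (Branch a b c) = [a] ++ internals b ++ internals c

-- The tree from the question.
tree4 :: Tree Int
tree4 = Branch 1 (Branch 2 Empty (Branch 4 Empty Empty)) (Branch 2 Empty Empty)
```

Evaluating internals tree4 now gives [1,2] without an exception.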
The order of patterns matters. Since Branch a b c matches everything that isn't just Empty, including something like Branch a b Empty, your third and fourth cases never get hit.
This should fix it:
internals :: Tree a -> [a]
internals (Branch a Empty Empty) = []
internals (Branch a b Empty) = [a] ++ internals b
internals (Branch a Empty c) = [a] ++ internals c
internals (Branch a b c) = [a] ++ internals b ++ internals c

Lazy tree with a space leak

I'm writing a program trying to implement a toy XML processor. Right now the program is supposed to read a stream of events (think SAX) describing the structure of a document and to build lazily the corresponding tree.
The events are defined by the following datatype:
data Event = Open String
           | Close
A possible input would then be:
[Open "a", Open "b", Close, Open "c", Close, Close]
that would correspond to the tree:
a
/ \
b c
I would like to generate the tree in a lazy way, so that it does not need to be present in memory in full form at any time. My current implementation, however, seems to have a space leak causing all the nodes to be retained even when they are no longer needed. Here is the code:
data Event = Open String
           | Close
data Tree a = Tree a (Trees a)
type Trees a = [Tree a]
data Node = Node String
trees [] = []
trees (Open x : es) =
  let (children, rest) = splitStream es
  in (Tree (Node x) (trees children)) : (trees rest)
splitStream es = scan 1 es
scan depth (s@(Open {}) : ss) =
  let (b, a) = scan (depth+1) ss
  in (s:b, a)
scan depth (s@Close : ss) =
  case depth of
    1 -> ([], ss)
    x -> let (b, a) = scan (depth-1) ss
         in (s:b, a)
getChildren = concatMap loop
  where
    loop (Tree _ cs) = cs
main = print .
  length .
  getChildren .
  trees $
  [Open "a"] ++ (concat . replicate 1000000 $ [Open "b", Close]) ++ [Close]
The function trees converts the list of events into a list of Tree Node. getChildren collects all the children nodes (labeled "b") of the root ("a"). These are then counted and the resulting number is printed.
The compiled program, built with GHC 7.0.4 (-O2), keeps increasing its memory usage up to the point when it prints the node count. I was expecting, on the other hand, an almost constant memory usage.
Looking at the "-hd" heap profile, it is clear that most of the memory is taken by the list constructor (:). It seems like one of the lists produced by scan or by trees is retained in full. I don't understand why, however, as length . getChildren should get rid of child nodes as soon as they are traversed.
Is there a way to fix such space leak?
I suspect that trees is the evil guy. As John L said, this is probably an instance of the Wadler space leak in which the compiler is unable to apply the optimization that prevents the leak. The problem is that you use a lazy pattern match (the let expression) to deconstruct the pair and then perform pattern matching, via the application of trees, on one of the components of the tuple. I once had a quite similar problem; see http://comments.gmane.org/gmane.comp.lang.haskell.glasgow.user/19129. That thread also provides a more detailed explanation. To prevent the space leak you can simply use a case expression to deconstruct the tuple as follows.
trees [] = []
trees (Open x : es) =
  case splitStream es of
    (children, rest) -> Tree (Node x) (trees children) : trees rest
With this implementation the maximum residency drops from 38MB to 28KB.
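Putting the pieces together, here is a self-contained sketch of the case-based variant with type signatures added. The scan clause for an empty stream and the catch-all clause of trees are assumptions added so that the functions are total; countChildren is a hypothetical helper that combines length and getChildren. No memory figures are claimed for this sketch.

```haskell
data Event = Open String | Close

data Tree a = Tree a [Tree a]

newtype Node = Node String

-- Deconstruct the pair with a case expression instead of a lazy let.
trees :: [Event] -> [Tree Node]
trees (Open x : es) =
  case splitStream es of
    (children, rest) -> Tree (Node x) (trees children) : trees rest
trees _ = []  -- assumption: stop on an empty or malformed stream

-- Split the events of the current element from the events that follow it.
splitStream :: [Event] -> ([Event], [Event])
splitStream = scan (1 :: Int)
  where
    scan _ [] = ([], [])  -- assumption: tolerate truncated input
    scan depth (s@(Open _) : ss) =
      let (b, a) = scan (depth + 1) ss in (s : b, a)
    scan depth (s@Close : ss)
      | depth == 1 = ([], ss)
      | otherwise = let (b, a) = scan (depth - 1) ss in (s : b, a)

-- Count the children of all top-level elements.
countChildren :: [Event] -> Int
countChildren = length . concatMap (\(Tree _ cs) -> cs) . trees
```

For instance, countChildren ([Open "a"] ++ concat (replicate 1000 [Open "b", Close]) ++ [Close]) evaluates to 1000.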
But note that this new implementation of trees is more strict than the original one as it demands the application of splitStream. Therefore, in some cases this transformation might even cause a space leak. To regain a less strict implementation you might use a similar trick as the lines function in Data.List which causes a similar problem http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/Data-List.html#lines. In this case trees would look as follows.
trees [] = []
trees (Open x : es) =
  context (case splitStream es of
             (children, rest) -> (trees children, trees rest))
  where
    context ~(children', rest') = Tree (Node x) children' : rest'
If we desugar the lazy pattern matching we get the following implementation. Here the compiler is able to detect the selector to the tuple component as we do not perform pattern matching on one of the components.
trees [] = []
trees (Open x : es) = Tree (Node x) children' : rest'
  where
    (children', rest') =
      case splitStream es of
        (children, rest) -> (trees children, trees rest)
Does anybody know whether this transformation always does the trick?
I strongly suspect this is an example of the "Wadler space leak" bug. Unfortunately I don't know how to solve it, but I did find a few things that mitigate the effects somewhat:
1) Change getChildren to
getChildren' = ($ []) . foldl (\ xsf (Tree _ cs) -> xsf . (cs ++)) id
This is a small, but noticeable, improvement.
2) In this example trees always outputs a single-element list. If this is always true for your data, explicitly dropping the rest of the list fixes the space leak:
main = print .
  length .
  getChildren .
  (:[]) .
  head .
  trees
