First of all, I have two different implementations that I believe are correct, and I have profiled them; they seem to be about the same in performance:
depth :: Tree a -> Int
depth Empty = 0
depth (Branch b l r) = 1 + max (depth l) (depth r)
depthTailRec :: Tree a -> Int
depthTailRec = depthTR 0 where
depthTR d Empty = d
depthTR d (Branch b l r) = let dl = depthTR (d+1) l; dr = depthTR (d+1) r in max dl dr
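Both versions assume a tree type along these lines (the definition isn't shown above, but the pattern matches pin it down):
-- Reconstructed from the pattern matches: a Branch carries a label
-- and two subtrees; Empty marks a leaf. Show derived for ghci demos.
data Tree a = Empty | Branch a (Tree a) (Tree a)
  deriving Show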
I was just wondering: aren't people always talking about how tail recursion can be beneficial for performance? A lot of questions jump into my head:
How can you make the depth function faster?
I read something about how Haskell's laziness can reduce the need for tail recursion; is that true?
Is it true that every recursive function can be converted into a tail-recursive one?
Finally, tail recursion can be faster and more space-efficient because it can be turned into a loop, reducing the need to push and pop the stack; is my understanding right?
1. Why isn't your function tail recursive?
For a recursive function to be tail recursive, all the recursive calls must be in tail position. A call is in tail position if it is the last thing the function does before returning. In your first example you have
depth (Branch _ l r) = 1 + max (depth l) (depth r)
which is equivalent to
depth (Branch _ l r) = (+) 1 (max (depth l) (depth r))
The last function called before the function returns is (+), so this is not tail recursive. In your second example you have
depthTR d (Branch _ l r) = let dl = depthTR (d+1) l
dr = depthTR (d+1) r
in max dl dr
which, once you've inlined the let bindings and re-arranged a bit, is equivalent to
depthTR d (Branch _ l r) = max (depthTR (d+1) r) (depthTR (d+1) l)
Now the last function called before returning is max, which means that this is not tail recursive either.
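For contrast, here is a minimal sketch (not part of the question) of a function whose recursive call is in tail position; nothing is waiting for the result of go, so the call can be compiled to a jump:
{-# LANGUAGE BangPatterns #-}
-- The call to go is the last thing evaluated in each branch. The bang
-- keeps the accumulator evaluated so it doesn't pile up thunks instead.
sumTR :: [Int] -> Int
sumTR = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs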
2. How could you make it tail recursive?
You can make a tail recursive function using continuation-passing style. Instead of re-writing your function to take a state or an accumulator, you pass in a function (called the continuation) that is an instruction for what to do with the value computed -- i.e. instead of immediately returning to the caller, you pass whatever value you have computed to the continuation. It's an easy trick for turning any function into a tail-recursive function -- even functions that need to call themselves multiple times, as depth does. It looks something like this
depth t = go t id
where
go Empty k = k 0
go (Branch _ l r) k = go l $ \dl ->
go r $ \dr ->
k (1 + max dl dr)
Now you see that the last function called in go before it returns is itself go, so this function is tail recursive.
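A quick sanity check in ghci, using the Tree definition assumed at the top of the question:
-- >>> depth (Branch 'a' (Branch 'b' Empty Empty) Empty)
-- 2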
3. Is that it, then?
(NB this section draws from the answers to this previous question.)
No! This "trick" only pushes the problem back somewhere else. Instead of a non-tail recursive function that uses lots of stack space, we now have a tail-recursive function that eats thunks (unapplied functions) which could potentially be taking up a lot of space themselves. Fortunately, we don't need to work with arbitrary functions - in fact, there are only three kinds
\dl -> go r (\dr -> k (1 + max dl dr)) (which uses the free variables r and k)
\dr -> k (1 + max dl dr) (with free variables k and dl)
id (with no free variables)
Since only these finitely many shapes of function occur, we can represent them as data
data Fun a = FunL (Tree a) (Fun a) -- the fields are 'r' and 'k'
| FunR Int (Fun a) -- the fields are 'dl' and 'k'
| FunId
We'll have to write a function eval as well, which tells us how to evaluate these "functions" at particular arguments. Now you can re-write the function as
depth t = go t FunId
where
go Empty k = eval k 0
go (Branch _ l r) k = go l (FunL r k)
eval (FunL r k) d = go r (FunR d k)
eval (FunR dl k) d = eval k (1 + max dl d)
eval (FunId) d = d
Note that both go and eval have calls to either go or eval in tail position -- therefore they are a pair of mutually tail-recursive functions. So we've transformed the version of the function that used continuation-passing style into a function that uses data to represent continuations, together with a pair of mutually recursive functions to interpret that data -- a technique known as defunctionalization.
4. That sounds really complicated
Well, I guess it is. But wait! We can simplify it! If you look at the Fun a data type, you'll see that it's actually just a list, where each element is either a Tree a that we're going to compute the depth of, or it's an Int representing a depth that we've computed so far.
What's the benefit of noticing this? Well, this list actually represents the call stack of the chain of continuations from the previous section. Pushing a new item onto the list is pushing a new argument onto the call stack! So you could write
depth t = go t []
where
go Empty k = eval k 0
go (Branch _ l r) k = go l (Left r : k)
eval (Left r : k) d = go r (Right d : k)
eval (Right dl : k) d = eval k (1 + max dl d)
eval [] d = d
Each new argument you push onto the call stack is of type Either (Tree a) Int, and as the functions recurse, they keep pushing new arguments onto the stack, which are either new trees to be explored (whenever go is called) or the maximum depth found so far (whenever eval is called).
This call strategy represents a depth-first traversal of the tree, as you can see by the fact that the left tree is always explored first by go, while the right tree is always pushed onto the call stack to be explored later. Arguments are only ever popped off the call stack (in eval) when an Empty branch has been reached and can be discarded.
5. Alright... anything else?
Well, once you've noticed that you can turn the continuation-passing algorithm into a version that mimics the call stack and traverses the tree depth first, you might start to wonder whether there's a simpler algorithm that traverses the tree depth first, keeping track of the maximum depth encountered so far.
And indeed, there is. The trick is to keep a list of branches that you haven't yet explored, together with their depths, and keep track of the maximum depth you've seen so far. It looks like this
depth t = go 0 [(0,t)]
where
go depth [] = depth
go depth (t:ts) = case t of
(d, Empty) -> go (max depth d) ts
(d, Branch _ l r) -> go (max depth d) ((d+1,l):(d+1,r):ts)
I think that's about as simple as I can make this function within the constraints of ensuring that it's tail-recursive.
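All of these variants should agree with the original depth. A quick way to gain confidence is a QuickCheck property; this is only a sketch, in which the Arbitrary instance is a hypothetical size-bounded generator and depthNaive/depthStack are assumed renamings of the first and last versions above:
import Test.QuickCheck
-- Generate smallish random trees (size-bounded so they stay finite).
instance Arbitrary a => Arbitrary (Tree a) where
    arbitrary = sized gen
      where
        gen 0 = pure Empty
        gen n = oneof [ pure Empty
                      , Branch <$> arbitrary <*> gen (n `div` 2)
                                             <*> gen (n `div` 2) ]
prop_sameDepth :: Tree Int -> Bool
prop_sameDepth t = depthNaive t == depthStack t
-- >>> quickCheck prop_sameDepth
-- +++ OK, passed 100 tests.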
6. So that's what I should use?
To be honest, your original, non-tail-recursive version is probably fine. The new versions aren't any more space-efficient (they always have to store the list of trees that you're going to process next), but they do have the advantage of storing the trees to be processed on the heap rather than on the stack - and there's lots more space on the heap.
You might want to look at the partially tail-recursive function in Ingo's answer, which will help in the case when your trees are extremely unbalanced.
A partially tail recursive version would be this:
depth d Empty = d
depth d (Branch _ l Empty) = depth (d+1) l
depth d (Branch _ Empty r) = depth (d+1) r
depth d (Branch _ l r) = max (depth (d+1) l) (depth (d+1) r)
Note that tail recursion in this case (as opposed to the more complex full case in Chris's answer) is done only to skip the incomplete branches.
But this should be enough under the assumption that the depth of your trees is at most some double-digit number; in fact, if your trees are properly balanced, it will be fine. If, on the other hand, your trees tend to degenerate into lists, then this will already help to avoid stack overflow (a hypothesis I haven't proved, but it is certainly true for a totally degenerate tree in which no branch has two non-empty children).
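To see why, consider a fully degenerate tree; the spine helper below is a hypothetical test harness, not part of the answer:
-- A maximally degenerate tree: no node has two non-empty children.
spine :: Int -> Tree ()
spine 0 = Empty
spine n = Branch () Empty (spine (n - 1))
-- Every Branch matches one of the two single-child clauses, so
-- depth 0 (spine 1000000) is a chain of direct tail calls. (With
-- optimizations on, or a bang pattern on d so the accumulator is
-- forced as it goes, this runs in constant stack space, where the
-- plain non-tail-recursive depth would need a million stack frames.)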
Tail recursion is not a virtue in and of itself. It matters only when we would otherwise blow up the stack with what would be a simple loop in imperative programming languages.
To your 3.: yes, e.g. by use of the CPS technique (as shown in Chris's answer);
To your 4.: correct.
To your 2.: with lazy corecursive breadth-first tree traversal we naturally get a solution similar to Chris's last one (i.e. his #5, the depth-first traversal with an explicit stack), even without any calls to max:
treedepth :: Tree a -> Int
treedepth tree = fst $ last queue
where
queue = (0,tree) : gen 1 queue
gen 0 p = []
gen len ((d,Empty) : p) = gen (len-1) p
gen len ((d,Branch _ l r) : p) = (d+1,l) : (d+1,r) : gen (len+1) p
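For example, with the Tree definition assumed earlier:
-- >>> treedepth (Branch 1 (Branch 2 (Branch 3 Empty Empty) Empty) (Branch 4 Empty Empty))
-- 3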
Though both variants have space complexity of O(n) in the worst case, the worst cases themselves are different, and opposite to each other: the most degenerate trees are the worst case for depth-first traversal (DFT) and the best case (space-wise) for breadth-first (BFT); and similarly the most balanced trees are the best case for DFT and the worst for BFT.
Related
Here is some code deciding whether a list is a palindrome in n+1 comparisons, in "direct style"
pal_d1 :: Eq a => [a] -> Bool
pal_d1 l = let (r,_) = walk l l in r
where walk l [] = (True,l)
walk l (_:[]) = (True,tail l)
walk (x:l) (_:_:xs) = let (r, y:ys) = walk l xs
in (r && x == y, ys)
which can be tested on a few examples
-- >>> pal_d1 [1,2,1]
-- True
-- >>> pal_d1 [1,2,2,1]
-- True
-- >>> pal_d1 [1,2,3,4,2,1]
-- False
Danvy claims in "There and back again" (right before section 4.2) that there is no direct-style solution without a control operator, due to the non-linear use of the continuation in the CPS-style solution below:
import Debug.Trace (trace)
pal_cps1 :: Eq a => [a] -> Bool
pal_cps1 l = walk l l (\_ -> trace "called" True)
where
walk l [] k = k l
walk l (_:[]) k = k (tail l)
walk (x:xs) (_:_:ys) k = walk xs ys (\(r:rs) -> x == r && k rs)
How does the first piece of code not contradict this assertion?
(And how is the continuation not used linearly?)
He does not claim that there is no solution without a control operator.
The continuation is not used linearly and therefore mapping this program back to direct style requires a control operator.
The context of the paper is to study systematic transformations between direct style and CPS, and the claim of that paragraph is that going back from CPS is tricky if the continuation is used in fancy ways.
With some effort you can wrangle it back into a nice shape, but the question remains, how might a compiler do that automatically?
(And how is the continuation not used linearly?)
In the paper, the continuation is on the right of andalso (&&) so it's discarded if the left operand is False.
In operational semantics, you can view the continuation as an evaluation context, and in that view discarding the continuation corresponds to throwing an exception. One can certainly do it, but the point is that this requires extra machinery in the source language.
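As an illustration (a sketch of ours, not from the paper), the "extra machinery" can be mimicked in Haskell with Either: returning Left discards the rest of the computation, just as the CPS version discards its continuation on a mismatch:
-- Direct style with an exception-like early exit: Left () plays the
-- role of a thrown exception, short-circuiting the later comparisons.
pal_e :: Eq a => [a] -> Bool
pal_e l = either (const False) (const True) (walk l l)
  where
    walk l []  = Right l
    walk l [_] = Right (tail l)
    walk (x:xs) (_:_:ys) =
      case walk xs ys of
        Right (z:zs) | x == z -> Right zs
        Right _               -> Left ()
        Left e                -> Left e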
The CPS code (in the question's original version --- since edited by OP) seems faulty. Looks like it should be
walk (x:xs) (_:_:ys) k = walk xs ys (\(z:zs) -> x == z && k zs)
The non-CPS code starts the comparisons from the middle and does n `div` 2 comparisons for a list of length n. It continues testing even if a mismatch is discovered, so it is "linear".
The CPS code exits right away in such a case, because (False && undefined) == False holds; so it is "non-linear". The two are not equivalent, so the first says nothing about the second.
As the other answer says, not calling the continuation amounts to throwing an exception in code without continuations, which is what the paper's author apparently calls "the direct [i.e., non-CPS(?) --wn] style".
(I haven't read the paper).
It isn't difficult at all to code the early-exiting solution in the "direct" style, by the way. We would just use the same turtle-and-hare trick to discover the halves while also building the first half in reverse, and then call and $ zipWith (==) first_half_reversed second_half in Haskell, or its equivalent short-circuiting direct recursive variant, in a strict language like e.g. Scheme.
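A minimal sketch of that direct-style early-exiting version (helper names are ours):
-- Turtle-and-hare split: reverse the first half while finding the
-- middle, then compare the halves pairwise; `and` short-circuits on
-- the first mismatch.
pal_d2 :: Eq a => [a] -> Bool
pal_d2 l = and (zipWith (==) firstHalfRev secondHalf)
  where
    (firstHalfRev, secondHalf) = walk [] l l
    walk acc xs     []       = (acc, xs)  -- even length
    walk acc (_:xs) [_]      = (acc, xs)  -- odd length: drop the middle
    walk acc (x:xs) (_:_:ys) = walk (x : acc) xs ys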
The code below, copied from the answer to this question, quite nicely takes only O(n) space to do a depth-first traversal of a tree of depth n which contains O(2^n) nodes. This is very good; the garbage collector seems to be doing a good job of cleaning up the already-processed parts of the tree.
But my question is: how? Unlike a list, where we can completely forget the first element once we've processed it, we can't scrap the root node after processing the first leaf node; we have to wait until the left half of the tree is processed (because eventually we'll have to traverse down the right side from the root). Also, since the root node points to the nodes below it, and so on all the way down to the leaves, it would seem that we couldn't collect any of the first half of the tree until we start on the second half (as all those nodes would still be reachable from the still-live root node). Fortunately this is not the case, but could someone explain how?
import Data.List (foldl')
data Tree = Tree Int Tree Tree
tree n = Tree n (tree (2 * n)) (tree (2 * n + 1))
treeOne = tree 1
depthNTree n t = go n t [] where
go 0 (Tree x _ _) = (x:)
go n (Tree _ l r) = go (n - 1) l . go (n - 1) r
main = do
x <- getLine
print . foldl' (+) 0 . filter (\x -> x `rem` 5 == 0) $ depthNTree (read x) treeOne
Actually you don't hold on to the root while you descend the left subtree.
go n (Tree _ l r) = go (n - 1) l . go (n - 1) r
So the root is turned into two thunks, composed together. One holds a reference to the left subtree, the other holds a reference to the right subtree. The root node itself is now garbage.
The left and right subtrees themselves are just thunks, because the tree is produced lazily, so they aren't consuming much space yet.
We're only evaluating go n (Tree _ l r) because we're evaluating depthNTree n t, which is go n t []. So we're immediately forcing the two composed go calls we just turned the root into:
(go (n - 1) l . go (n - 1) r) []
= (go (n - 1) l) ((go (n - 1) r) [])
And because this is lazily evaluated, we do the outermost call first, leaving ((go (n - 1) r) []) as a thunk (and so not generating any more of r).
Recursing into go will force l, so we do generate more of that. But then we do the same thing again one level down: again that tree node becomes garbage immediately, we generate two thunks holding the left and right sub-subtrees, and then we force only the left one.
After n calls we'll be evaluating go 0 (Tree x _ _) = (x:). We've generated n pairs of thunks, and forced the n left ones, leaving the right ones in memory; because the right sub-trees are unevaluated thunks they're constant space each, and there are only n of them, so only O(n) space total. And all the tree nodes leading to this path are now unreferenced.
We actually have the outermost list constructor (and the first element of the list). Forcing more of the list will explore those right sub-tree thunks further down the composition chain being built up, but there will never be more than n of them.
Technically you have bound a reference to tree 1 in the globally scoped treeOne, so in principle you could retain a reference to every node you ever produce; you're relying on GHC noticing that treeOne is only used once and needn't be retained.
I wrote a little manual evaluation of a tree to depth 2. I hope it can illustrate why tree nodes can be garbage collected along the way.
Suppose we start with a tree like this:
tree =
Tree
(Tree _ -- l
(Tree a _ _) -- ll
(Tree b _ _)) -- lr
(Tree _ -- r
(Tree c _ _) -- rl
(Tree d _ _)) -- rr
Now call depthNTree 2 tree:
go 2 tree []
go 2 (Tree _ l r) []
go 1 l (go 1 r [])
go 1 (Tree _ ll lr) (go 1 r [])
go 0 ll (go 0 lr (go 1 r []))
go 0 (Tree a _ _) (go 0 lr (go 1 r []))
a : go 0 lr (go 1 r []) -- gc can collect ll
a : go 0 (Tree b _ _) (go 1 r [])
a : b : go 1 r [] -- gc can collect lr and thus l
a : b : go 1 (Tree _ rl rr) []
a : b : go 0 rl (go 0 rr [])
a : b : go 0 (Tree c _ _) (go 0 rr [])
a : b : c : go 0 rr [] -- gc can collect rl
a : b : c : go 0 (Tree d _ _) []
a : b : c : d : [] -- gc can collect rr and thus r and tree
Note that since treeOne is a static value, there has to be some extra machinery behind the scenes to allow garbage collection of it. Fortunately GHC supports GC of static values.
Let's rewrite the recursive case of go as
go n t = case t of
Tree _ l r -> go (n - 1) l . go (n - 1) r
In the right-hand side of the case alternative, the original tree t is no longer live. Only l and r are live. So, if we recurse into l first, say, there is nothing keeping the left-hand side of the tree live except l itself; r exactly keeps the right-hand side of the tree alive.
At any point in the recursion, the live nodes are exactly the roots of the not-yet-processed subtrees cut off by the path from the original root of the tree to the node currently being inspected. There are at most as many such subtrees as the length of that path, so the space usage is O(n).
The key is that the original tree t becomes dead before we recurse. If you write the (denotationally equivalent, but bad style for a number of reasons)
leftChild (Tree _ l r) = l
rightChild (Tree _ l r) = r
go n t = go (n - 1) (leftChild t) . go (n - 1) (rightChild t)
now when recursing into go (n - 1) (leftChild t), there is still a live reference to t in the unevaluated expression rightChild t. Hence the space usage is now exponential.
In Haskell, one can do filters, sums, etc on infinite lists in constant space, because Haskell only produces list nodes when needed, and garbage collects ones it's finished with.
I'd like this to work with infinite trees.
Below is a rather silly program that generates an infinite binary tree with nodes representing the natural numbers.
I've then written a function that does a depth first traversal of this tree, spitting out the nodes at a particular level.
Then I've done a quick sum of the nodes divisible by 5.
In theory, this algorithm could be implemented in O(n) space for a depth-n tree of O(2^n) nodes: just generate the tree on the fly, discarding the nodes you've finished processing.
Haskell does generate the tree on the fly, but it doesn't seem to garbage collect the nodes.
Below is the code, I'd like to see code with a similar effect but that doesn't require O(2^n) space.
import Data.List (foldl')
data Tree = Tree Int Tree Tree
tree n = Tree n (tree (2 * n)) (tree (2 * n + 1))
treeOne = tree 1
depthNTree n x = go n x id [] where
go :: Int -> Tree -> ([Int] -> [Int]) -> [Int] -> [Int]
go 0 (Tree x _ _) acc rest = acc (x:rest)
go n (Tree _ left right) acc rest = t2 rest where
t1 = go (n - 1) left acc
t2 = go (n - 1) right t1
main = do
x <- getLine
print . foldl' (+) 0 . filter (\x -> x `rem` 5 == 0) $ depthNTree (read x) treeOne
Your depthNTree uses O(2^n) space because you keep the left subtree alive through t1 while you're traversing the right subtree. A necessary condition for incrementally garbage-collected traversals is that the recursive call on the right subtree contain no reference to the left.
The naive version works acceptably in this example:
depthNTree n t = go n t where
go 0 (Tree x _ _) = [x]
go n (Tree _ l r) = go (n - 1) l ++ go (n - 1) r
Now main with input 24 uses 2 MB of space, while the original version used 1820 MB. The optimal solution here is similar to the above, except it uses difference lists:
depthNTree n t = go n t [] where
go 0 (Tree x _ _) = (x:)
go n (Tree _ l r) = go (n - 1) l . go (n - 1) r
This isn't much faster than the plain list version in many cases, because with tree-depths around 20-30 the left nesting of ++ isn't very costly. The difference becomes more pronounced if we use large tree depths:
print $ sum $ take 10 $ depthNTree 1000000 treeOne
On my computer, this runs in 0.25 secs with difference lists and 1.6 secs with lists.
Easier to show than to explain. I have this tiny function to do base conversion from base 10:
demode 0 _ = []
demode n b = m:(demode d b)
where (d, m) = divMod n b
So, if we want to see how we would write 28 in base 9, demode 28 9 = [1,3].
But, of course, we then have to invert the list so that it reads 31.
This could easily be done by writing a function that calls demode and then reverses its result, but with Haskell being so cool and all, there's probably a more elegant way of saying "in the base case (demode 0 _), append everything to a list and then reverse the list".
Note that base conversion is just an example I'm using to illustrate the question, the real question is how to apply a final transformation to the last result of a recursive function.
Nope. Your only hope is to use a helper function. Note that Haskell does allow you to define functions in where clauses (at least for now), so that doesn't have to be a 'separate function' in the sense of a separate top-level definition. You have basically two choices:
Add an accumulator and do whatever work you want to do in the end:
demode n b = w n [] where
w 0 xn = reverse xn
w n xn = w d (xn ++ [m]) where
(d, m) = divMod n b
Hopefully you can follow how that would work, but note that, in this case, you are far better off saying
demode n b = w n [] where
w 0 xn = xn
w n xn = w d (m : xn) where
(d, m) = divMod n b
which builds the list in the reversed (i.e. desired) order as it goes, so no final reverse is needed.
Push the regular definition down to a helper function, and wrap that function in whatever work you want:
demode n b = reverse (w n) where
w 0 = []
w n = m : w d where
(d, m) = divMod n b
(I've used the term w as a short-hand for 'worker' in all three examples).
Either case can generally benefit from learning to do your recursions using higher-order functions, instead.
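For instance, here is a sketch using Data.List.unfoldr, which packages up exactly this "produce one digit per step" recursion:
import Data.List (unfoldr)
-- Digits of n in base b, most significant digit first.
demode :: Int -> Int -> [Int]
demode n b = reverse (unfoldr step n)
  where
    step 0 = Nothing                                 -- no digits left
    step k = let (d, m) = divMod k b in Just (m, d)  -- emit m, continue with d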
In general, it's somewhat bad style in Haskell to try to do 'everything in one function'; Haskell style is built around dividing a problem into multiple parts, solving those with separate functions, and composing the resulting functions together; especially if those functions will be useful elsewhere as well (which happens more often than you might naively expect).
This question already has answers here: When is memoization automatic in GHC Haskell?
I have to generate a tree whose branches represent various sequences of choices. I have three rows (front, middle and back) and a set number of items that can enter each row. Each node in the tree represents an order for items to enter. I am using Data.Tree.unfoldTree as follows:
import Data.Tree as Tree
import Data.Sequence as Seq
data RowType = Front | Middle | Back
data ChoiceTreeBuildLabel = Label (Seq.Seq RowType) Int Int Int
choiceTreeBuilder :: ChoiceTreeBuildLabel ->
(Seq.Seq RowType, [ChoiceTreeBuildLabel])
choiceTreeBuilder (Label rt f m b) = (rt, concat [ff,mf,bf])
where
ff = if f == 0 then [] else [Label (Front <| rt) (f-1) m b]
mf = if m == 0 then [] else [Label (Middle <| rt) f (m-1) b]
bf = if b == 0 then [] else [Label (Back <| rt) f m (b-1)]
choiceTree f m b = Tree.unfoldTree choiceTreeBuilder (Label Seq.empty f m b)
However, when evaluating length . last . Tree.levels $ choiceTree 3 4 5, for example, the value does not seem to be stored; the function seems to generate a new tree every time, regardless of whether it has already been evaluated. I may be wrong, but I was under the impression that Haskell should only need to compute this once, and indeed that is what I had intended. Can someone help me out?
Haskell does not memoize function application in general! Otherwise, imagine how much memory the runtime would use if every application of anything got memoized. Not to mention: how would it even "know" that you were calling with the "same" value if you did not provide some appropriate notion of "sameness" yourself?
You can "purely" memoize a function either by hand, or with an appropriate helper package. Two common ones people tend to use are MemoTrie (http://hackage.haskell.org/package/MemoTrie) and memocombinators (http://hackage.haskell.org/package/data-memocombinators)
A more complex but similar notion is provided by the Representable Trie library (http://hackage.haskell.org/package/representable-tries-3.0.2/docs/Data-Functor-Representable-Trie.html)
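Memoizing "by hand" usually means binding a lazily built structure to a name so that it is shared. A classic sketch (using fib rather than the question's tree):
-- fibs is a shared top-level value (a CAF), so each entry is computed
-- at most once across all calls to fib.
fib :: Int -> Integer
fib = (fibs !!)
  where
    fibs = map f [0 ..]
    f 0 = 0
    f 1 = 1
    f n = fib (n - 1) + fib (n - 2)
In the question's case, the cheapest fix is simpler still: bind choiceTree 3 4 5 to a name (e.g. with let) and reuse that one value, so the lazily built tree is shared instead of being rebuilt on every call.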