Finding the leaves of an inductively-defined tree - haskell

So, I have a function of type:
genTree :: Node -> [Nodes]
Given a node, this function generates the set of children of that node in a tree. The function can be applied again to those children to generate their children, until it eventually generates a node with no children, i.e. a node for which genTree returns [].
What I'm trying to do is, given a starting node, generate the list of all leaf nodes in the tree that has it as the root.
Any advice?

The function from Martijn's answer generates a list of all nodes in the tree. You can use this list and filter out the nodes without children to get the leaves:
nodes root = root : concatMap nodes (genTree root)
leaves root = filter (null . genTree) (nodes root)
You can also combine these two functions into one to directly generate just a list of leaves, if you prefer:
leaves node
| null children = [node]
| otherwise = concatMap leaves children
where children = genTree node

Let's generalize it a bit:
leaves :: (a -> [a]) -> a -> [a]
leaves tree x = case (tree x) of
[] -> [x]
-- the node x has no children and is therefore a leaf
xs -> concatMap (leaves tree) xs
-- otherwise get list of all leaves for each child and concatenate them
Applying static argument transformation (http://hackage.haskell.org/trac/ghc/ticket/888), we get
leaves :: (a -> [a]) -> a -> [a]
leaves tree x = leaves' x where
leaves' x = case (tree x) of
[] -> [x]
xs -> concatMap leaves' xs
Use it as
leaves genTree root
or if you really want it to work only with genTree, inline it into the definition:
leaves1 root = case (genTree x) of
[] -> [x]
xs -> concatMap leaves1 xs
which is morally equivalent to sth's second answer.

(not exactly an answer to the question, but related)
I like to represent trees of a as "ListT [] a". (ListT from the List package in hackage)
Then the answer for this question is just to use the function lastL.
"Monad m => ListT m a" is a monadic list containing "a"s, where trying to get the next list item (which may find out there is no such item) is a monadic action in "m".
A usage example for ListT - a program that reads numbers from the user until the user does not type a number and prints the sum of numbers after each input:
main =
execute . joinM . fmap print .
scanl (+) 0 .
fmap (fst . head) .
takeWhile (not . null) .
fmap reads .
joinM $ (repeat getLine :: ListT IO (IO String))
Where repeat, scanl and takeWhile are from Data.List.Class. They work both for regular lists and monadic lists.
joinM :: List l => l (ItemM l a) -> l a -- (l = ListT IO, ItemM l = IO)
execute :: List l => l a -> ItemM l () -- consume the whole list and run its actions
If you are familiar with Python, python iterators/generators are "ListT IO"s.
When using [] instead of IO as the monad of the monadic list, the result is a tree. Why? Imagine a list where getting the next item is an action in the list monad - the list monad means there are several options, therefore there are several "next items", which makes it a tree.
You can construct monadic lists either with higher-order functions (like the example above), or with cons, or with a python-generator notation (with yield) using the GeneratorT monad transformer from the generator package in hackage.
Disclaimer: ListT and GeneratorT are in no way widely used. I wrote those and I am not aware of any other users except for myself. There are several of users of equivalent ListTs, such as the one from the Haskell wiki, NondetT, and others.

flatten node = node : concatMap flatten (genTree node)

Related

Tree traversal inorder tail recursion

Did I implement inorder level-order tree transversal using tail-recursion correctly?
inorder (Leaf n) temp = n:temp
inorder (Node (n, left, right)) temp = inorder left (n:inorder right temp)
inorder :: Tree a -> [a] -> [a]
Tree is declared as
data Tree a = Leaf a | Node (a, Tree a, Tree a) deriving Show
and returns
[2,1,3] on call inorder three [] where three = Node (1, Leaf 2, Leaf 3)
This technically isn't tail recursive because you have a recursive call inorder right temp in a nontail position. One way to fix this would be with continuations. You write a function which takes an accumulator like before, but rather than the accumulator being just a list it's actually a function representing the work left to do in the computation. This means that instead of making a non-tail call and just returning, we can always tail call because the context we need is saved to the continuation.
inorder = go id
where go :: ([a] -> r) -> Tree a -> r
go k Leaf = k []
go k (Node a l r) = go l (\ls -> go r (\rs -> k $ ls ++ n : rs))
Here every call is a tail call as required but it's quite innefficient because it requires a ++ operation at every level, pushing us into quadratic costs. A more efficient algorithm would avoid building up an explicit list and instead build up a difference list, delaying the construction on the concrete structure and giving a more efficient algorithm
type Diff a = [a] -> [a] -- A difference list is just a function
nil :: Diff a
nil xs = xs
cons :: a -> Diff a -> Diff a
cons a d = (:) a . d
append :: Diff a -> Diff a -> Diff a
append xs ys = xs . ys
toList :: Diff a -> a
toList xs = xs []
Note that all of these operations are O(1) except for toList which is O(n) in the number of entries. The important point here is that diff lists are cheap and easy to append so we'll construct these in our algorithm and construct the concrete list at the very end
inorder = go toList
where go :: (Diff a -> r) -> Tree a -> r
go k Leaf = k nil
go k (Node a l r) =
go l (\ls -> go r (\rs -> k $ ls `append` cons n rs))
And now, through gratuitous application of functions we've gotten a completely unidiomatic Haskell program. You see in Haskell we don't really care about tail calls because we generally want to handle infinite structures correctly and that's not really possible if we demand everything be tail recursive. In fact, I would say that while not tail recursive, the code you originally had is the most idiomatic, that's even how it's implemented in Data.Set! It has the property that we can lazily consume the result of that toList and it will work with us and lazily process the tree. So in your implementation, something like
min :: Tree a -> a
min = listToMaybe . toList
is going to be pretty darn close to how you would implement it by hand efficiency wise! It will not construct traverse the whole tree first like my version will have to. These sort of compositional effects of laziness pay more dividends in real Haskell code than syntactically making our code use only tail calls (which does nothing to actually guarantee space usage anyways).

Linking in tree structures

Upon working with long strings now, I came across a rather big problem in creating suffix trees in Haskell.
Some constructing algorithms (as this version of Ukkonen's algorithm) require establishing links between nodes. These links "point" on a node in the tree. In imperative languages, such as Java, C#, etc. this is no problem because of reference types.
Are there ways of emulating this behaviour in Haskell? Or is there a completely different alternative?
You can use a value that isn't determined until the result of a computation in the construction of data in the computation by tying a recursive knot.
The following computation builds a list of values that each hold the total number of items in the list even though the total is computed by the same function that's building the list. The let binding in zipCount passes one of the results of zipWithAndCount as the first argument to zipWithAndCount.
zipCount :: [a] -> [(a, Int)]
zipCount xs =
let (count, zipped) = zipWithAndCount count xs
in zipped
zipWithAndCount :: Num n => b -> [a] -> (n, [(a, b)])
zipWithAndCount y [] = (0, [])
zipWithAndCount y (x:xs) =
let (count', zipped') = zipWithAndCount y xs
in (count' + 1, (x, y):zipped')
Running this example makes a list where each item holds the count of the total items in the list
> zipCount ['a'..'e']
[('a',5),('b',5),('c',5),('d',5),('e',5)]
This idea can be applied to Ukkonen's algorithm by passing in the #s that aren't known until the entire result is known.
The general idea of recursively passing a result into a function is called a least fixed point, and is implemented in Data.Function by
fix :: (a -> a) -> a
fix f = let x = f x in x
We can write zipCount in points-free style in terms of zipWithAndCount and fix.
import Data.Function
zipCount :: [a] -> [(a, Int)]
zipCount = snd . fix . (. fst) . flip zipWithAndCount

List based on right Kan extension

In the ``Kan Extensions for Program Optimisation'' by Ralf Hinze there is the definition of List type based on right Kan extension of the forgetful functor from the category of monoids along itself (section 7.4). The paper gives Haskell implementation as follows:
newtype List a = Abstr {
apply :: forall z . (Monoid z) => (a -> z) -> z
}
I was able to define usual nil and cons constructors:
nil :: List a
nil = Abstr (\f -> mempty)
cons :: a -> List a -> List a
cons x (Abstr app) = Abstr (\f -> mappend (f x) (app f))
With the following instance of Monoid class for Maybe functor, I managed to define head function:
instance Monoid (Maybe a) where
mempty = Nothing
mappend Nothing m = m
mappend (Just a) m = Just a
head :: List a -> Maybe a
head (Abstr app) = app Just
Question: How can one define tail function?
Here is a rather principled solution to implementing head and tail in one go (full gist):
First of all, we know how to append lists (it will be useful later on):
append :: List a -> List a -> List a
append (Abstr xs) (Abstr ys) = Abstr (\ f -> xs f <> ys f)
Then we introduce a new type Split which we will use to detect whether a List is empty or not (and get, in the case it's non empty, a head and a tail):
newtype Split a = Split { outSplit :: Maybe (a, List a) }
This new type forms a monoid: indeed we know how to append two lists.
instance Monoid (Split a) where
mempty = Split Nothing
mappend (Split Nothing) (Split nns) = Split nns
mappend (Split mms) (Split Nothing) = Split mms
mappend (Split (Just (m, ms))) (Split (Just (n, ns))) =
Split $ Just (m, append ms (cons n ns))
Which means that we can get a function from List a to Split a using List a's apply:
split :: List a -> Split a
split xs = apply xs $ \ a -> Split $ Just (a, nil)
head and tail can finally be trivially derived from split:
head :: List a -> Maybe a
head = fmap fst . outSplit . split
tail :: List a -> Maybe (List a)
tail = fmap snd . outSplit . split
This implementation of lists as free monoids is provided in the package fmlist, which notes some interesting properties of it (unlike most implementations of lists, which are right-biased, this one is truly unbiased; you can make an arbitrary tree, and although of course the monoid laws force you to see it as flattened, you can still observe some differences in the infinite case. This is almost a Haskell quirk -- usually, free monoids). It also has an implementation of tail, so that's sort of an answer to your question (but see below).
With these sorts of representations (not just this particular one one, but also e.g. forall r. (a -> r -> r) -> r -> r lists), there are usually some operations (e.g. appending) that become easier, and some (e.g. zip and tail) that become more difficult. This is discussed a bit in various places, e.g. How to take the tail of a functional stream.
Looking more closely at fmlist, though, its solution is pretty unsatisfactory: It just converts the nice balanced tree that you give it to a right-biased list using foldr, which allows it to do regular list operations, but loses the monoidal structure. The tail of a "middle-infinite" list is no longer "middle-infinite", it's just right-infinite like a regular list.
It should be possible to come up with a clever Monoid instance to compute the tail while disturbing the rest of the structure as little as possible, but an obvious one doesn't come to mind off-hand. I can think of a non-clever "brute force" solution, though: Cheat and reify the "list" into a tree using an invalid Monoid instance, inspect the tree, and then fold it back up so the end result is valid. Here's what it would look like with my nonfree package and fmlist:
nail :: FM.FMList a -> FM.FMList a
nail (FM.FM k) = FM.FM $ \f -> foldMap f (nail' (k N))
nail' :: N a -> N a
nail' NEmpty = error "nail' NEmpty"
nail' (N x) = NEmpty
nail' (NAppend l r) =
case normalize l of
NEmpty -> nail' r
N x -> r
l' -> NAppend (nail' l') r
-- Normalize a tree so that the left side of a root NAppend isn't an empty
-- subtree of any shape. If the tree is infinite in a particular way, this
-- won't terminate, so in that sense taking the tail of a list can make it
-- slightly worse (but you were already in pretty bad shape as far as
-- operations on the left side are concerned, and this is a pathological case
-- anyway).
normalize :: N a -> N a
normalize (NAppend l r) =
case normalize l of
NEmpty -> normalize r
l' -> NAppend l' r
normalize n = n

Building a BST from a depth-first preorder list in Haskell more idiomatically

This submission to Programming Praxis gives an O(n) function that "undoes" a preorder traversal of a binary search tree, converting a list back into a tree. Supplying the missing data declaration:
data Tree a = Leaf | Branch {value::a, left::Tree a, right:: Tree a}
deriving (Eq, Show)
fromPreOrder :: Ord a => [a] -> Tree a
fromPreOrder [] = Leaf
fromPreOrder (a:as) = Branch a l (fromPreOrder bs)
where
(l,bs) = lessThan a as
lessThan n [] = (Leaf,[])
lessThan n all#(a:as)
| a >= n = (Leaf,all)
| otherwise = (Branch a l r,cs)
where (l,bs) = lessThan a as
(r,cs) = lessThan n bs
It's obvious that one constructor is added to the tree in each recursive step, which is key to its efficiency.
The only "problem" is that the list is threaded through the computation manually, which is not a terribly Haskellian way to do it and makes it a little harder to see that it is actually consumed element by element in a single-threaded manner.
I attempted to correct this using a state monad (prettified on Codepad):
import Control.Monad.State
data Tree a = Leaf
| Branch {root::a, left::Tree a, right::Tree a}
deriving (Eq,Show)
peek = State peek' where
peek' [] = (Nothing,[])
peek' a#(x:_) = (Just x,a)
pop = State pop' where
pop' [] = error "Tried to read past the end of the list"
pop' (_:xs) = ((),xs)
prebuild'::Ord a => State [a] (Tree a)
prebuild' = do
next <- peek
case next of
Nothing -> return Leaf
Just x -> do
pop
leftpart <- lessThan x
rightpart <- prebuild'
return (Branch x leftpart rightpart)
lessThan n = do
next <- peek
case next of
Nothing -> return Leaf
Just x -> do
if x < n
then do
pop
leftpart <- lessThan x
rightpart <- lessThan n
return (Branch x leftpart rightpart)
else
return Leaf
prebuild::Ord a => [a] -> Tree a
prebuild = evalState prebuild'
Unfortunately, this just looks obscenely messy, and doesn't seem any easier to reason about.
One thought I haven't been able to get anywhere with yet (in part because I don't have a deep enough understanding of the underlying concepts, quite likely): could I use a left fold over the list that builds a continuation that ultimately produces the tree? Would that be possible? Also, would it be anything short of insane?
Another thought was to write this as a tree unfold, but I don't think it's possible to do that efficiently; the list will end up being traversed too many times and the program will be O(n^2).
Edit
Taking things from another direction, I have the nagging suspicion that it might be possible to come up with an algorithm that starts by splitting up the list into increasing segments and decreasing segments, but I haven't yet found something concrete to do with that idea.
I think the problem you're having with State is that your primitives (push, pop, peek) are not the right ones. I think a better one would be something like available_, which checks if the front of the stack matches a particular condition, and executes something different in each case:
available_ p f m = do
s <- get
case s of
x:xs | p x -> put xs >> f x
_ -> m
Actually, in our use case, we can specialize a bit: we will always want to return a Leaf when the head of our stack doesn't satisfy the condition, and we'll always want to recurse when it does.
available p m = available_ p
(\x -> liftM2 (Branch x) (lessThan' x) m)
(return Leaf)
(You could also just write available to begin with and skip available_ entirely. In my first iteration, that is what I did.) Now writing fromPreOrder and lessThan are a snap, and also I think give some insight into their behavior. I'll name them with primes so we can double-check they do the right thing with QuickCheck.
fromPreOrder' = available (const True) fromPreOrder'
lessThan' n = available (<n) (lessThan' n)
And in ghci:
> quickCheck (\xs -> fromPreOrder (xs :: [Int]) == evalState fromPreOrder' xs)
+++ OK, passed 100 tests.
While I can't answer the question about continuation passing, I believe that the State monad based implementation can be written much more clearly. First, we can use notational convenience such as those from Control.Applicative to make it easier to read. Second, we can upgrade the effect stack to include Maybe in order to capture the notion of failure (a) from taking the head of an empty list and (b) from the (a >= n) comparison as an effect.
import Control.Monad.State
import Control.Applicative
The final code uses the backtracking-state monad transformer stack. This means that we wrap State around Maybe instead of Maybe around State. In some sense we can think of this as meaning that failure is the "primary" effect. In practice it means that if the algorithm fails there's no way to continue using potentially bad state and so it must backtrack to the last known good state.
type Preord a b = StateT [a] Maybe b
Since we keep taking the head of a list and want to capture that failure correctly, we'll use a "safe head" function (which is the natural destructor of a list anyway, despite not being in the base Haskell libraries)
-- Safe list destructor
uncons :: [a] -> Maybe (a, [a])
uncons [] = Nothing
uncons (a:as) = Just (a, as)
If we look at it cleverly we'll notice that this is already exactly the form of our monadic computation (StateT [a] Maybe b is isomorphic to [a] -> Maybe (b, [a])). We'll give it a more evocative name when lifted into the Monad.
-- Try to get the head or fail
getHead :: Preord a a
getHead = StateT uncons
A common feature of this algorithm is stopping local failures by providing a default value. I'll capture this in the certain combinator
-- Provides a default value for a failing computation
certain :: b -> Preord a b -> Preord a b
certain def p = p <|> return def
And now we can write the final algorithm very cleanly in our Preord monad.
fromPreOrder :: Ord a => Preord a (Tree a)
fromPreOrder = certain Leaf $ do
a <- getHead
Branch a <$> lessThan a <*> fromPreOrder
lessThan :: Ord a => a -> Preord a (Tree a)
lessThan n = certain Leaf $ do
a <- getHead
guard (a < n)
Branch a <$> lessThan a <*> lessThan n
Note that Applicative style helps to indicate that we're building the components of the Branch constructor using further effectful (state consuming) computations. The guard short-circuits lessThan when the pivot is already the least element in the pre-order traversal. We also explicitly see how both fromPreOrder and lessThan default out to Leaf when they cannot compute a better result.
(Also note that fromPreOrder and lessThan are nearly identical now, a commonality Daniel Wagner exploited in his own answer when writing available.)
We finally would want to hide all the monadic noise since, to an outside user, this is just a pure algorithm.
rebuildTree :: [a] -> Tree a
rebuildTree = fromMaybe Leaf . runStateT fromPreOrder
For a complete picture, here's the implementation of the algorithm using only the State monad. Note all the extra noise for handling failure! We've absorbed the entire popElse function into the effects of the backtracking state monad. We also lift the if up into the failure effect. Without that effect stack, our combinators are terrifically specific to the application instead of decomplected and useful elsewhere.
-- Try to take the head of the state list and return the default
-- if that's not possible.
popElse :: b -> (a -> State [a] b) -> State [a] b
popElse def go = do
x <- get
case x of
[] -> return def
(a:as) -> put as >> go a
push :: a -> State [a] ()
push a = modify (a:)
fromPreOrder :: Ord a => State [a] (Tree a)
fromPreOrder = popElse Leaf $ \a -> Branch a <$> lessThan a <*> fromPreOrder
lessThan :: Ord a => a -> State [a] (Tree a)
lessThan n =
popElse Leaf $ \a ->
if a >= n
then push a >> return Leaf
else Branch a <$> lessThan a <*> lessThan n
As you've said, the state monad doesn't really improve the situation, and I don't think it can be expected to, as it's both much too general in that it allows arbitrary access to the state, and annoying in that it enforces unnecessary sequencing.
At first glance, this looks quite like a foldr : we do one thing for the empty case, and in the (:) case we take the head off and make a recursive call based on the tail. However, as the recursive call isn't just using the tail directly, it isn't quite a foldr.
We could express it as a paramorphism but I don't think that really adds anything to the readability.
What I did notice is that the complicated recursion on the tail is all based on lessThan, which led me to the following idea for breaking down the algorithm:
lessThans [] = []
lessThans (a:as) = (a, l) : lessThans bs
where (l, bs) = lessThan a as
fromPreOrder2 :: Ord a => [a] -> Tree a
fromPreOrder2 = foldr (\(a, l) r -> Branch a l r) Leaf . lessThans
I'm sure lessThans could have a better name but I'm not quite sure what!
The foldr can also be expressed as foldr (uncurry Branch) Leaf but I'm not sure if that's an improvement.
EDIT: also, lessThans is an unfoldr, leading to this version:
fromPreOrder3 :: Ord a => [a] -> Tree a
fromPreOrder3 = foldr (uncurry Branch) Leaf . unfoldr lessThanList
lessThanList [] = Nothing
lessThanList (a:as) = Just ((a, l), bs)
where (l, bs) = lessThan a as

How can I iterate over a string without recursion?

isTogether' :: String -> Bool
isTogether' (x:xs) = isTogether (head xs) (head (tail xs))
For the above code, I want to go through every character in the string. I am not allowed to use recursion.
isTogether' (x:xs) = isTogether (head xs) (head (tail xs))
If I've got it right, you are interested in getting consequential char pairs from some string. So, for example, for abcd you need to test (a,b), (b,c), (c,d) with some (Char,Char) -> Bool or Char -> Char -> Bool function.
Zip could be helpful here:
> let x = "abcd"
> let pairs = zip x (tail x)
it :: [(Char, Char)]
And for some f :: Char -> Char -> Bool function we can get uncurry f :: (Char, Char) -> Bool.
And then it's easy to get [Bool] value of results with map (uncurry f) pairs :: [Bool].
In Haskell, a String is just a list of characters ([Char]). Thus, all of the normal higher-order list functions like map work on strings. So you can use whichever higher-order function is most applicable to your problem.
Note that these functions themselves are defined recursively; in fact, there is no way to go through the entire list in Haskell without either recursing explicitly or using a function that directly or indirectly recurses.
To do this without recursion, you will need to use a higher order function or a list comprehension. I don't understand what you're trying to accomplish so I can only give generic advice. You probably will want one of these:
map :: (a -> b) -> [a] -> [b]
Map converts a list of one type into another. Using map lets you perform the same action on every element of the list, given a function that operates on the kinds of things you have in the list.
filter :: (a -> Bool) -> [a] -> [a]
Filter takes a list and a predicate, and gives you a new list with only the elements that satisfy the predicate. Just with these two tools, you can do some pretty interesting things:
import Data.Char
map toUpper (filter isLower "A quick test") -- => "QUICKTEST"
Then you have folds of various sorts. A fold is really a generic higher order function for doing recursion on some type, so using it takes a bit of getting used to, but you can accomplish pretty much any recursive function on a list with a fold instead. The basic type of foldr looks like this:
foldr :: (a -> b -> b) -> b -> [a] -> b
It takes three arguments: an inductive step, a base case and a value you want to fold. Or, in less mathematical terms, you could think of it as taking an initial state, a function to take the next item and the previous state to produce the next state, and the list of values. It then returns the final state it arrived at. You can do some pretty surprising things with fold, but let's say you want to detect if a list has a run of two or more of the same item. This would be hard to express with map and filter (impossible?), but it's easy with recursion:
hasTwins :: (Eq a) => [a] -> Bool
hasTwins (x:y:xs) | x == y = True
hasTwins (x:y:xs) | otherwise = hasTwins (y:xs)
hasTwins _ = False
Well, you can express this with a fold like so:
hasTwins :: (Eq a) => [a] -> Bool
hasTwins (x:xs) = snd $ foldr step (x, False) xs
where
step x (prev, seenTwins) = (x, prev == x || seenTwins)
So my "state" in this fold is the previous value and whether we've already seen a pair of identical values. The function has no explicit recursion, but my step function passes the current x value along to the next invocation through the state as the previous value. But you don't have to be happy with the last state you have; this function takes the second value out of the state and returns that as the overall return value—which is the boolean whether or not we've seen two identical values next to each other.

Resources