Working with Trees in Haskell - haskell

I have this data definition for a tree:
data Tree = Leaf Int | Node Tree Int Tree
and I have to make a function, nSatisfy, to check how many items of the tree check some predicate.
Here's what I've done:
nSatisfy :: (Int->Bool) -> Tree -> Int
nSatisfy _ Leaf = 0
nSatisfy y (Node left x right)
  | y x       = 1 + nSatisfy y (Node left x right)
  | otherwise = nSatisfy y (Node left x right)
Is this the right way to solve this problem?

In your nSatisfy function, you should add the number of nodes satisfying the condition in both subtrees with two recursive calls. The last two lines should be like this:
  | y x       = 1 + nSatisfy y left + nSatisfy y right
  | otherwise = nSatisfy y left + nSatisfy y right
This way, it no longer calls itself on the same node (which would never terminate) but only on the two subtrees.
Also, if a leaf contains an integer, as is implied in the data declaration, you should make it evaluate the condition for a leaf and return 1 if it is true, instead of always returning 0.
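Putting both suggestions together, a minimal sketch of the corrected function could look like the following. It keeps the data declaration from the question; the sample tree is a hypothetical value used only for illustration.

```haskell
data Tree = Leaf Int | Node Tree Int Tree

-- Count the values in the tree (leaves included) that satisfy the predicate.
nSatisfy :: (Int -> Bool) -> Tree -> Int
nSatisfy p (Leaf x)
  | p x       = 1
  | otherwise = 0
nSatisfy p (Node left x right)
  | p x       = 1 + nSatisfy p left + nSatisfy p right
  | otherwise = nSatisfy p left + nSatisfy p right

-- Hypothetical sample tree, not taken from the question.
sample :: Tree
sample = Node (Leaf 1) 2 (Node (Leaf 3) 4 (Leaf 5))
```

With this definition, nSatisfy even sample counts the values 2 and 4.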

In addition to the main answer, I'd like to offer a slightly different approach: generalizing your problem and solving it with existing libraries.
The operation you're seeking is common to many data structures: go through all elements and perform some operation on them. Haskell defines the Foldable type class, which can be implemented by structures like yours.
First let's import some modules we'll need:
import Data.Foldable
import Data.Monoid
In order to use Foldable, we need to generalize the structure a bit, in particular parametrize its content:
data Tree a = Leaf a | Node (Tree a) a (Tree a)
In many cases this is a good idea as it separates the structure from its content and allows it to be easily reused.
Now let's define its Foldable instance. For tree-like structures it's easier to define it using foldMap, which maps each element into a monoid and then combines all values:
instance Foldable Tree where
  foldMap f (Leaf x) = f x
  foldMap f (Node lt x rt) = foldMap f lt <> f x <> foldMap f rt
This immediately gives us the whole library of functions in the Data.Foldable module, such as searching for an element, different kinds of folds, etc. While a function counting the number of values satisfying some predicate isn't defined there, we can easily define it for any Foldable. The idea is that we'll use the Sum:
nSatisfy :: (Foldable f) => (a -> Bool) -> f a -> Int
nSatisfy p = getSum . foldMap (\x -> Sum $ if p x then 1 else 0)
The idea behind this function is simple: Map each value to 1 if it satisfies the predicate, otherwise to 0. And then folding with the Sum monoid just adds all values up.
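As a quick check of the pieces above, here is a self-contained sketch that assembles them into one file. The sample tree is a hypothetical value; everything else follows the definitions in this answer. It assumes a reasonably recent GHC where Foldable and <> are available from the Prelude.

```haskell
import Data.Monoid (Sum (..))

data Tree a = Leaf a | Node (Tree a) a (Tree a)

instance Foldable Tree where
  foldMap f (Leaf x)       = f x
  foldMap f (Node lt x rt) = foldMap f lt <> f x <> foldMap f rt

-- Works for any Foldable, not just Tree.
nSatisfy :: Foldable f => (a -> Bool) -> f a -> Int
nSatisfy p = getSum . foldMap (\x -> Sum (if p x then 1 else 0))

-- Hypothetical sample tree, for illustration only.
sample :: Tree Int
sample = Node (Leaf 1) 2 (Node (Leaf 3) 4 (Leaf 5))
```

Because nSatisfy only needs Foldable, the same function also counts matches in a list or a Maybe, and functions such as sum, length and elem work on Tree without further code.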

Related

Reconstructing Huffman tree from (preorder) bitstring in Haskell

I have the following Haskell polymorphic data type:
data Tree a = Leaf Int a | Node Int (Tree a) (Tree a)
The tree will be compressed in a bitstring of 0s and 1s. A '0' signifies a Node and it is followed by the encoding of the left subtree, then the encoding of the right subtree. A '1' signifies a Leaf and is followed by 7 bits of information (for example it might be a char). Each node/leaf is supposed to also contain the frequency of the information stored, but this is not important for this problem (so we can put anything there).
For example, starting from this encoded tree
[0,0,0,1,1,1,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,
1,0,0,1,1,1,1,1,1,1,0,0,1,0,0,1,1,1,1,0,1,1,1,1,1,1,0,0,0,0,1]
it is supposed to give back something like this
Node 0 (Node 0 (Node 0 (Leaf 0 'k') (Leaf 0 't'))
(Node 0 (Node 0 (Leaf 0 'q') (Leaf 0 'g')) (Leaf 0 'r')))
(Node 0 (Leaf 0 'w') (Leaf 0 'a'))
(spacing is not important, but it did not fit on one line).
I have little experience working with trees, especially when implementing code. I have a vague idea about how I'd solve this on paper (using something similar to a stack to deal with the depth/levels) but I am still a bit lost.
Any help or ideas are appreciated!
Well, you're trying to parse a tree of bytes from a bit-stream. Parsing's one of those cases where it pays to set up some structure: we're going to write a miniature parser combinator library in the style of How to Replace Failure by a List of Successes, which will allow us to write our code in an idiomatic functional style and delegate a lot of the work to the machine.
Translating the old rhyme into the language of monad transformers, and reading "string" as "bit-string", we have
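{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Applicative (Alternative (..), liftA2)
import Control.Monad (guard, replicateM)
import Control.Monad.State (StateT (..), get, put)  -- from the mtl package
newtype Parser a = Parser (StateT [Bool] [] a)
  deriving (Functor, Applicative, Monad, Alternative)
runParser :: Parser a -> [Bool] -> [(a, [Bool])]
runParser (Parser m) = runStateT m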
A parser is a monadic computation which operates statefully on a stream of Booleans, yielding a collection of successfully-parsed as. GHC's GeneralizedNewtypeDeriving superpowers allow me to elide the boilerplate instances of Monad et al.
The goal, then, is to write a Parser (Tree SevenBits) - a parser which returns a tree of septuples of Booleans. (You can turn the 7 bits into a Word8 at your leisure by deriving a Functor instance for Tree and using fmap.) I'm going to use the following definition of Tree because it's simpler - I'm sure you can figure out how to adapt this code to your own ends.
data Tree a = Leaf a | Node (Tree a) (Tree a) deriving Show
type SevenBits = (Bool, Bool, Bool, Bool, Bool, Bool, Bool)
Here's a parser that attempts to consume a single bit from the input stream, failing if it's empty:
one :: Parser Bool
one = Parser $ do
  stream <- get
  case stream of
    []     -> empty
    (x:xs) -> put xs *> return x
Here's one which attempts to consume a particular bit from the input stream, failing if it doesn't match:
bit :: Bool -> Parser ()
bit b = do
  i <- one
  guard (i == b)
Here I'm pulling a sequence of seven Booleans from the input stream using replicateM and packing them into a tuple. We'll be using this to populate Leaf nodes' contents.
sevenBits :: Parser SevenBits
sevenBits = pack7 <$> replicateM 7 one
  where pack7 [a,b,c,d,e,f,g] = (a, b, c, d, e, f, g)
Now we can finally write the code which parses the tree structure itself. We'll be choosing between the Node and Leaf alternatives using <|>.
tree :: Parser (Tree SevenBits)
tree = node <|> leaf
  where node = bit False *> liftA2 Node tree tree
        leaf = bit True *> fmap Leaf sevenBits
If node succeeds in parsing a low bit from the head of the stream, it continues to recursively parse the encoding of the left subtree followed by the right subtree, sequencing the applicative actions with liftA2. The trick is that node fails if it doesn't encounter a low bit at the head of the input stream, which tells <|> to give up on node and try leaf instead.
Note how the structure of tree reflects the structure of the Tree type itself. This is applicative parsing at work. We could alternately have structured this parser monadically, first using one to parse an arbitrary bit and then using a case analysis on the bit to determine whether we should continue to parse a pair of trees or a leaf. In my opinion this version is simpler, more declarative, and less verbose.
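For comparison, the monadic alternative just described might be sketched as follows. To keep the snippet dependent on base only, it uses a hand-rolled list-of-successes Parser instead of the StateT wrapper (that substitution is an assumption of this sketch), stores each leaf payload as a plain [Bool] rather than SevenBits, and uses the hypothetical name treeM.

```haskell
import Control.Monad (replicateM)

-- Hand-rolled stand-in for the StateT-based Parser (base only).
newtype Parser a = Parser { runParser :: [Bool] -> [(a, [Bool])] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [(f a, rest) | (a, rest) <- p s]

instance Applicative Parser where
  pure x = Parser $ \s -> [(x, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [(f a, s2) | (f, s1) <- pf s, (a, s2) <- pa s1]

instance Monad Parser where
  Parser p >>= k =
    Parser $ \s -> concat [runParser (k a) rest | (a, rest) <- p s]

data Tree a = Leaf a | Node (Tree a) (Tree a) deriving (Eq, Show)

-- Consume one bit, failing (empty result list) on empty input.
one :: Parser Bool
one = Parser $ \s -> case s of
  []       -> []
  (x : xs) -> [(x, xs)]

-- Read the tag bit first, then branch on it.
treeM :: Parser (Tree [Bool])
treeM = do
  isLeaf <- one
  if isLeaf
    then Leaf <$> replicateM 7 one
    else Node <$> treeM <*> treeM
```

The branching on the tag bit is explicit here, which is exactly the case analysis the applicative version avoids.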
Also compare the clarity of this code to the low-level style of @behzad.nouri's foldr-based solution. Rather than building an explicit finite-state machine which switches between parsing nodes and leaves - an imperative-flavoured idea - my design allows you to declaratively describe the grammar to the machine using standard functions like liftA2 and <|> and trust that the abstractions will do the right thing.
Anyway, here I'm parsing a simple tree consisting of a pair of Leafs containing the (binary-encoded) numbers 0 and 1. As you can see, it returns the single successful parse and an empty stream of remaining bits.
ghci> runParser tree $ map (>0) [0, 1, 0,0,0,0,0,0,0, 1, 0,0,0,0,0,0,1]
[(Node (Leaf (False, False, False, False, False, False, False)) (Leaf (False, False, False, False, False, False, True)),[])]
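The conversion of a leaf's seven bits into something more useful is left open above. One possible sketch turns them into a Char, assuming the bits are ordered most significant first (an assumption that happens to match the 'k' leaf in the question's example); sevenBitsToChar is a hypothetical name.

```haskell
import Data.Char (chr)

type SevenBits = (Bool, Bool, Bool, Bool, Bool, Bool, Bool)

-- Interpret seven bits, most significant first, as a character code (0..127).
sevenBitsToChar :: SevenBits -> Char
sevenBitsToChar (a, b, c, d, e, f, g) =
  chr (foldl (\acc bit -> acc * 2 + fromEnum bit) 0 [a, b, c, d, e, f, g])
```

With a Functor instance for Tree, mapping this function over the parsed tree with fmap would give a Tree Char.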
Ok, here's a simple (ad-hoc, but easier to understand) way.
We need to build a function parse, with the following type:
parse :: [Int] -> Tree Char
The approach you mentioned, with stacks, is the imperative one. Here we just rely on the recursive calls. The stack will be built by the compiler and will simply hold each recursive call (at least you can imagine it that way if you want, or just ignore this whole paragraph).
So, the idea is the following: whenever you find a 0, you need to make two recursive calls to the algorithm. The first recursive call will read one branch (the left one) of the tree. The second one needs to be called with the rest of the list as its argument, that is, whatever the first recursive call left unread. So we need an auxiliary function parse' with the following type (it now returns a pair whose second component is the rest of the list):
parse' :: [Int] -> (Tree Char, [Int])
Next, you can see a piece of code where the 0 case is just as described before.
For the 1 case, we just need to take the next 7 numbers and make them into a char somehow (I leave the definition of toChar for you), then, just return a Leaf and the rest of the list.
parse' (0:xs) = let (l, xs')  = parse' xs
                    (r, xs'') = parse' xs'
                in (Node 0 l r, xs'')  -- for the outermost call, xs'' should be []
parse' (1:xs) = let w = toChar (take 7 xs) in (Leaf 0 w, drop 7 xs)
Finally, our parse function just calls the auxiliary parse' function and returns the first element of the pair.
parse xs = fst $ parse' xs
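For completeness, here is a runnable sketch of the whole approach. The definition of toChar is an assumption (seven bits, most significant first, read as a character code), and the catch-all error clause is an addition for malformed input; encoded is the bit list from the question.

```haskell
import Data.Char (chr)

data Tree a = Leaf Int a | Node Int (Tree a) (Tree a) deriving (Eq, Show)

-- Assumed encoding: seven bits, most significant first, giving a character code.
toChar :: [Int] -> Char
toChar = chr . foldl (\acc b -> acc * 2 + b) 0

-- Parse one tree from the front of the list and return the unread rest.
parse' :: [Int] -> (Tree Char, [Int])
parse' (0 : xs) =
  let (l, xs')  = parse' xs
      (r, xs'') = parse' xs'
  in  (Node 0 l r, xs'')
parse' (1 : xs) = (Leaf 0 (toChar (take 7 xs)), drop 7 xs)
parse' _        = error "parse': malformed or truncated input"

parse :: [Int] -> Tree Char
parse = fst . parse'

-- The encoded tree from the question.
encoded :: [Int]
encoded =
  [0,0,0,1,1,1,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,
   1,0,0,1,1,1,1,1,1,1,0,0,1,0,0,1,1,1,1,0,1,1,1,1,1,1,0,0,0,0,1]
```

Under that bit-order assumption, parse encoded yields the tree shown in the question, with leaves k, t, q, g, r, w and a.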
do a right fold:
import Data.Char (chr)
data Tree a = Leaf a | Node (Tree a) (Tree a)
  deriving Show
build :: [Int] -> [Tree Char]
build xs = foldr go (\_ _ -> []) xs 0 0
  where
    nil = Leaf '?'
    go 0 run 0 0 = case run 0 0 of
      []     -> Node nil nil:[]
      x:[]   -> Node x nil:[]
      x:y:zs -> Node x y :zs
    go 1 run 0 0 = run 0 1
    go _ _ _ 0 = error "this should not happen!"
    go x run v 7 = (Leaf $ chr (v * 2 + x)): run 0 0
    go x run v k = run (v * 2 + x) (k + 1)
then:
> head $ build [0,0,0,1,1,1,0, ...] -- the list of 01s as in the question
Node (Node (Node (Leaf 'k') (Leaf 't'))
(Node (Node (Leaf 'q') (Leaf 'g')) (Leaf 'r')))
(Node (Leaf 'w') (Leaf 'a'))

How do I print a list of nodes at a specific level of a general tree?

I am given the following tree definition and functions:
data Tree a = Node a [Tree a] deriving (Eq, Read, Show)
treeRoot :: Tree a -> a
treeRoot (Node a _) = a
treeSubtrees :: Tree a -> [Tree a]
treeSubtrees (Node _ subtrees) = subtrees
I need to create a function 'nodesAtLevel' that takes two arguments: a tree and an integer n >= 0. The function needs to return a list of all the nodes at the given level n. The root of the tree is level 0. I am very new to Haskell, and yes, this is part of a homework assignment, so if you could help walk me through the thought process of completing this, it would be greatly appreciated! I currently have the definition of the 'nodesAtLevel' function written as follows:
nodesAtLevel :: Int -> Tree a -> [a]
Ok, let's do this step by step (leaving you with some holes for now):
nodes at level 0
As I understand it, you are supposed to return only the root's value here (in a list, of course)
so what goes in the ... here?
nodesAtLevel 0 (Node a subtrees) = ...
nodes at deeper levels
well the structure is nicely recursive so we probably want to use recursion.
So let's recurse one level deeper (decreasing the distance we have left to go, i.e. our n)
But there is a slight problem: nodesAtLevel acts on a single node but we only have a list of nodes left (the subtrees) .. so what to do?
Well maybe we should do something for each of the subtrees/subnodes - so we need to find the right function f here (hint you probably want to map and concat the results ...):
nodesAtLevel n (Node a subtrees) = f (nodesAtLevel (n-1)) subtrees
what can that f be? Can you guess it? Maybe you can write it yourself if not?
Additional Hint:
maybe you should first find out what type f has - because then you might be even able to use Hoogle to find a good fit.
If you use a newer GHC version you can let the compiler do the work if you write
nodesAtLevel n (Node _ subtrees) = _f (nodesAtLevel (n-1)) subtrees
where _f is a hole - now if you load this in GHCi it will give you a
Found hole ‘_f’ with type: ....
error with lots of additional information ;)
so just find ... and f and this should work ;)
BTW you can use the functions you already have instead of pattern matching again as I did, but I feel it's easier to see the result this way - so if you follow this path you don't need to use treeSubtrees or treeRoot
solution
... = [a]
f = concatMap
so one possible way to do it is this:
nodesAtLevel :: Int -> Tree a -> [a]
nodesAtLevel 0 (Node a _) = [a]
nodesAtLevel n (Node _ subtrees) = concatMap (nodesAtLevel (n-1)) subtrees
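To see the solution in action, here is a small self-contained sketch. The tree type is the one from the question; sample is a hypothetical tree invented for illustration.

```haskell
data Tree a = Node a [Tree a] deriving (Eq, Read, Show)

nodesAtLevel :: Int -> Tree a -> [a]
nodesAtLevel 0 (Node a _)        = [a]
nodesAtLevel n (Node _ subtrees) = concatMap (nodesAtLevel (n - 1)) subtrees

-- Hypothetical sample:      a
--                          / \
--                         b   c
--                         |  / \
--                         d e   f
sample :: Tree Char
sample = Node 'a' [Node 'b' [Node 'd' []], Node 'c' [Node 'e' [], Node 'f' []]]
```

Level 1 of this tree gives "bc" and level 2 gives "def". Asking for a level deeper than the tree gives the empty list, because concatMap over an empty list of subtrees produces nothing.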

If given a list of tuples representing ranges, how can you merge continuous ranges?

If given a list of tuples representing ranges like this:
[(0,10),(10,100),(1000,5000)]
I'd like to merge the tuples that represent contiguous ranges, so the result is this:
[(0,100),(1000,5000)]
Any elegant solutions?
Here's mine
mergeRanges :: [(Int, Int)] -> [(Int, Int)]
mergeRanges xs = foldr f [] (sort xs)
  where f new@(x,y) acc@((a,b):ys) =
          if y == a
            then (x,b):ys
            else new:acc
        f x acc = x:acc
EDIT: Ranges are non-overlapping
Unless this is a pattern that shows up more often in your program, I would just go for a direct recursion (untested code follows!):
mergeRanges ((lo1,hi1) : (lo2,hi2) : rest)
  | hi1 == lo2 = mergeRanges ((lo1,hi2) : rest)
    -- or (lo1,hi2) : mergeRanges rest, to merge only adjacent ranges
mergeRanges (interval:rest) = interval : mergeRanges rest
mergeRanges [] = []
(where you could optimize a bit by using @-patterns at the cost of clutter).
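A sketch of that variant with as-patterns (the @ syntax) might look like this. It assumes the input is already sorted and non-overlapping, as stated in the question's edit; the type signature is taken from the question.

```haskell
mergeRanges :: [(Int, Int)] -> [(Int, Int)]
mergeRanges (r1@(lo1, hi1) : rs@((lo2, hi2) : rest))
  | hi1 == lo2 = mergeRanges ((lo1, hi2) : rest)  -- keep extending the merged range
  | otherwise  = r1 : mergeRanges rs              -- reuse r1 and rs without rebuilding them
mergeRanges rs = rs                               -- zero or one range left
```

The as-patterns only save rebuilding the tuple and the tail; the behaviour is the same as the plain version.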
But if you really want to, you could use the following helper function
merge :: (a -> a -> Maybe a) -> [a] -> [a]
merge f [] = []
merge f [x] = [x]
merge f (x:y:xs) = case f x y of
  Nothing -> x : merge f (y:xs)
  Just z  -> merge f (z:xs)  -- or z : merge f xs
and give as first argument
merge2Ranges (lo1, hi1) (lo2, hi2)
  | hi1 == lo2 = Just (lo1, hi2)
  | otherwise  = Nothing
I doubt that merge is in a library somewhere, since it's pretty specific to the problem at hand.
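Assembled into one file, the helper-based version could be sketched as below. The wrapper name mergeRanges and the collapsed base cases are choices made for this sketch; the input is again assumed to be sorted.

```haskell
-- Combine neighbouring elements whenever the function returns Just.
merge :: (a -> a -> Maybe a) -> [a] -> [a]
merge f (x : y : xs) = case f x y of
  Nothing -> x : merge f (y : xs)
  Just z  -> merge f (z : xs)
merge _ xs = xs

merge2Ranges :: (Int, Int) -> (Int, Int) -> Maybe (Int, Int)
merge2Ranges (lo1, hi1) (lo2, hi2)
  | hi1 == lo2 = Just (lo1, hi2)
  | otherwise  = Nothing

mergeRanges :: [(Int, Int)] -> [(Int, Int)]
mergeRanges = merge merge2Ranges
```

Since merge is generic, it can also collapse other kinds of neighbours, for example runs of equal characters in a string.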
Well, I think the best solutions in this space probably will involve specialized data structures that maintain the invariant in question. In Java-land, the Guava library has RangeSet, which does precisely this.
This is not a solution to your problem directly, but once I was playing around with this simple (too simple) implementation of "historical values" as a kind of binary search tree:
-- | A value that changes over time at discrete moments. @t@ is the timeline type,
-- @a@ is the value type.
data RangeMap t a = Leaf a
                  -- Invariant: all @t@ values in the left branch must be less than
                  -- the one in the parent.
                  | Split t (RangeMap t a) (RangeMap t a)
valueAt :: Ord t => t -> RangeMap t a -> a
valueAt _ (Leaf a) = a
valueAt t (Split t' before since)
  | t < t'    = valueAt t before
  | otherwise = valueAt t since
The idea here is that Split t beforeT sinceT divides the timeline into two branches, one for values that held before t and a second for those that held since t.
So represented in terms of this type, your range set could be represented something like this:
example :: RangeMap Int Bool
example = Split 1000 (Split 100 (Split 0 (Leaf False) (Leaf True))
                                (Leaf False))
                     (Split 5000 (Leaf True) (Leaf False))
There are a few neat things about this, compared to the [(since, until, value)] representation that I've used in the past for similar applications:
The tree representation makes it impossible to have conflicting a values for the same time range. RangeMap is a true function from t to a.
The tree representation guarantees that some a is assigned to every t. Again, a RangeMap is a true function from t to a.
Since it's a tree and not a list, it supports log-time operations.
I did not go as far as working out a balanced tree representation for this or figuring out how to merge adjacent ranges with the same value, however...
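To make the idea concrete, here is a hedged sketch that builds such a map from a list of ranges and queries it. fromRanges is a hypothetical helper, not part of the original experiment; it assumes sorted, non-overlapping ranges, treats each one as half-open [lo, hi), and produces a right-nested (unbalanced) tree, so lookups in it are linear rather than logarithmic.

```haskell
data RangeMap t a
  = Leaf a
  | Split t (RangeMap t a) (RangeMap t a)

valueAt :: Ord t => t -> RangeMap t a -> a
valueAt _ (Leaf a) = a
valueAt t (Split t' before since)
  | t < t'    = valueAt t before
  | otherwise = valueAt t since

-- Hypothetical helper: membership map for sorted, non-overlapping ranges [lo, hi).
fromRanges :: [(Int, Int)] -> RangeMap Int Bool
fromRanges [] = Leaf False
fromRanges ((lo, hi) : rest) =
  Split lo (Leaf False) (Split hi (Leaf True) (fromRanges rest))
```

For the ranges [(0,100),(1000,5000)], valueAt 50 is True and valueAt 500 is False. Two touching ranges such as (0,10) and (10,100) answer queries as one contiguous range without an explicit merge step, although the tree still contains a redundant split at 10.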

How to find path to a node in a Haskell Data.Tree

Given a tree in Haskell (represented by a Data.Tree), how could I find the path to a node?
e.g.
import Data.Tree
tree = Node 1 [Node 2 [Node 3 []], Node 4 []]
Which forms a tree that looks like:
1
|
+- 2
| |
| `- 3
|
`- 4
How could I make a function pathToNode such that:
pathToNode 0 tree => []
pathToNode 1 tree => [1]
pathToNode 2 tree => [1, 2]
pathToNode 3 tree => [1, 2, 3]
pathToNode 4 tree => [1, 4]
In my particular case, any given value will appear only once in the tree, so a solution that returns a path to a value is acceptable.
So far my best answer is this:
pathToNode :: (Eq a) => a -> Tree a -> [a]
pathToNode x (Node y ys) | x == y = [x]
                         | otherwise = case concatMap (pathToNode x) ys of
                                         []   -> []
                                         path -> y:path
Is there a more succinct way of writing this? Is it possible to take advantage of Data.Foldable or Data.Traversable to avoid writing my own traverse logic?
The default Traversable and Foldable instances can't be used here, since they don't provide enough contextual information to maintain a path (e. g. when traversing in the State monad). They both visit each element of the tree once in some order, so you can't know whether some previously visited value belongs to a parent or sibling node of the current node.
I think the following function is succinct enough:
pathsToNode :: Eq a => a -> Tree a -> [[a]]
pathsToNode x (Node y ns) = [[x] | x == y] ++ map (y:) (pathsToNode x =<< ns)
It lists the paths to all copies of x, but you can always just lazily take the first found path if that's what you want.
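Taking the first path lazily could be sketched like this. It uses Data.Tree from the containers package, as in the question; returning a Maybe instead of an empty list for a missing value is a choice made for this sketch.

```haskell
import Data.Maybe (listToMaybe)
import Data.Tree (Tree (..))

pathsToNode :: Eq a => a -> Tree a -> [[a]]
pathsToNode x (Node y ns) = [[x] | x == y] ++ map (y :) (pathsToNode x =<< ns)

-- First path found, or Nothing when the value is absent.
pathToNode :: Eq a => a -> Tree a -> Maybe [a]
pathToNode x = listToMaybe . pathsToNode x

-- The tree from the question.
tree :: Tree Int
tree = Node 1 [Node 2 [Node 3 []], Node 4 []]
```

Thanks to laziness, listToMaybe stops the search as soon as the first path has been produced.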
There exists a generalization of the concept of a fold called a catamorphism. In the same way that a fold lets you "consume" a list without explicit recursion, a catamorphism lets you "consume" a tree or other datatype without explicit recursion and in a bottom-up manner, starting from the leaves. Unlike a regular fold, it will be aware of the structure of the tree.
The cata function can be found in module Data.Functor.Foldable (not Data.Foldable!) of package recursion-schemes. Unfortunately, it doesn't work with Data.Tree as such, you'll have to define an equivalent datatype in an indirect, two-step fashion:
{-# LANGUAGE DeriveFunctor #-}
import Data.Functor.Foldable
data Node a b = Node a [b] deriving (Functor,Eq)
type Tree a = Fix (Node a)
tree :: Tree Int
tree = Fix (Node 1 [ Fix ( Node 2 [ Fix (Node 3 []) ]),
                    Fix ( Node 4 [] ) ])
Using cata, we can construct a list of all the paths to all values in the tree. Notice the lack of explicit recursion:
paths :: Tree a -> [(a,[a])]
paths = cata algebra
  where
    algebra :: Node a [(a,[a])] -> [(a,[a])]
    algebra (Node a as) = (a,[a]) : map (\(i,is)->(i,a:is)) (concat as)
And from that function, we can define pathToNode:
pathToNode :: (Eq a) => a -> Tree a -> [a]
pathToNode a = snd . head . filter ((==a).fst) . paths
This solution is not more succinct, I'm afraid, but catamorphisms are a useful tool to have in your belt.

Haskell 2-3-4 Tree

We've been asked to create a 2-3-4 tree in Haskell, as in write the data type, the insert function, and a display function.
I'm finding it very difficult to get information on this kind of tree, even in a language I'm comfortable with (Java, C++).
What I have so far -
data Tree t = Empty
            | Two t (Tree t) (Tree t)
            | Three t t (Tree t) (Tree t) (Tree t)
            | Four t t t (Tree t) (Tree t) (Tree t) (Tree t) deriving (Eq, Ord, Show)
leaf2 a = Two a Empty Empty
leaf3 a b = Three a b Empty Empty Empty
leaf4 a b c = Four a b c Empty Empty Empty Empty
addNode::(Ord t) => t -> Tree t -> Tree t
addNode t Empty = leaf2 t
addNode x (Two t left right)
  | x < t     = Two t (addNode x left) right
  | otherwise = Two t left (addNode x right)
This compiles, but I'm not sure whether it's correct, and I don't know how to start writing the insert into a three node or four node.
The assignment also says that "deriving show" for the display function is not enough, that it should print out the tree in the format normally seen in diagrams. Again, unsure on the way to go with this.
Any help or direction appreciated.
I know nothing about 2-3-4 trees, but for the Three node, you would start with something like this:
addNode t (Three x y left mid right)
  | cond1 = expr1
  | cond2 = expr2
  (etc)
What cond1, cond2, expr1, and expr2 are, exactly, is dependent on the definition of what a 2-3-4 tree is.
As for a show method, the general outline would be this:
instance (Show t) => Show (Tree t) where
  show Empty = ...
  show (Two x l r) = ...show x...show l...show r...
  show (Three x y l m r) = ...
  show (Four x y z l m n r) = ...
The implementation depends on how you want it to look, but for the non-Empty cases, you will probably invoke show on all of the components of the tree being shown. If you want to indent the nested parts of the tree, then perhaps you should create a separate method:
instance (Show t) => Show (Tree t) where
  show = showTree 0
showTree :: Show t => Int -> Tree t -> String
showTree n = indent . go
  where indent = (replicate n ' ' ++)
        go Empty = "Empty"
        go (Two x l r) = (...show x...showTree (n+1) l...showTree (n+1) r...)
        (etc)
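One way to fill in that outline is sketched below. It is only one possible layout (an indented outline with one line per node, not a drawn diagram): each node's keys are printed on a line indented by depth, Empty children print nothing, and the two-space indentation step is an arbitrary choice.

```haskell
data Tree t
  = Empty
  | Two t (Tree t) (Tree t)
  | Three t t (Tree t) (Tree t) (Tree t)
  | Four t t t (Tree t) (Tree t) (Tree t) (Tree t)

-- One line per node: the node's keys, indented by depth, children below.
showTree :: Show t => Int -> Tree t -> String
showTree n tree = case tree of
  Empty                -> ""
  Two a l r            -> node [a] [l, r]
  Three a b l m r      -> node [a, b] [l, m, r]
  Four a b c l m1 m2 r -> node [a, b, c] [l, m1, m2, r]
  where
    node keys kids =
      replicate (2 * n) ' ' ++ unwords (map show keys) ++ "\n"
        ++ concatMap (showTree (n + 1)) kids
```

Printing the result with putStr (showTree 0 t) shows the root's keys on the first line and each level of children indented two spaces further.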
We've been asked to create a 2-3-4 tree
My condolences. I myself once had to implement one for homework. A 2-3-4 tree is a B-tree with all the disadvantages of the B-tree and none of the advantages, because writing the cases separately for each number of children as you do is as cumbersome as having a list of only 2-4 elements.
Point being: B-tree insertion algorithms should work, just fix the size. Cormen et al. have pseudocode for one in their book Introduction to Algorithms (heavy imperativeness warning!).
It might still be better to have lists of data elements and children instead of the four-case algebraic data type, even if the type wouldn't enforce the size of the nodes then. At least it would make it easier to expand the node size.
