Adjacency list of a tree data structure in Haskell

I have the following algebraic data type defined in Haskell:
data Trie = Leaf
          | Node [(Char, Trie)]
          deriving (Eq)
A Node holds a list of pairs (c, t), where c is the label of the edge from the current node to t.
Now I want to print out the adjacency list of the tree. Specifically, I need to print one edge per row, where an edge is in the format:
n1 n2 c
with n1 the source, n2 the target, and c the label for the edge.
I can print the edges from my root node with
instance Show Trie where
  show = show' 2 1
    where show' _ _ Leaf = ""
          show' next n1 (Node ts) = unlines $ zipWith (\n2 (c, _) ->
              show n1 ++ " " ++ show n2 ++ " " ++ show c)
            [next..] ts
but now I'm stuck on how to recursively print the children. In particular, how do I number the child nodes?

Labeling nodes is quite trivial since GHC will do all the heavy lifting for you:
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
import qualified Data.Traversable as T
import qualified Data.Foldable as F
import Control.Monad.State
data Trie a = Leaf a | Node a [(Char, Trie a)]
  deriving (Eq, Functor, F.Foldable, T.Traversable)
number :: Trie a -> Trie (a, Int)
number = flip evalState 1 . T.mapM (\x -> state $ \n -> ((x,n),n+1))
As for printing the trie, I'm afraid that I don't quite understand the desired output.

I came up with this solution:
import Data.List (foldl')
enum :: Int -> Trie -> ([(Int,Int,Char)],Int)
enum x Leaf = ([],x+1)
enum x (Node pairs)
  = let go (acc,y) (c,t) = (acc',y')
          where acc' = [(x,y,c)] ++ edges ++ acc
                (edges,y') = enum y t
    in foldl' go ([],x+1) pairs
enum takes a starting id and a Trie and returns a list of edges and the next available id.
-- some examples:
leafs xs = [ (c,Leaf) | c <- xs ]
t1 = Node $ leafs "XYZ"
t2 = Node [('W', t1)]
t3 = Node $ [('A',t2)] ++ leafs "BC"
enum 1 t1 -- ([(1,4,'Z'),(1,3,'Y'),(1,2,'X')],5)
enum 1 t2 -- ([(1,2,'W'),(2,5,'Z'),(2,4,'Y'),(2,3,'X')],6)
enum 1 t3 -- ([(1,8,'C'),(1,7,'B'),(1,2,'A'),(2,3,'W'),(3,6,'Z'),(3,5,'Y'),(3,4,'X')],9)
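To get from those tuples to the one-edge-per-row output the question asks for, the same numbering can also be threaded with mapAccumL so that the edges come out in preorder. This is only a minimal sketch, assuming the original non-parameterised Trie; the names edgesFrom and adjacency are made up for illustration, and the label is printed as a bare character rather than via show.

```haskell
import Data.List (mapAccumL)

data Trie = Leaf
          | Node [(Char, Trie)]
          deriving (Eq)

-- | @edgesFrom n1 next t@ lists the edges below @t@ in preorder, where @n1@
-- is the id already given to the root of @t@ and @next@ is the first unused id.
-- Returns the first id that is still unused afterwards, plus the edges.
edgesFrom :: Int -> Int -> Trie -> (Int, [(Int, Int, Char)])
edgesFrom _  next Leaf      = (next, [])
edgesFrom n1 next (Node ts) = (next', concat edgeLists)
  where
    (next', edgeLists) = mapAccumL step next ts
    -- The child reached over label c takes id n2; its own children start at n2 + 1.
    step n2 (c, t) =
      let (free, below) = edgesFrom n2 (n2 + 1) t
      in  (free, (n1, n2, c) : below)

-- | One "n1 n2 c" row per edge, with the root numbered 1.
adjacency :: Trie -> String
adjacency t =
  unlines [ unwords [show a, show b, [c]] | (a, b, c) <- snd (edgesFrom 1 2 t) ]

-- The t3 example from above, written out in full.
t3 :: Trie
t3 = Node [('A', Node [('W', Node [(c, Leaf) | c <- "XYZ"])]), ('B', Leaf), ('C', Leaf)]
```

Running putStr (adjacency t3) prints the rows 1 2 A, 2 3 W, 3 4 X, 3 5 Y, 3 6 Z, 1 7 B, 1 8 C, which uses the same node ids as enum 1 t3 but lists parents before their descendants.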

Here's my attempt:
data Trie c =
    Leaf
  | Node [(c, Trie c)]
  deriving (Eq)
instance Show c => Show (Trie c) where
  show = show' 1 (\_ -> "\n") where
    show' next cc Leaf = show next ++ "\n" ++ cc (next + 1)
    show' next cc (Node []) = show next ++ "\n" ++ cc (next + 1)
    show' next cc (Node [(c,t)]) = show c ++ "(" ++ show next ++ ")" ++ show' (next+1) cc t
    show' next cc (Node (x:xs)) = show' next (\n -> show' n cc $ Node xs) (Node [x])
I used continuation passing style to keep track of the state. There should be a way of making that code monadic, or using a zipper instead.
You may change the specific bits for leaves or nodes depending on whether you need them to be numbered or not (by changing the next + 1 part).

Related

Can uniplate's `universeBi` be used to retrieve nodes in a breadth-first fashion?

Is it possible to use Uniplate's universeBi to get the output in breadth-first-order? It appears the results are returned in a depth-first fashion. I'm wondering how I can use uniplate to retrieve the universeBi in a breadth-first fashion.
To illustrate, consider the following toy program:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data
import Data.Generics.Uniplate.Data
data A = A B Int deriving (Data, Typeable)
data B = B Int deriving (Data, Typeable)
val :: A
val = A (B 1) 2
ints :: [Int]
ints = universeBi val
I get:
*Main> ints
[1,2]
But this is depth-first, as 1 is obtained from the B node. I'd rather get it in the breadth-first order, i.e., receive [2,1]. Is this achievable in uniplate?
You can dig into the structure of the Str returned by biplate:
layers :: Str a -> [[a]]
layers Zero = []
layers (One x) = [[x]]
layers (Two f x) = catLayers (layers f) ([] : layers x)
  where catLayers [] ys = ys
        catLayers xs [] = xs
        catLayers (x : xs) (y : ys) = (x ++ y) : catLayers xs ys
layersBi :: Biplate from to => from -> [[to]]
layersBi = layers . fst . biplate
breadthBi :: Biplate from to => from -> [to]
breadthBi = concat . layersBi
So now
breadthBi (A (B 1) 2) :: [Int]
-- = [2, 1]
and
data Tree a = Branch (Tree a) a (Tree a) | Leaf deriving (Data, Typeable)
-- 4
-- 2 6
-- 1 3 5 7
example = Branch (Branch (Branch Leaf 1 Leaf) 2 (Branch Leaf 3 Leaf)) 4 (Branch (Branch Leaf 5 Leaf) 6 (Branch Leaf 7 Leaf))
(layersBi :: Data a => Tree a -> [[a]]) example
-- = [[],[4],[2,6],[1,3,5,7]]
I'm not sure if it's actually guaranteed that Str exactly reflects the structure of the data type, but it appears to. You could instead cook something out of the Data primitives if you have to.
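If relying on the shape of Str feels too fragile, the same layer-merging idea can be written directly against a concrete type. The following is a sketch for the Tree type above only; it does not use uniplate at all, and the names levels, zipLevels and breadthFirst are made up for illustration.

```haskell
data Tree a = Branch (Tree a) a (Tree a) | Leaf

-- | Values grouped by depth, root level first.
levels :: Tree a -> [[a]]
levels Leaf           = []
levels (Branch l x r) = [x] : zipLevels (levels l) (levels r)

-- | Merge two level lists depth by depth. Unlike zipWith (++), this keeps
-- the remaining levels of the deeper subtree.
zipLevels :: [[a]] -> [[a]] -> [[a]]
zipLevels [] ys             = ys
zipLevels xs []             = xs
zipLevels (x : xs) (y : ys) = (x ++ y) : zipLevels xs ys

-- | Breadth-first listing of all values.
breadthFirst :: Tree a -> [a]
breadthFirst = concat . levels

-- The same seven-node tree as in the example above.
example :: Tree Int
example =
  Branch (Branch (Branch Leaf 1 Leaf) 2 (Branch Leaf 3 Leaf))
         4
         (Branch (Branch Leaf 5 Leaf) 6 (Branch Leaf 7 Leaf))
```

Here levels example evaluates to [[4],[2,6],[1,3,5,7]] and breadthFirst example to [4,2,6,1,3,5,7]. The price is that this has to be rewritten for every data type, which is exactly what the generic version avoids.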

Building a list of all branches in a tree

I need to write a function that returns all possible branches of a tree
of this form:
data Tree a = EmptyT | NodeT a ( Tree a ) ( Tree a ) deriving (Show)
everyBranch :: Tree a -> [[a]]
I'm not sure how to approach this... xD
I'm still a newbie in Haskell.
Let's say that I have:
      1
     / \
    2   3
   /\  / \
  4  5 7  8
I want to get: [[1,2,4], [1,2,5], [1,3,8], [1,3,7]]
We'll use a recursive approach. Let's start with a rough skeleton:
everyBranch :: Tree a -> [[a]]
everyBranch EmptyT = _something
everyBranch (NodeT v l r) = _somethingElse
Now we'll fill in the holes. (This syntax is known as 'typed holes': if you run the above program through GHC, it will give you an error message with the type of the value which should be in the hole.) Now, I'm not sure about the first case: depending on your need, it could be [] (no branches) or [[]] (one branch with no elements), so we'll come back to this later. For the second case, we need a way to construct a list of branches given the value and the left and right subtrees. How do we do that? We'll recursively find every branch in the left tree, and every branch in the right tree, and then we'll prepend v to both:
everyBranch :: Tree a -> [[a]]
everyBranch EmptyT = _something
everyBranch (NodeT v l r) = map (v:) $ everyBranch l ++ everyBranch r
Now, let's go back to EmptyT. Consider a very simple tree: NodeT 1 EmptyT EmptyT. In this case, everyBranch should return [[1]]. Let's invoke everyBranch 'by hand' on this tree:
(I use └→ to mean 'evaluate sub-expression recursively', and => meaning 'expression evaluates to')
everyBranch (NodeT 1 EmptyT EmptyT)
=> map (1:) $ everyBranch EmptyT ++ everyBranch EmptyT
   └→ everyBranch EmptyT
      => _something
=> map (1:) $ _something ++ _something
So here, we want map (1:) $ _something ++ _something to be equal to [[1]]. What is _something? Well, it turns out that if _something is [], then map (1:) $ [] ++ [] is [], which isn't what we want. On the other hand, if _something is [[]], then map (1:) $ [[]] ++ [[]] is [[1], [1]] - which isn't what we want either. It looks like we need a slightly different approach. What we'll do is, we'll add another case specifically for these sort of trees:
everyBranch :: Tree a -> [[a]]
everyBranch EmptyT = _something
everyBranch (NodeT v EmptyT EmptyT) = [[v]]
everyBranch (NodeT v l r) = map (v:) $ everyBranch l ++ everyBranch r
Now, if we test this a bit (albeit using some random value for _something to stop it from giving us errors), we find that it works for all binary trees. As mentioned though, we still need to figure out that _something value. This value will only matter in two cases: empty trees (in which case it will trivially match EmptyT), and trees with only one subtree (in which case either l or r will match EmptyT). I will leave it as an exercise for you to determine what value to put there, how it will affect the result, and why it affects it that way.
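For readers who want to check their answer to that exercise, here is one possible completion. It assumes that an empty tree should contribute no branches at all, so the hole is filled with []; that choice is an assumption rather than the only defensible one, and questionTree is just the tree from the question written out.

```haskell
data Tree a = EmptyT | NodeT a (Tree a) (Tree a) deriving (Show)

everyBranch :: Tree a -> [[a]]
everyBranch EmptyT                  = []     -- assumption: an empty tree has no branches
everyBranch (NodeT v EmptyT EmptyT) = [[v]]  -- a tip ends exactly one branch
everyBranch (NodeT v l r)           = map (v:) (everyBranch l ++ everyBranch r)

-- The tree drawn in the question.
questionTree :: Tree Int
questionTree = NodeT 1 (NodeT 2 (tip 4) (tip 5)) (NodeT 3 (tip 7) (tip 8))
  where tip x = NodeT x EmptyT EmptyT
```

With this choice, everyBranch questionTree gives [[1,2,4],[1,2,5],[1,3,7],[1,3,8]], and a node with a single child, such as NodeT 1 (NodeT 2 EmptyT EmptyT) EmptyT, gives [[1,2]] because the missing side adds nothing to the concatenation.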
We can derive and use Foldable, to fold into an ad-hoc monoid to do the job:
{-# LANGUAGE DeriveFunctor, DeriveFoldable, ViewPatterns #-}
data Tree a = EmptyT
            | NodeT a ( Tree a ) ( Tree a )
            deriving (Show, Functor, Foldable)
data T a = T a               -- tip
         | N [[a]]           -- node
         | TN (a,[[a]])      -- tip <> node
         | NN ([[a]],[[a]])  -- node <> node
         deriving Show
-- Since GHC 8.4 (base 4.11) Semigroup is a superclass of Monoid, so the
-- combining operation is defined as (<>) and mappend defaults to it.
instance Semigroup (T a) where
  T a       <> N as       = TN (a,as)   -- tip <> node
  N as      <> N bs       = NN (as,bs)  -- node <> node
  T a       <> NN ([],[]) = N ([[a]])   -- tip <> (node <> node)
  T a       <> NN (as,bs) = N (map (a:) as ++ map (a:) bs)
  TN (a,[]) <> N []       = N ([[a]])   -- (tip <> node) <> node
  TN (a,as) <> N bs       = N (map (a:) as ++ map (a:) bs)
instance Monoid (T a) where
  mempty = N []  -- (tip <> node <> node) is what we actually want
allPaths :: Tree a -> [[a]]
allPaths (foldMap T -> N ps) = ps
The allPaths function definition uses ViewPatterns. Testing,
> allPaths $ NodeT 1 (NodeT 2 (NodeT 3 EmptyT EmptyT) EmptyT)
(NodeT 5 EmptyT EmptyT)
[[1,2,3],[1,5]]
> allPaths $ NodeT 1 (NodeT 2 (NodeT 3 EmptyT EmptyT) (NodeT 4 EmptyT EmptyT))
(NodeT 5 EmptyT EmptyT)
[[1,2,3],[1,2,4],[1,5]]
(tip <> node <> node) is what we really want, but <> is binary, and we don't know (and shouldn't rely on it if we did) the actual order in which the parts will be combined into the whole by the derived definition of foldMap,
foldMap T EmptyT == N []
foldMap T (NodeT a lt rt) == T a <> foldMap T lt <> foldMap T rt
-- but in what order?
So we "fake" it by delaying the actual combination until all three parts are available.
Or we could forgo the derivation route altogether, use the above laws as the definition of a custom foldMap with a ternary combination, and end up with ... the equivalent of the recursive code in the other answer -- much shorter overall, without the utilitarian cruft of one-off auxiliary types that need to be hidden behind module walls, and self-evidently non-partial, unlike what we've ended up with, here.
So maybe it's not so great. I'll post it anyway, as a counterpoint.
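For comparison, the "custom fold with a ternary combination" mentioned above might look like the sketch below. The helper foldTree is hypothetical (it is not a library function), and treating a node with two empty results as a tip is the same convention the monoid version uses.

```haskell
data Tree a = EmptyT | NodeT a (Tree a) (Tree a) deriving (Show)

-- | Hypothetical helper: replace EmptyT by @empty@ and NodeT by the
-- three-argument function @node@.
foldTree :: b -> (a -> b -> b -> b) -> Tree a -> b
foldTree empty node = go
  where
    go EmptyT        = empty
    go (NodeT a l r) = node a (go l) (go r)

allPaths :: Tree a -> [[a]]
allPaths = foldTree [] node
  where
    node a [] [] = [[a]]                -- a tip: the path ends here
    node a ls rs = map (a:) (ls ++ rs)  -- extend every path found below
```

Because all three parts arrive in one call to node, no intermediate TN or NN states are needed and every pattern match is total.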

How to separate list elements as Char or Float

I have a mixed tree of Char and Float values, and I need to separate them into a list of Floats and a list of Chars in Haskell. I tried to write something, as you can see below.
I also tried treating a as [a] in the else branch, but that gives an error too.
data BETree = Leaf Float | Node Char BETree BETree deriving (Show, Ord, Eq)
charList :: [Char]
charList = []
floatList :: [Float]
floatList = []
toList :: BETree -> ([Float], [Char])
toList (Node a l r) = if (a :: Char ) then (charList ++ [a])
else (floatList ++ a)
I expected the entered values to be separated into floatList and charList, but I get errors like this:
Couldn't match expected type ‘[[Char]]’ with actual type ‘Char’
OR
vice versa
There are a couple of aspects of Haskell that you haven't mastered yet, and they're causing you some difficulty.
First, as you probably know, Haskell takes its types very seriously. It's a strongly typed language, so that means that the whole concept of searching through a data structure to find values of a particular type is the wrong way of thinking about this problem. The definition of BETree is:
data BETree = Leaf Float | Node Char BETree BETree deriving (Show, Ord, Eq)
which says that this structure consists of Leafs that contain a Float and internal Nodes that contain a Char. So, if you want to find all the Char values, you don't check the types, you just look for the Nodes. They will all contain Chars and can't contain anything else, by the definition of BETree. In other words, in your function definition:
toList (Node a l r) = ...
you don't need to try to check the type of a -- it's guaranteed to be Char by the definition of Node in the BETree definition. If you separately write a definition:
toList (Leaf x) = ...
then you're similarly guaranteed that x is a Float, and you don't need to check any types.
Second, Haskell normally works with immutable values. This means that, unlike in most other languages, you usually don't start by creating an empty list and then trying to add elements to it in a separate function. Instead, you usually write recursive functions that return the "list so far", which they generate by adding an element (or elements) to the list returned by recursively calling themselves. As a simple example, to write a function that builds up the list of all positive integers in an input list, you'd write:
positiveInts :: [Int] -> [Int]
positiveInts (x:xs) | x > 0     = x : positiveInts xs -- add "x" to list from recursive call
                    | otherwise = positiveInts xs     -- drop "x"
positiveInts [] = []
So, here's how it might work for your problem, starting with the simpler problem of just building the floatList:
toFloatList :: BETree -> [Float]
toFloatList (Leaf x) = [x] -- x is guaranteed to be a Float, so return it
toFloatList (Node _a l r) = -- _a can't be a float, so ignore it
  toFloatList l ++ toFloatList r -- but recurse to find more Floats in Leafs
And test it:
> toFloatList (Node 'x' (Leaf 1.0) (Node 'y' (Leaf 3.0) (Leaf 4.0)))
[1.0,3.0,4.0]
>
Building just the charList is only slightly more complicated:
toCharList :: BETree -> [Char]
toCharList (Leaf _x) = [] -- x is guaranteed to be a Float, so no Chars here
toCharList (Node a l r) = -- "a" is a Char
  toCharList l ++ [a] ++ toCharList r -- recurse and put "a" in the middle
and testing it:
> toCharList (Node 'x' (Leaf 1.0) (Node 'y' (Leaf 3.0) (Leaf 4.0)))
"xy"
> "xy" == ['x','y']
True
>
In Haskell, the list of Chars ['x','y'] is equivalent to the string "xy" which is why it gets printed this way.
Now, the easiest way to define toList is:
toList :: BETree -> ([Float], [Char])
toList bet = (toFloatList bet, toCharList bet)
This traverses the tree twice. If you want to build both lists together in a single traversal, things get significantly more complicated:
toList' :: BETree -> ([Float], [Char])
toList' (Leaf x) = ([x],[]) -- easy, since Leaf contains only one Float
toList' (Node a l r) = -- Nodes are harder
  let (fl1, cl1) = toList' l -- lists from the left branch
      (fl2, cl2) = toList' r -- lists from the right branch
  in (fl1 ++ fl2, cl1 ++ [a] ++ cl2) -- combine with our Char "a"
and the test:
> toList (Node 'x' (Leaf 1.0) (Node 'y' (Leaf 3.0) (Leaf 4.0)))
([1.0,3.0,4.0],"xy")
> toList' (Node 'x' (Leaf 1.0) (Node 'y' (Leaf 3.0) (Leaf 4.0)))
([1.0,3.0,4.0],"xy")
>

Since "fold" isn't powerful enough to write a tree pretty-printer with indentation, what high-order combinator is?

Given, for example, the following tree data type:
data Tree a = Node [Tree a] | Leaf a deriving Show
type Sexp = Tree String
How do I express a "pretty" function using a higher-order combinator that prints the tree with proper indentation? For example:
sexp =
  Node [
    Leaf "aaa",
    Leaf "bbb",
    Node [
      Leaf "ccc",
      Leaf "ddd",
      Node [
        Leaf "eee",
        Leaf "fff"],
      Leaf "ggg",
      Leaf "hhh"],
    Leaf "jjj",
    Leaf "kkk"]
pretty = ????
main = print $ pretty sexp
I want the result of that program to be:
(aaa
 bbb
 (ccc
  ddd
  (eee
   fff)
  ggg
  hhh)
 jjj
 kkk)
Here is an incomplete solution, using a "fold" as the combinator, that doesn't implement the indentation:
fold f g (Node children) = f (map (fold f g) children)
fold f g (Leaf terminal) = g terminal
pretty = fold (\ x -> "(" ++ (foldr1 ((++) . (++ " ")) x) ++ ")") show
main = putStrLn $ pretty sexp
It is obviously not possible to write the function I want using fold, since it forgets the tree structure. So, what is a proper high-order combinator that is generic enough to allow me to write the function I want, but less powerful than writing a direct recursive function?
fold is strong enough; the trick is that we'll need to instantiate r as a reader monad of the current indentation level.
{-# LANGUAGE ScopedTypeVariables #-} -- so that "a" in leaf's signature is the "a" bound by forall
fold :: ([r] -> r) -> (a -> r) -> (Tree a -> r)
fold node leaf (Node children) = node (map (fold node leaf) children)
fold node leaf (Leaf terminal) = leaf terminal
pretty :: forall a . Show a => Tree a -> String
pretty tree = fold node leaf tree 0 where
  node :: [Int -> String] -> Int -> String
  node children level =
    let childLines = map ($ level + 1) children
    in unlines ([indent level "Node ["] ++ childLines ++ [indent level "]"])
  leaf :: a -> Int -> String
  leaf a level = indent level (show a)
  indent :: Int -> String -> String -- two space indentation
  indent n s = replicate (2 * n) ' ' ++ s
Take careful note that I pass an extra parameter to the call to fold. This is the initial state of indentation and it works because with this specialization of r, fold returns a function.
It's simply
onLast f xs = init xs ++ [f (last xs)]
pretty :: Sexp -> String
pretty = unlines . fold (node . concat) (:[]) where
  node [] = [""]
  node (x:xs) = ('(' : x) : map (" " ++) (onLast (++ ")") xs)
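One caveat with this short version: onLast calls init and last on xs, so a Node with exactly one child hits the empty list and fails at run time. A possible variant that stays total is sketched below. It is self-contained for testing, prettyLines is a made-up name, and printing an empty node as "()" is an assumption rather than something the question specifies.

```haskell
data Tree a = Node [Tree a] | Leaf a deriving Show

fold :: ([r] -> r) -> (a -> r) -> Tree a -> r
fold node leaf (Node children) = node (map (fold node leaf) children)
fold _    leaf (Leaf terminal) = leaf terminal

-- | Apply f to the last element only; returns [] for the empty list.
onLast :: (a -> a) -> [a] -> [a]
onLast _ []       = []
onLast f [x]      = [f x]
onLast f (x : xs) = x : onLast f xs

-- | Each subtree becomes a list of lines; a node opens a paren on its first
-- line, indents the remaining lines by one space, and closes on the last.
prettyLines :: Tree String -> [String]
prettyLines = fold (node . concat) (: [])
  where
    node []       = ["()"]  -- assumption: an empty node prints as "()"
    node (x : xs) = onLast (++ ")") (('(' : x) : map (' ' :) xs)

pretty :: Tree String -> String
pretty = unlines . prettyLines
```

Closing the paren after the whole block has been assembled means the single-child case needs no special handling: prettyLines (Node [Leaf "x"]) yields ["(x)"].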

Determining the extent of lazy evaluation

Given
data BTree a = End
             | Node a (BTree a) (BTree a)
             deriving(Show,Eq,Ord)
data Msg = Msg { from :: String
               , to :: String
               , when :: Int
               , message :: String }
instance Ord Msg where
  compare a b = (when a) `compare` (when b)
instance Eq Msg where
  (==) a b = (when a) == (when b)
My function to count nodes (which seems off but that's aside from the question) is
count :: (Ord a) => (BTree a) -> Int
count = sum . count'
  where
    count' :: (Ord a) => (BTree a) -> [Int]
    count' End = []
    count' (Node _ l r) =
      [1] ++ (count' l) ++ (count' r)
Does count not evaluate the contents of the Msg by virtue of its value being discarded by _? Perhaps a better question is, how do I know where lazy evaluation starts and ends for this sort of thing?
If the third line of count' was:
count' (Node (Msg x _ _ _) l r) =
Can I assume that the other three fields of Msg were accessed/evaluated, or does lazy evaluation go that far?
No. The fields of a data structure are evaluated lazily by default. Since you're not using the other fields in any way, they will not be evaluated by this code. If you want to make it so that evaluating a node forces all its fields to be evaluated, you can add strictness annotations to the fields:
data BTree a = End
             | Node !a (BTree a) (BTree a)
             deriving(Show,Eq,Ord)
data Msg = Msg { from :: !String
               , to :: !String
               , when :: !Int
               , message :: !String }
Since counting the nodes forces the nodes themselves to be evaluated, this will also force the node values to be evaluated. If you only want this behavior for your one function, you can force evaluation in a more fine-grained manner using seq:
count' (Node x l r) = x `seq` ([1] ++ count' l ++ count' r)
or a bang pattern (requires the BangPatterns extension)
count' (Node !x l r) = [1] ++ count' l ++ count' r
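One way to see where evaluation stops is to plant values that fail when forced and check whether the program still runs. The sketch below uses simplified counting functions; the names countLazy, countForcing and booby are made up for illustration.

```haskell
data BTree a = End
             | Node a (BTree a) (BTree a)

-- Lazy count: the payload is matched by a wildcard and never evaluated.
countLazy :: BTree a -> Int
countLazy End          = 0
countLazy (Node _ l r) = 1 + countLazy l + countLazy r

-- Forcing count: seq evaluates each payload to weak head normal form
-- before the node is counted.
countForcing :: BTree a -> Int
countForcing End          = 0
countForcing (Node x l r) = x `seq` (1 + countForcing l + countForcing r)

-- A tree whose payloads fail as soon as anything evaluates them.
booby :: BTree Int
booby = Node undefined (Node (error "payload forced") End End) End
```

Evaluating countLazy booby returns 2 because only the Node and End constructors are inspected, whereas countForcing booby throws as soon as it reaches the first payload. The same experiment works one level down: matching Node (Msg x _ _ _) l r forces the Msg constructor but leaves the fields bound to wildcards untouched.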
