haskell binary search tree

haskell binary search tree - haskell

module Main where
import Data.List
import Data.Function
type Raw = (String, String)
icards = [("the", "le"),("savage", "violent"),("work", "travail"),
("wild", "sauvage"),("chance", "occasion"),("than a", "qu'un")]
data Entry = Entry {wrd, def :: String, len :: Int, phr :: Bool}
deriving Show
-- French-to-English, search-tree section
entries' :: [Entry]
entries' = map (\(x, y) -> Entry y x (length y) (' ' `elem` y)) icards
data Tree a = Empty | Tree a (Tree a) (Tree a)
tree :: Tree Entry
tree = build entries'
build :: [Entry] -> Tree Entry
build [] = Empty
build (e:es) = ins e (build es)
ins :: Entry -> Tree Entry -> Tree Entry
...
find :: Tree Entry -> Word -> String
...
translate' :: String -> String
translate' = unwords . (map (find tree)) . words
so i'm trying to design function ins and find but i am not sure where to start.any ideas?

I have no idea by which criteria the tree should be sorted, so I use just wrd. Then it would look like:
ins :: Entry -> Tree Entry -> Tree Entry
ins entry Empty = Tree entry Empty Empty
ins entry#(Entry w _ _ _) (Tree current#(Entry w1 _ _ _) left right)
| w == w1 = error "duplicate entry"
| w < w1 = Tree current (ins entry left) right
| otherwise = Tree current left (ins entry right)
How to get there?
As always when using recursion, you need a base case. Here it is very simple: If the tree is empty, just replace it by a node containing your data. There are no children for the new node, so we use Empty.
The case if you have a full node looks more difficult, but this is just due to pattern matching, the idea is very simple: If the entry is "smaller" you need to replace the left child with a version that contains the entry, if it is "bigger" you need to replace the right child.
If both node and entry have the same "size" you have three options: keep the old node, replace it by the new one (keeping the children) or throw an error (which seems the cleanest solution, so I did it here).

A simple generalization of Landei's answer:
ins :: Ord a => a -> Tree a -> Tree a
ins x Empty = Tree x Empty Empty
ins x (Tree x' l r) = case compare x x' of
EQ -> undefined
LT -> Tree x' (ins x l) r
GT -> Tree x' l (ins x r)
For this to work on Tree Entry, you will need to define an instance of Ord for Entry.

Related

Huffman Tree decoding

Given a Huffman tree and a stream of bits, return a pair containing (1) the
-- string of symbols encoded by the bits (according to the Huffman tree), and
-- (2) a Bool indicating whether the output stream contains every bit from the
-- input (that is, return False if there were any bits left over).
Here is the code, it only returns the first symbol in the tree. What's the problem?
data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)
traT :: BTree a -> BTree a -> [Bool] -> [a] -> ([a], Bool)
traT (Leaf v) c bs res= (res++[v], True)
traT (Fork left right) c (b:bs) res
| b = traT right c bs res
| otherwise = traT left c bs res
traT _ c [] res = (res, True)
traT _ c bs res = traT c c (bs) res
traT _ c bs res = (res, False)
decode :: BTree a -> [Bool] -> ([a], Bool)
decode (Fork x y) bs = traT (Fork x y) (Fork x y) bs []
decode (Leaf x) bs = traT(Leaf x) (Leaf x) bs []

Well, you seem to be on the right track.
it only returns the first symbol in the tree.
Your main problem is with these 2 lines:
traT (Leaf v) c bs res= (res++[v], True)
...
traT _ c bs res = traT c c (bs) res
The first one masks the second one for all leaf nodes. And the second one is your only forward recursive call that could operate at leaf nodes, hence your only hope to process any further bits.
A couple of remarks:
the res++[v] expression forces the code to rescan the whole symbol list at each new symbol.
The second line would call itself endlessly (but it is masked by the first one).
Another (smaller) problem is that returning just one flag for the presence of "extra" bits at the end of the bit stream loses information, as we would like to know what the extra bits are. It is a bit risky to do this in your core recursive function. Of course, it is perfectly OK to do it in the final outer decode function.
This is why in the code sample below, I have used an extra symBits argument to keep the bits that have been processed but not yet attributed to a symbol. I keep them in reverse order, because Haskell prefers to prepend items to a list, rather than to put them at the end, rescanning the whole list to do so. Hence the call to reverse in the final stage of processing. It is a cheap reverse call, as it is limited in length to the depth of our Huffman tree.
So here is some suggested reworked code, where I have tried to distinguish the 4 cases: leaf node or fork node AND at end of bit stream or not. I also took the liberty to rename your c argument as htop.
data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)
type Bit = Bool
-- hnode htop symBits bs
travHT :: BTree a -> BTree a -> [Bit] -> [Bit] -> ([a], [Bit])
-- situations where at least one input bit remains:
travHT (Leaf v) htop symBits (b:rbs) = -- CHANGE: forward recursive call
-- symbol completed, jump from leaf node to top of htree:
let fwdRes = travHT htop htop [] (b:rbs)
nextSyms = fst fwdRes
lastSymBits = snd fwdRes
in (v : nextSyms, lastSymBits)
travHT (Fork left right) htop symBits (b:rbs)
| b = travHT right htop (b:symBits) rbs
| otherwise = travHT left htop (b:symBits) rbs
-- situations where we have reached the end of the bit stream:
travHT (Leaf v) htop symBits [] = ([v],[])
-- no more bits and not at a leaf --> incomplete last symbol:
travHT (Fork left right) htop symBits [] = ([], reverse symBits)
-- homework-mandated interface:
decode :: BTree a -> [Bit] -> ([a], Bool)
decode htree bs =
let pair = travHT htree htree [] bs
(symbols, restOfBits) = pair
weUsedAllBits = null restOfBits
in (symbols, weUsedAllBits)
Testing code with token main program:
xyz_code :: BTree Char
xyz_code = Fork (Leaf 'x') (Fork (Leaf 'y') (Leaf 'z'))
-- Bit streams for test purposes:
------ Y Z X X X Y/Z??
bl0 = [True,False, True,True , False, False, False]
bl1 = [True,False, True,True , False, False, False, True]
main = do
let bitList = bl0
let htree = xyz_code
let result = decode htree bitList
putStrLn $ "result = " ++ show result
Program output:
result = ("yzxxx",True)
Hope it helps. I will also ask the powers that be to add the [huffman-code] tag to your question. Tags are a nice way to help people find the questions of interest to them. And we do have a tag for Huffman codes.

Building a suffix tree by inserting each suffix in Haskell

I am working with the following data type:
data SuffixTree = Leaf Int | Node [(String, SuffixTree)]
deriving (Eq, Show)
Each subtree has a corresponding label (string).
The idea is to build the corresponding suffix tree by adding each suffix and its index into an accumulating tree (at the beginning it is Node []).
This is already defined
buildTree s
= foldl (flip insert) (Node []) (zip (suffixes s) [0..length s-1])
where suffixes is correctly defined.
I've been trying to implement the insert function for a while but can't seem to succeed.
This is what I have now (the names and style are not the best since this is still work in progress):
insert :: (String, Int) -> SuffixTree -> SuffixTree
insert pair tree#(Node content)
= insert' pair tree content
where
insert' :: (String, Int) -> SuffixTree -> [(String, SuffixTree)] -> SuffixTree
insert' (s, n) (Node []) subtrees
= Node ((s, Leaf n) : subtrees)
insert' (s, n) (Node content#((a, tree) : pairs)) subtrees
| null p = insert' (s, n) (Node pairs) subtrees
| p == a = insert' (r, n) tree subtrees
| p /= a = Node ((p, newNode) : (subtrees \\ [(a, tree)]))
where
(p, r, r') = partition s a
newNode = Node [(r, (Leaf n)), (r', tree)]
The partition function takes two strings and returns a tuple consisting of:
The common prefix (if it exists)
The first string without the prefix
The second string without the prefix
I think I understand the rules needed to build the tree.
We start by comparing the label of the first subtree to the string we want to insert (say, str). If they don't have a prefix in common, we try to insert in the next subtree.
If the label is a prefix of str, we continue to look into that subtree, but instead of using str we try to insert str without the prefix.
If str is a prefix of label, then we replace the existing subtree with a new Node, having a Leaf and the old subtree. We also adjust the labels.
If we don't have a match between str and any label then we add a new Leaf to the list of subtrees.
However, the biggest problem that I have is that I need to return a new tree containing the changes, so I have to keep track of everything else in the tree (not sure how to do this or if I'm thinking correctly about this).
The code appears to be working correctly on this string: "banana":
Node [("a",Node [("",Leaf 5),("na",Node [("",Leaf 3),("na",Leaf 1)])]),
("na",Node [("",Leaf 4),("na",Leaf 2)]),("banana",Leaf 0)]
However, on this string "mississippi" I get an Exception: Non-exhaustive patterns in function insert'.
Any help or ideas are greatly appreciated!

You are using a quadratic algorithm; whereas optimally, suffix tree can be constructed in linear time. That said, sticking with the same algorithm, a possibly better approach would be to first build the (uncompressed) suffix trie (not tree) and then compress the resulting trie.
The advantage would be that a suffix trie can be represented using Data.Map:
data SuffixTrie
= Leaf' Int
| Node' (Map (Maybe Char) SuffixTrie)
which makes manipulations both more efficient and easier than list of pairs. Doing so, you may also completely bypass common prefix calculations, as it comes out by itself:
import Data.List (tails)
import Data.Maybe (maybeToList)
import Control.Arrow (first, second)
import Data.Map.Strict (Map, empty, insert, insertWith, assocs)
data SuffixTree
= Leaf Int
| Node [(String, SuffixTree)]
deriving Show
data SuffixTrie
= Leaf' Int
| Node' (Map (Maybe Char) SuffixTrie)
buildTrie :: String -> SuffixTrie
buildTrie s = foldl go (flip const) (init $ tails s) (length s) $ Node' empty
where
go run xs i (Node' ns) = run (i - 1) $ Node' tr
where tr = foldr loop (insert Nothing $ Leaf' (i - 1)) xs ns
loop x run = insertWith (+:) (Just x) . Node' $ run empty
where _ +: Node' ns = Node' $ run ns
buildTree :: String -> SuffixTree
buildTree = loop . buildTrie
where
loop (Leaf' i) = Leaf i
loop (Node' m) = Node $ con . second loop <$> assocs m
con (Just x, Node [(xs, tr)]) = (x:xs, tr) -- compress single-child nodes
con n = maybeToList `first` n
then:
\> buildTree "banana"
Node [("a",Node [("",Leaf 5),
("na",Node [("",Leaf 3),
("na",Leaf 1)])]),
("banana",Leaf 0),
("na",Node [("",Leaf 4),
("na",Leaf 2)])]
similarly:
\> buildTree "mississippi"
Node [("i",Node [("",Leaf 10),
("ppi",Leaf 7),
("ssi",Node [("ppi",Leaf 4),
("ssippi",Leaf 1)])]),
("mississippi",Leaf 0),
("p",Node [("i",Leaf 9),
("pi",Leaf 8)]),
("s",Node [("i",Node [("ppi",Leaf 6),
("ssippi",Leaf 3)]),
("si",Node [("ppi",Leaf 5),
("ssippi",Leaf 2)])])]

Here's how the problem is occurring.
Let's say you're processing buildTree "nanny". After you've inserted the suffixes "nanny", "anny", and "nny", your tree looks like t1 given by:
let t1 = Node t1_content
t1_content = [("n",t2),("anny",Leaf 1)]
t2 = Node [("ny",Leaf 2),("anny",Leaf 0)]
Next, you try to insert the prefix "ny":
insert ("ny", 3) t1
= insert' ("ny", 3) t1 t1_content
-- matches guard p == a with p="n", r="y", r'=""
= insert' ("y", 3) t2 t1_content
What you intend to do next is insert ("y", 3) into t2 to yield:
Node [("y", Leaf 3), ("ny",Leaf 2),("anny",Leaf 0)])
Instead, what happens is:
insert' ("y", 3) t2 t1_content
-- have s="y", a="ny", so p="", r="y", r'="ny"
-- which matches guard: null p
= insert' ("y", 3) (Node [("anny", Leaf 0)]) t1_content
-- have s="y", a="anny", so p="", r="y", r'="anny"
-- which matches guard: null p
= insert' ("y", 3) (Node []) t1_content
= Node [("y", Leaf 3), ("n",t2), ("anny",Leaf 1)]
and suffix "y" has been added to t1 instead of t2.
When you next try to insert suffix "y", the guard p==a case tries to insert ("y",3) into Leaf 3 and you get a pattern error.
The reason it works on banana is that you only ever insert a new node at the top level of the tree, so "adding to t2" and "adding to t1" are the same thing.
I suspect you'll need to substantially rethink the structure of your recursion to get this working.

Looks like this code does the job, although there may still be improvements to make. I hope that it's general enough to work on any string. I also tried to avoid using ++, but it's still better than nothing.
getContent (Node listOfPairs)
= listOfPairs
insert :: (String, Int) -> SuffixTree -> SuffixTree
insert (s, n) (Node [])
= Node [(s, Leaf n)]
insert (s, n) (Node (pair#(a, tree) : pairs))
| p == a = Node ((a, insert (r, n) tree) : pairs)
| null p = Node (pair : (getContent (insert (r, n) (Node pairs))))
| p /= a = Node ([(p, Node [(r, Leaf n), (r', tree)])] ++ pairs)
where
(p, r, r') = partition s a

Finding element in a binary tree

Assume I have a binary tree:
data Bst a = Empty | Node (Bst a) a (Bst a)
I have to write a function that searches for a value and returns the number of its children. If there is no node with this value, it returns -1. I was trying to write both BFS and DFS, and I failed with both.

Pattern matching is your friend. Your Bst can either be Empty or a Node, so at the toplevel, your search function will be
search Empty = ...
search (Node left x right) = ...
Can an Empty tree possibly contain the target value? With a Node the target value, if present, will be either the node value (x above), in the left subtree, in the right subtree—or perhaps some combination of these.
By “return[ing] the number of its children,” I assume you mean the total number of descendants of the Bst rooted at a Node whose value is the target, which is an interesting combination of problems. You will want another function, say numChildren, whose definition uses pattern matching as above. Considerations:
How many descendants does an Empty tree have?
In the Node case, x doesn’t count because you want descendants. If only you had a function to count the number of children in the left and right subtrees …

Here is a way to do this. Breath-first search can actually be a bit tricky to implement and this solution (findBFS) has aweful complexity (appending to the list is O(n)) but you'll get the gist.
First I have decided to split out the finding functions to return the tree where the node element matches. That simplifies splitting out the counting function. Also, it is easier to return the number of elements than the number of descendants and return -1 in case not found, so the numDesc functions rely on the numElements function.
data Tree a = Empty
| Node a (Tree a) (Tree a)
numElements :: Tree a -> Int
numElements Empty = 0
numElements (Node _ l r) = 1 + numElements l + numElements r
findDFS :: Eq a => a -> Tree a -> Tree a
findDFS _ Empty = Empty
findDFS x node#(Node y l r) | x == y = node
| otherwise = case findDFS x l of
node'#(Node _ _ _) -> node'
Empty -> findDFS x r
findBFS :: Eq a => a -> [Tree a] -> Tree a
findBFS x [] = Empty
findBFS x ((Empty):ts) = findBFS x ts
findBFS x (node#(Node y _ _):ts) | x == y = node
findBFS x ((Node _ l r):ts) = findBFS x (ts ++ [l,r])
numDescDFS :: Eq a => a -> Tree a -> Int
numDescDFS x t = numElements (findDFS x t) - 1
numDescBFS :: Eq a => a -> Tree a -> Int
numDescBFS x t = numElements (findBFS x [t]) - 1

Searching rose tree in Haskell

I'm trying to write a function searching for a given element in a rose tree and returning it's location.
It may be clearer when I show you what I already got:
Given a tree with a definition:
data Tree text = Node value
[Tree value]
for example:
test = Node "1" [
Node "11" [
Node "111" [],
Node "112" [
Node "1121" [], Node "1122" [], Node "1123" []
]
],
Node "12" []
]
1
11 12
111 112
1121 1122 1123
I'm looking for a function search:
search :: String -> Tree String -> [Integer]
search 1123 test -> should return [1,2,3]
- first subtree of 1=11 -> 2nd subtree of 11=112, 3rd subtree of 112=1123
I know how to iterate through tree,
display (Node v xs) = v ++ concatMap display xs
But have no idea how can I assign integer value to every element of subtrees array and additionally pass it recursively from upper to lower parts of the tree.
Can you guys direct me where/how to look for a solution? I'm very new to Haskell..

The easiest way is to let the function return the list of all paths to a node with the desired data (there should only ever be at most one in the tree, I suppose, but that doesn't matter) first, and then use the first of these:
searchList :: (Eq a) => a -> Tree a -> [[Integer]]
searchList val (Node dat subs)
| val == dat = [[]] -- empty path
| otherwise = concat [map (c:) (searchList val t) | (c,t) <- zip [1 .. ] subs]
search :: Eq a => a -> Tree a -> [Integer]
search val t = case searchList val t of
(p:_) -> p
_ -> error "Value not found"
If Daniel Wagner's suspicion is correct and your trees are tries, you can search more efficiently, but the principle remains the same, however, since we now know that we either have one node with the desired data or none, the result is more appropriately a Maybe [Integer]:
import Data.List (isPrefixOf)
import Control.Monad -- for the MonadPlus instance of Maybe
searchTrie :: String -> Tree String -> Maybe [Integer]
searchTrie target (Node val subs)
| val == target = Just []
| val `isPrefixOf` target = case dropWhile smaller (zip [1 .. ] subs) of
((c,t):_) -> fmap (c:) $ searchTrie target t
_ -> Nothing
| otherwise = Nothing
where
smaller (_,Node v _) = v < take (length v) target

Haskell, creating a binary search tree from a list

Can someone tell me why this code isn't producing what I want.
data BST = MakeNode BST String BST
| Empty
add :: String -> BST -> BST
add new Empty = (MakeNode Empty new Empty)
add string tree#(MakeNode left value right)
| string > value = MakeNode left value (add string right)
| string < value = MakeNode (add string left) value right
| otherwise = tree
output
"John"
"Doug"
"Charlie"
"Alice"
listToBST :: [String] -> BST
listToBST = foldr add Empty

If we create and function which takes a BST and returns a list in sorted order, modelled after sort . nub, then your Tree is fine as quickcheck tells us. QuickCheck is very easy to use.
import Data.List
import Test.QuickCheck
data BST = MakeNode BST String BST
| Empty
deriving (Show)
add :: String -> BST -> BST
add new Empty = (MakeNode Empty new Empty)
add string tree#(MakeNode left value right)
| string > value = MakeNode left value (add string right)
| string < value = MakeNode (add string left) value right
| otherwise = tree
test = ["alice", "blup", "test", "aa"]
manual_test = inorder (foldr add Empty test) == sort (nub test)
prop_inorder = property inorder_test
where inorder_test :: [String] -> Bool
inorder_test xs = inorder (foldr add Empty xs) == sort (nub xs)
-- return sorted nodes
inorder :: BST -> [String]
inorder (Empty) = []
inorder (MakeNode l x r) = inorder l ++ (x : inorder r)
Just load ghci and then run quickCheck prop_inorder.
Other useful functions are:
reverseOrder :: BST -> [String]
reverseOrder Empty = []
reverseOrder (MakeNode l x r) = reverseOrder r ++ (x : reverseOrder r)
asList :: BST -> [String]
asList Empty = []
asList (MakeNode l x r) = x : (asList l ++ asList r)
And also think about making your tree more general by parameterizing over a:
data BST a = Empty | MakeNode (BST a) a (BST a)
You can make it than an instance of Functor, Monad, Foldable and all kind of handy typeclasses.

I tried it and it seems ok to me. It could help if you gave an example of an input that it doesn't work for.
I think the problem may be that string comparison does not work the way you expect ("123" < "7" because "1" < "7"). If I'm right, you might want to use Ints instead of Strings or even better, the class Ord of all the types that can be ordered using (<).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

haskell binary search tree - haskell

Related

Huffman Tree decoding

Building a suffix tree by inserting each suffix in Haskell

Finding element in a binary tree

Searching rose tree in Haskell

Haskell, creating a binary search tree from a list

Categories

Resources