Haskell add value to tree - haskell

I'm trying to make a function which allows me to add a new value to a tree IF the value at the given path is equal to ND (no data); this was my first attempt.
It checks the value, but the problem is that I want to be able to print the modified tree with the new data. Can anyone give me any pointers? I have also tried making a second function that checks the path to see if it's OK to add data, but I'm lost as to how to print out the modified tree.

As iuliux points out, your problem is that you are treating your Btree as though it were a mutable structure. Remember that functions in Haskell take arguments and return a value; that is all. So when you "map over" a list or traverse a tree, your function needs to return a new tree.
The code you have traverses the tree recursively but only returns the final leaf. Imagine for now that the leaf at the end of the path will always be ND. This is what you want:
add :: a -> Path -> Btree a -> Btree a
add da xs ND = Data da
add _ [] _ = error "You should make sure this doesn't happen or handle it"
add da (x:xs) (Branch st st2) =
  case x of
    L -> Branch (add da xs st) st2
    R -> Branch st (add da xs st2)
Notice how in your original code you discard the Branch you pattern match against, when what you need to do is rebuild it "behind you" as you recurse.
Now, on to the issue of handling situations where the leaf you arrive at is not an ND constructor:
This type of problem is common in functional programming. How can you return your recursive data structure "as you go" when the final result depends on a leaf far down the tree?
One solution for the trickiest of cases is the Zipper, which is a data structure that lets you move up, down, and sideways as you please. For your case that would be overkill.
I would suggest you change your function to the following:
add :: a -> Path -> Btree a -> Maybe (Btree a)
which means at each level you must return a Maybe (Btree a). Then use the Functor instance of Maybe in your recursive calls. Notice:
fmap (+1) (Just 2) == Just 3
fmap (+1) (Nothing) == Nothing
You should try to puzzle out the implementation for yourself!

I'm no expert in Haskell, but functional programming only works with functions, so more or less everything is a function.
Now, your function takes some input and returns something; it does not modify the input. You have to retain the returned tree somewhere, and that will be your new tree, the one with the inserted element in it.
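A minimal sketch of what "retaining the returned tree" means (the names and values here are assumptions for illustration, not from the question); in GHCi or inside a do block:
let oldTree = Branch ND ND
    newTree = add 5 [L] oldTree   -- add builds and returns a new tree; oldTree is unchanged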

We really need to see the Path and Error data types to answer your question, but you can print out your trees using the IO Monad:
main :: IO ()
main = do
  let b  = Branch ND (Branch (Data 1) (Data 2))
      b1 = add 10 [L] b   -- the actual call depends on the definition of Path
  print b1

Related

Accessing values in haskell custom data type

I'm very new to Haskell and need to use a specific data type for a problem I am working on.
data Tree a = Leaf a | Node [Tree a]
deriving (Show, Eq)
So when I make an instance of this, e.g. Node [Leaf 1, Leaf 2, Leaf 3], how do I access these? It won't let me use head or tail or indexing with !!.
You perform pattern matching. For example if you want the first child, you can use:
firstChild :: Tree a -> Maybe (Tree a)
firstChild (Node (h:_)) = Just h
firstChild _ = Nothing
Here we wrap the answer in a Maybe type, since it is possible that we process a Leaf x or a Node [], such that there is no first child.
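For example (a small usage illustration of my own):
firstChild (Node [Leaf 1, Leaf 2, Leaf 3])   -- Just (Leaf 1)
firstChild (Node [])                         -- Nothing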
Or we can for instance obtain the i-th item with:
iThChild :: Int -> Tree a -> Tree a
iThChild i (Node cs) = cs !! i
So here we unwrap the Node constructor, obtain the list of children cs, and then evaluate cs !! i to obtain the i-th child. Note however that (!!) :: [a] -> Int -> a is usually a bit of an anti-pattern: it is unsafe, since we have no guarantee that the list contains enough elements, and guarding it with length is an anti-pattern as well, since the list can have infinite length, so we cannot do such a bounds check.
Usually if one writes algorithms in Haskell, one tends to make use of linear access and to write total functions: functions that always return something.
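Putting this together, here is a small sketch (my own, not from the answer) of a total function over this Tree that needs no unsafe indexing: it collects every value stored in the leaves using only pattern matching and recursion.
leaves :: Tree a -> [a]
leaves (Leaf x)  = [x]
leaves (Node ts) = concatMap leaves ts
For instance, leaves (Node [Leaf 1, Leaf 2, Leaf 3]) evaluates to [1,2,3].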

What is required to implement an ADT in Clojure?

Assumption: I'm aware of the ADT libraries here. They're cool. Maybe they could be better.
There is a really interesting example of ADTs in Clojure here:
We define an ADT generator like this:
(defmacro data
  [adt-name equals-sign & constructors]
  `(do
     (defn ~(symbol (str adt-name "?")) [~'obj]
       (= ~(str adt-name) (adt-name ~'obj)))
     ~@(for [[type-name & fields]
             (filter (partial not= '(|))
                     (partition-by (partial = '|) constructors))]
         (apply (partial emit-constructor adt-name type-name)
                fields))))
Given the Haskell example:
data Tree a = Empty
            | Leaf a
            | Node (Tree a) (Tree a)
Then we write the Clojure
(data Tree = Empty | Leaf value | Node left right)
Which is pretty cool.
Now I feel like there is something missing from matching up to the Haskell equivalent, but I can't quite put my finger on what it is.
My question is: What is required to implement an ADT in Clojure?
To implement an ADT in Clojure you're required to be brave and insistent.
As for the missing parts - I don't know what you are missing, but I know what I usually miss.
1) I want to automatically get some foldX function that performs the conversion to the Boehm encoding - a natural fold for this datatype.
This, however, requires the user to specify which fields must refer to objects of the same type (left and right in your case).
For instance, that function, written for your example type in Haskell (God save the laziness!), will look like:
foldTree :: a -> (v -> a) -> (a -> a -> a) -> Tree v -> a
foldTree empty value node = go
  where
    go tree =
      case tree of
        Empty    -> empty
        Leaf v   -> value v
        Node l r -> node (go l) (go r)
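For example (an illustration of my own, reusing the foldTree above), counting and summing the values stored in a tree fall out of the fold directly:
size :: Tree v -> Int
size = foldTree 0 (const 1) (+)

total :: Num v => Tree v -> v
total = foldTree 0 id (+)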
This is done in Coq, as far as I know, and is called "induction".
2) I want to see predicates like isEmpty for all the branches. Seriously. The only language providing them is Pyret.
3) For bonus points, I also want to have some ability to derive structural Equality, Ordering, to- and from-string conversion.
∞-1) To own my soul, you can also automatically generate lenses and prisms into all fields and branches accordingly.
∞) To prove your own strength, you can also generate ana-, para- and apomorphisms, since foldX is already a catamorphism.

How to go up on a tree in haskell?

Maybe the question title does not quite suit the topic I am going to approach, but I will try to explain as best I can:
I have a genealogical tree that has this structure:
data GT = Person Name Father Mother
        | Unknown
type Father = GT
type Mother = GT
type Name = String
I need to find out the grandfathers of a given name:
grandfathers :: Name -> GT -> [Name]
This is the best I could do:
grandfathers :: Name -> GT -> [Name]
grandfathers s (Person x f m) = if searchgrandson s f 0 then [x] else grandfathers s f
  where
    searchgrandson s Unknown k        = False
    searchgrandson s (Person x f m) k = if s == x && k < 2 then True
                                        else searchgrandson s f (k + 1)
Of course, for this tree it works, because my code goes all the way down the left side, ignoring the mother's side, and gives only the grandfather, not the grandmother, right?
grandfathers "grandson" (Person "grandpa" (Person "son" (Person "grandson" Unknown Unknown) Unknown) Unknown)
["grandpa"]
EDIT:
After following dfeuer's advice:
avos_ :: Nome -> AG -> Int
avos_ s Desconhecida = 0
avos_ s l@(Pessoa x p m) = encontraneto s l 0
  where
    encontraneto s Desconhecida k   = 0
    encontraneto s (Pessoa x p m) k = if s == x then k
                                      else encontraneto s p (k + 1) + encontraneto s m (k + 1)
This gives me the depth of the tree at which the grandson is, searching both sides. After this it should be simple...
Thanks!
A good way to go about solving this problem is to decompose it into two functions. One of the functions will search the tree for zero or more matches against a name and return zero or more trees, starting from the place where the match occurred; the root of each tree will be the person that matches the name. The other function will simply take a tree and return zero, one, two, three or four names corresponding to the grandparents of the person at the root of the tree.
The type of the first function is easy:
searchPerson :: Name -> GT -> [GT]
The type of the second function is also straightforward:
getGrandparents :: GT -> [Name]
Now, since the first function returns a list of GT rather than a single GT, the second function should be mapped over the results of the first function (or, said another way, lifted to operate on lists of GT):
map getGrandparents (searchPerson x myGenTree)
(If you need a flat [Name] at the end, concatenate the resulting lists, e.g. with concatMap.)
Writing the second function (getGrandparents) is trivial and can be achieved with pattern matching, deconstructing the GT type up to the second level of nesting, but you would probably like to save yourself some typing and create a helper function that operates on each parent and returns their parents.
The first function (searchPerson) is also trivial, and can be done in a couple of ways. One way is to simply use recursion, looking for a match on a name and returning a list of all the subtrees that had a match. The other option is to return a list of every possible subtree, starting from any possible offset, and use a filter to keep only those subtrees whose root is a match. The second option is more "wholesome" and equivalent to addressing a similar problem on lists (the function that returns all suffixes is tails), so it will probably appeal to some people more. You may think that the second approach is wasteful, but Haskell's immutability and laziness should make it pretty efficient too.
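As a hint of the shape only (my own sketch, assuming the constructor is spelled Unknown as above), the tree analogue of tails enumerates every subtree; searchPerson is then just a filter over this list on the root's name:
subtrees :: GT -> [GT]
subtrees Unknown          = []
subtrees t@(Person _ f m) = t : subtrees f ++ subtrees m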
I quickly put together the code in my fpcomplete account and will gladly post the URL in a comment below if you ask, but I don't want to spoil the fun of coming up with the solution on your own.

Is there a way to avoid copying the whole search path of a binary tree on insert?

I've just started working my way through Okasaki's Purely Functional Data Structures, but have been doing things in Haskell rather than Standard ML. However, I've come across an early exercise (2.5) that's left me a bit stumped on how to do things in Haskell:
Inserting an existing element into a binary search tree copies the entire search path
even though the copied nodes are indistinguishable from the originals. Rewrite insert using exceptions to avoid this copying. Establish only one handler per insertion rather than one handler per iteration.
Now, my understanding is that ML, being an impure language, gets by with a conventional approach to exception handling not so different from, say, Java's, so you can accomplish it with something like this:
datatype Tree = E | T of Tree * int * Tree
exception ElementPresent
fun insert (x, t) =
  let
    fun go E = T (E, x, E)
      | go (T (l, y, r)) =
          if x < y then T (go l, y, r)
          else if y < x then T (l, y, go r)
          else raise ElementPresent
  in
    go t
  end
  handle ElementPresent => t
I don't have an ML implementation, so this may not be quite right in terms of the syntax.
My issue is that I have no idea how this can be done in Haskell, outside of doing everything in the IO monad, which seems like cheating and even if it's not cheating, would seriously limit the usefulness of a function which really doesn't do any mutation. I could use the Maybe monad:
data Tree a = Empty | Fork (Tree a) a (Tree a)
deriving (Show)
insert :: (Ord a) => a -> Tree a -> Tree a
insert x t = maybe t id (go t)
  where
    go Empty = return (Fork Empty x Empty)
    go (Fork l y r)
      | x < y     = do l' <- go l; return (Fork l' y r)
      | x > y     = do r' <- go r; return (Fork l y r')
      | otherwise = Nothing
This means everything winds up wrapped in Just on the way back up when the element isn't found, which requires more heap allocation, and sort of defeats the purpose. Is this allocation just the price of purity?
EDIT to add: A lot of why I'm wondering about the suitability of the Maybe solution is that the optimization described only seems to save you all the constructor calls you would need in the case where the element already exists, which means heap allocations proportional to the length of the search path. The Maybe also avoids those constructor calls when the element already exists, but then you get a number of Just constructor calls equal to the length of the search path. I understand that a sufficiently smart compiler could elide all the Just allocations, but I don't know if, say, the current version of GHC is really that smart.
In terms of cost, the ML version is actually very similar to your Haskell version.
Every recursive call in the ML version results in a stack frame. The same is true in the
Haskell version. This is going to be proportional in size to the path that you traverse in
the tree. Also, both versions will of course allocate new nodes for the entire path if an insertion is actually performed.
In your Haskell version, every recursive call might also eventually result in the
allocation of a Just node. This will go on the minor heap, which is just a block of
memory with a bump pointer. For all practical purposes, GHC's minor heap is roughly equivalent in
cost to the stack. Since these are short-lived allocations, they won't normally end up
being moved to the major heap at all.
GHC generally cannot elide path copying in cases like that. However, there is a way to do it manually, without incurring any of the indirection/allocation costs of Maybe. Here it is:
{-# LANGUAGE MagicHash #-}
import GHC.Prim (reallyUnsafePtrEquality#)

data Tree a = Empty | Fork (Tree a) a (Tree a)
  deriving (Show)

insert :: (Ord a) => a -> Tree a -> Tree a
insert x Empty = Fork Empty x Empty
insert x node@(Fork l y r)
  | x < y = let l' = insert x l in
              case reallyUnsafePtrEquality# l l' of
                1# -> node
                _  -> Fork l' y r
  | x > y = let r' = insert x r in
              case reallyUnsafePtrEquality# r r' of
                1# -> node
                _  -> Fork l y r'
  | otherwise = node
The pointer equality function does exactly what's in the name. Here it is safe because even if the equality returns a false negative we only do a bit of extra copying, and nothing worse happens.
It's not the most idiomatic or prettiest Haskell, but the performance benefits can be significant. In fact, this trick is used very frequently in unordered-containers.
As fizruk indicates, the Maybe approach is not significantly different from what you'd get in Standard ML. Yes, the whole path is copied, but the new copy is discarded if it turns out not to be needed. The Just constructor itself may not even be allocated on the heap—it can't escape from insert, let alone the module, and you don't do anything weird with it, so the compiler is free to analyze it to death.
Edit
There are efficiency problems, now that I think of it. Your use of Maybe conceals the fact that you're actually making two passes—one down to find the insertion point and one up to build the tree. The solution to this is to drop Maybe Tree in favor of (Tree,Bool) and use strictness annotations, or to switch to continuation-passing style. Also, if you choose to stay with the three-way logic, you may want to use the three-way comparison function. Alternatively, you can go all the way to the bottom each time and check later if you hit a duplicate.
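A minimal sketch of the (Tree, Bool) variant (my own illustration, not the answer's code, with the suggested strictness annotations omitted): the Bool reports whether the element was already present, so every level can hand back the original node untouched in that case.
insertFlag :: Ord a => a -> Tree a -> (Tree a, Bool)
insertFlag x Empty = (Fork Empty x Empty, False)
insertFlag x t@(Fork l y r)
  | x < y     = case insertFlag x l of
                  (_,  True)  -> (t, True)            -- duplicate found below: reuse this node
                  (l', False) -> (Fork l' y r, False)
  | x > y     = case insertFlag x r of
                  (_,  True)  -> (t, True)
                  (r', False) -> (Fork l y r', False)
  | otherwise = (t, True)                             -- x is already here

-- insert x t would then be fst (insertFlag x t)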
If you have a predicate that checks whether the key is already in the tree, you can look before you leap:
insert x t = if contains t x then t else insert' x t
This traverses the tree twice, of course. Whether that's as bad as it sounds should be determined empirically: it might just load the relevant part of the tree into the cache.
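For completeness, here is one possible contains for the Tree above (a sketch of my own, not part of the original answer):
contains :: Ord a => Tree a -> a -> Bool
contains Empty _ = False
contains (Fork l y r) x
  | x < y     = contains l x
  | x > y     = contains r x
  | otherwise = True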

Tree Fold operation?

I am taking a class in Haskell, and we need to define the fold operation for a tree defined by:
data Tree a = Lf a | Br (Tree a) (Tree a)
I cannot seem to find any information on the "tfold" operation or really what it is supposed to do. Any help would be greatly appreciated.
I always think of folds as a way of systematically replacing constructors by other functions. So, for instance, if you have a do-it-yourself List type (defined as data List a = Nil | Cons a (List a)), the corresponding fold can be written as:
listfold nil cons Nil = nil
listfold nil cons (Cons a b) = cons a (listfold nil cons b)
or, maybe more concisely, as:
listfold nil cons = go where
go Nil = nil
go (Cons a b) = cons a (go b)
The type of listfold is b -> (a -> b -> b) -> List a -> b. That is to say, it takes two 'replacement constructors': one telling how a Nil value should be transformed into a b; another for the Cons constructor, telling how the first value of the Cons (of type a) should be combined with a value of type b (why b? because the fold has already been applied recursively!) to yield a new b; and finally a List a to apply the whole shebang to, with a result of type b.
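For example (usage illustrations of my own):
listfold 0 (+) (Cons 1 (Cons 2 (Cons 3 Nil)))          -- sum: 6
listfold 0 (\_ n -> 1 + n) (Cons 'a' (Cons 'b' Nil))   -- length: 2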
In your case, the type of tfold should be (a -> b) -> (b -> b -> b) -> Tree a -> b by analogous reasoning; hopefully you'll be able to take it from there!
Imagine you define that a tree should be shown in the following manner,
<1 # <<2#3> # <4#5>>>
Folding such a tree means replacing each branch node with a supplied operation, applied to the results of the fold performed recursively on the data type's constituents (here, the node's two child nodes, which are themselves each a tree); for example with +, producing
(1 + ((2+3) + (4+5)))
So, for leaves you should just take the values inside them, and for branches, recursively apply the fold for each of the two child nodes, and combine the two results with the supplied function, the one with which the tree is folded. (edit:) When "taking" values from leaves, you could additionally transform them, applying a unary function. So in general, your folding will need two user-provided functions, one for leaves, Lf, and another one for combining the results of recursively folding the tree-like constituents (i.e. branches) of the branching nodes, Br.
Your tree data type could have been defined differently, e.g. with possibly empty leaves, and with internal nodes also carrying the values. Then you'd have to provide a default value to be used instead of the empty leaf nodes, and a three-way combination operation. Still you'd have the fold defined by two functions corresponding to the two cases of the data type definition.
Another distinction to realize here is, what you fold, and how you fold it. I.e. you could fold your tree in a linear fashion, (1+(2+(3+(4+5)))) == ((1+) . (2+) . (3+) . (4+) . (5+)) 0, or you could fold a linear list in a tree-like fashion, ((1+2)+((3+4)+5)) == (((1+2)+(3+4))+5). It is all about how you parenthesize the resulting "expression". Of course in the classic take on folding the expression's structure follows that of the data structure being folded; but variations do exist. Note also, that the combining operation might not be strict, and the "result" type it consumes/produces might express compound (lists and such), as well as atomic (numbers and such), values.
(update 2019-01-26) This re-parenthesization is possible if the combining operation is associative, like +: (a1+a2)+a3 == a1+(a2+a3). A data type together with such an associative operation and a "zero" element (a+0 == 0+a == a) is known as a "Monoid", and the notion of folding "into" a Monoid is captured by the Foldable type class.
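A quick GHCi illustration of that associativity (my own example, using the standard Sum wrapper from Data.Monoid):
Prelude> import Data.Monoid (Sum(..))
Prelude Data.Monoid> getSum ((Sum 1 <> Sum 2) <> Sum 3)
6
Prelude Data.Monoid> getSum (Sum 1 <> (Sum 2 <> Sum 3))
6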
A fold on a list is a reduction from a list into a single element. It takes a function and then applies that function to elements, two at a time, until it has only one element. For example:
Prelude> foldl1 (+) [3,5,6,7]
21
...is found by doing operations one-by-one:
3 + 5 == 8
8 + 6 == 14
14 + 7 == 21
A fold can be written
ourFold :: (a -> a -> a) -> [a] -> a
ourFold _ [a] = a -- pattern-match for a single-element list. Our work is done.
ourFold aFunction (x0:x1:xs) = ourFold aFunction ((aFunction x0 x1):xs)
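For instance (my own check against the foldl1 example above):
ourFold (+) [3,5,6,7]   -- 21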
A tree fold would do this, but move up or down the branches of the tree. To do this, it first needs to pattern-match to see whether you're operating on a leaf (Lf) or a branch (Br).
treeFold _ (Lf a) = Lf a -- You can't do much to a one-leaf tree
treeFold f (Br a b) = -- ...
The rest is left up to you, since it's homework. If you're stuck, try first thinking of what the type should be.
A fold is an operation which "compacts" a data structure into a single value using an operation. There are variations depending on whether you have a start value and on the execution order (e.g. for lists you have foldl, foldr, foldl1 and foldr1), so the correct implementation depends on your assignment.
I guess your tfold should simply replace all leaves with their values, and all branches with applications of the given operation. Draw an example tree with some numbers and "collapse" it given an operation like (+). After this, it should be easy to write a function doing the same.
