What is required to implement an ADT in Clojure? - haskell

Assumption: I'm aware of the ADT libraries here. They're cool. Maybe they could be better.
There is a really interesting example of ADTs in Clojure here:
We define an ADT generator like this:
(defmacro data
  [adt-name equals-sign & constructors]
  `(do
     (defn ~(symbol (str adt-name "?")) [~'obj]
       (= ~(str adt-name) (adt-name ~'obj)))
     ~@(for [[type-name & fields]
             (filter (partial not= '(|))
                     (partition-by (partial = '|) constructors))]
         (apply (partial emit-constructor adt-name type-name)
                fields))))
Given the Haskell example:
data Tree a = Empty
            | Leaf a
            | Node (Tree a) (Tree a)
Then we write the Clojure
(data Tree = Empty | Leaf value | Node left right)
Which is pretty cool.
Now I feel like there is something missing from matching up to the Haskell equivalent, but I can't quite put my finger on what it is.
My question is: What is required to implement an ADT in Clojure?

To implement an ADT in Clojure, you need to be brave and persistent.
As for the missing parts: I don't know what you are missing, but I know what I usually miss.
1) I want to automatically get a foldX function that converts to the Böhm encoding, i.e. the natural fold for this datatype.
This, however, requires the user to specify which fields refer to objects of the same type (left and right in your case).
For instance, that function, written for your example type in Haskell (God save the laziness!), looks like:
foldTree :: a -> (v -> a) -> (a -> a -> a) -> Tree v -> a
foldTree empty value node = go
  where
    go tree =
      case tree of
        Empty    -> empty
        Leaf v   -> value v
        Node l r -> node (go l) (go r)
As far as I know, Coq does this and calls it "induction".
2) I want to see predicates like isEmpty for all the branches. Seriously. The only language I know of that provides them is Pyret.
3) For bonus points, I also want to be able to derive structural equality, ordering, and to- and from-string conversions.
∞-1) To own my soul, you can also automatically generate lenses and prisms into all fields and branches, respectively.
∞) To prove your own strength, you can also generate ana-, para- and apomorphisms, since foldX is already a catamorphism.
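For comparison, here is a minimal sketch of what items 2 and 3 look like on the Haskell side. The deriving clause covers equality, ordering and string conversion; the per-branch predicates are written by hand here, and the names isEmpty, isLeaf and isNode are illustrative rather than something Haskell generates for you.

```haskell
-- The example type from the question, with the recursive fields spelled out.
data Tree a = Empty | Leaf a | Node (Tree a) (Tree a)
  deriving (Eq, Ord, Show, Read)  -- item 3: structural Eq/Ord and to/from string

-- Item 2: one predicate per constructor (hand-written, hypothetical names).
isEmpty :: Tree a -> Bool
isEmpty Empty = True
isEmpty _     = False

isLeaf :: Tree a -> Bool
isLeaf (Leaf _) = True
isLeaf _        = False

isNode :: Tree a -> Bool
isNode (Node _ _) = True
isNode _          = False
```

A Clojure macro aiming for parity would have to emit the equivalent of both parts for every constructor it sees.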

Related

How to sum tree elements using folds?

I'm trying to learn the 'folds' (only 'foldr' and 'foldl') functionality of Haskell through doing some sample coding. I have defined a Tree (not binary) like so:
data NTree a = Nil | Tree a [NTree a] deriving Show
I want to sum all the elements of the tree using a function. I have outlined the type signature and the base case of the function, but I'm not sure how to implement the logic itself using folds. This is what I have so far:
sumElements :: NTree Int -> Int
sumElements Nil = 0
sumElements tree = foldr (???) 0 tree
I really can't think of how to get started. Any help filling in the folds function would be appreciated.
You pretty much have it.
sumElements tree = foldr (+) 0 tree
In order to apply foldr to your tree, you should define an instance of Foldable for your tree type.
In short, you have to implement at least one of the two functions that make a data type "foldable": foldMap or foldr.
You can learn more in this tutorial.
(I'm also a beginner; I hope this helps you and others.)
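A minimal sketch of such an instance for the NTree type from the question, assuming foldMap is the method you choose to implement (foldr then comes for free from the class defaults):

```haskell
data NTree a = Nil | Tree a [NTree a] deriving Show

instance Foldable NTree where
  -- An empty tree contributes nothing.
  foldMap _ Nil = mempty
  -- Visit the node's own value, then every child subtree in order.
  foldMap f (Tree x kids) = f x <> foldMap (foldMap f) kids

sumElements :: NTree Int -> Int
sumElements = foldr (+) 0
```

With the instance in place, the separate Nil case in sumElements is no longer needed, and functions such as sum, length and elem work on NTree as well.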

The simplest way to generically traverse a tree in haskell

Suppose I used the language-javascript library to build an AST in Haskell. The AST has nodes of different types, and each node can have fields of those different types.
And each type can have numerous constructors. (All the types instantiate Data, Eq and Show).
I would like to count each type's constructor occurrences in the tree. I could use toConstr to get the constructor, and ideally I'd make a Tree -> [Constr] function first (then counting is easy).
There are different ways to do that. Obviously pattern matching is too verbose (imagine around 3 types with 9-28 constructors).
So I'd like to use a generic traversal, and I tried to find the solution in SYB library.
There is an everywhere function, which doesn't suit my needs since I don't need a Tree -> Tree transformation.
There is gmapQ, which seems suitable in terms of its type, but as it turns out it's not recursive.
The most viable option so far is everywhereM. It still does the useless transformation, but I can use a Writer to collect toConstr results. Still, this way doesn't really feel right.
Is there an alternative that will not perform a useless (for this task) transformation and still deliver the list of constructors? (The order of their appearance in the tree doesn't matter for now)
Not sure if it's the simplest, but:
> data T = L | B T T deriving Data
> everything (++) (const [] `extQ` (\x -> [toConstr (x::T)])) (B L (B (B L L) L))
[B,L,B,B,L,L,L]
Here ++ says how to combine the results from subterms.
const [] is the base case for subterms that are not of type T. For those of type T, we instead apply \x -> [toConstr (x::T)].
If you have multiple tree types, you'll need to extend the query using
const [] `extQ` (handleType1) `extQ` (handleType2) `extQ` ...
This is needed to identify the types for which we want to take the constructors. If there are a lot of types, probably this can be made shorter in some way.
Note that the code above is not very efficient on large trees, since using ++ in this way can lead to quadratic complexity. It would be better, performance-wise, to return a Data.Map.Map Constr Int. (Constr has no Ord instance, though, so we would need to define one or key the map on something else, such as the constructor name.)
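A rough sketch of the map-based variant, under a few assumptions: it avoids the missing Ord Constr instance by keying the map on showConstr, it recurses by hand over gmapQ (from Data.Data in base) rather than using SYB's everything, and constrCounts is a made-up name. Data.Map comes from the containers package.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQ, showConstr, toConstr)
import Data.Typeable (cast)
import qualified Data.Map.Strict as Map

data T = L | B T T deriving (Data, Show)

-- Count constructor occurrences of type T anywhere inside a value.
-- gmapQ only looks at immediate children, so the recursion is explicit.
constrCounts :: Data a => a -> Map.Map String Int
constrCounts x = Map.unionsWith (+) (here : gmapQ constrCounts x)
  where
    -- Only nodes of type T are counted, mirroring the extQ query above.
    here = case (cast x :: Maybe T) of
      Just t  -> Map.singleton (showConstr (toConstr t)) 1
      Nothing -> Map.empty
```

For the tree B L (B (B L L) L) this yields a map with B counted 3 times and L counted 4 times. Supporting more node types would mean adding one cast attempt per type, much like chaining extQ.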
universe from the Data.Generics.Uniplate.Data module can give you a list of all the sub-trees of the same type. So using Ilya's example:
data T = L | B T T deriving (Data, Show)
tree :: T
tree = B L (B (B L L) L)
λ> import Data.Generics.Uniplate.Data
λ> universe tree
[B L (B (B L L) L),L,B (B L L) L,B L L,L,L,L]
λ> fmap toConstr $ universe tree
[B,L,B,B,L,L,L]

Factoring out recursion in a complex AST

For a side project I am working on I currently have to deal with an abstract syntax tree and transform it according to rules (the specifics are unimportant).
The AST itself is nontrivial, meaning it has subexpressions which are restricted to some types only (e.g. the operator A must take an argument which is of type B only, not any Expr). A drastically simplified version of my datatype looks like this:
data Expr = List [Expr]
          | Strange Str
          | Literal Lit
data Str = A Expr
         | B Expr
         | C Lit
         | D String
         | E [Expr]
data Lit = Int Int
         | String String
My goal is to factor out the explicit recursion and rely on recursion schemes instead, as demonstrated in these two excellent blog posts, which provide very powerful general-purpose tools to operate on my AST. Applying the necessary factoring, we end up with:
data ExprF a = List [a]
             | Strange (StrF a)
             | Literal (LitF a)
data StrF a = A a
            | B a
            | C (LitF a)
            | D String
            | E [a]
data LitF a = Int Int
            | String String
If I didn't mess up, type Expr = Fix ExprF should now be isomorphic to the previously defined Expr.
However, writing cata for these cases becomes rather tedious, as I have to pattern match B a :: StrF a inside of a Strange :: ExprF a for cata to be well-typed. For the entire original AST this is infeasible.
I stumbled upon fixing GADTs, which seems to me like a solution to my problem; however, the user-unfriendly interface of the duplicated higher-order type classes etc. is quite a lot of unnecessary boilerplate.
So, to sum up my questions:
Is rewriting the AST as a GADT the correct way to go about this?
If yes, how could I transform the example into a well-working version? On a second note, is there better support for higher kinded Functors in GHC now?
If you've gone through the effort of separating out the recursion in your data type, then you can just derive Functor and you're done. You don't need any fancy features to get the recursion scheme. (As a side note, there's no reason to parameterize the Lit data type.)
The fold is:
newtype Fix f = In { out :: f (Fix f) }
gfold :: (Functor f) => (f a -> a) -> Fix f -> a
gfold alg = alg . fmap (gfold alg) . out
To specify the algebra (the alg parameter), you need to do a case analysis against ExprF, but the alternative would be to have the fold have a dozen or more parameters: one for each data constructor. That wouldn't really save you much typing and would be much harder to read. If you want (and this may require rank-2 types in general), you can package all those parameters up into a record and then you could use record update to update "pre-made" records that provide "default" behavior in various circumstances. There's an old paper Dealing with Large Bananas that takes an approach like this. What I'm suggesting, to be clear, is just wrapping the gfold function above with a function that takes a record, and passes in an algebra that will do the case analysis and call the appropriate field of the record for each case.
Of course, you could use GHC Generics or the various "generic/polytypic" programming libraries like Scrap Your Boilerplate instead of this. You are basically recreating what they do.
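To make the suggestion concrete, here is a small sketch that puts the pieces together: the functors from the question with Functor derived, Lit left unparameterized as noted above, and the gfold from this answer. The countLits algebra is a made-up example, not part of the original AST.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Lit = Int Int | String String

data StrF a = A a | B a | C Lit | D String | E [a]
  deriving Functor

data ExprF a = List [a] | Strange (StrF a) | Literal Lit
  deriving Functor

newtype Fix f = In { out :: f (Fix f) }

gfold :: Functor f => (f a -> a) -> Fix f -> a
gfold alg = alg . fmap (gfold alg) . out

-- Hypothetical algebra: count every literal in an expression.
-- Children have already been replaced by their own counts.
countLits :: Fix ExprF -> Int
countLits = gfold alg
  where
    alg (List ns)   = sum ns
    alg (Literal _) = 1
    alg (Strange s) = case s of
      A n  -> n
      B n  -> n
      C _  -> 1
      D _  -> 0
      E ns -> sum ns
```

The nested case on StrF is the case analysis the answer refers to; it is written once per algebra rather than once per recursive traversal.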

Tree Fold operation?

I am taking a class in Haskell, and we need to define the fold operation for a tree defined by:
data Tree a = Lf a | Br (Tree a) (Tree a)
I cannot seem to find any information on the "tfold" operation or what it is really supposed to do. Any help would be greatly appreciated.
I always think of folds as a way of systematically replacing constructors by other functions. So, for instance, if you have a do-it-yourself List type (defined as data List a = Nil | Cons a (List a)), the corresponding fold can be written as:
listfold nil cons Nil = nil
listfold nil cons (Cons a b) = cons a (listfold nil cons b)
or, maybe more concisely, as:
listfold nil cons = go where
  go Nil = nil
  go (Cons a b) = cons a (go b)
The type of listfold is b -> (a -> b -> b) -> List a -> b. That is to say, it takes two 'replacement constructors'; one telling how a Nil value should be transformed into a b, another replacement constructor for the Cons constructor, telling how the first value of the Cons constructor (of type a) should be combined with a value of type b (why b? because the fold has already been applied recursively!) to yield a new b, and finally a List a to apply the whole she-bang to - with a result of b.
In your case, the type of tfold should be (a -> b) -> (b -> b -> b) -> Tree a -> b by analogous reasoning; hopefully you'll be able to take it from there!
Imagine you define that a tree should be shown in the following manner,
<1 # <<2#3> # <4#5>>>
Folding such a tree means replacing each branch node with an actual supplied operation to be performed on the results of fold recursively performed on the data type's constituents (here, the node's two child nodes, which are themselves, each, a tree), for example with +, producing
(1 + ((2+3) + (4+5)))
So, for leaves you should just take the values inside them, and for branches, recursively apply the fold for each of the two child nodes, and combine the two results with the supplied function, the one with which the tree is folded. (edit:) When "taking" values from leaves, you could additionally transform them, applying a unary function. So in general, your folding will need two user-provided functions, one for leaves, Lf, and another one for combining the results of recursively folding the tree-like constituents (i.e. branches) of the branching nodes, Br.
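One possible sketch of the fold described above, assuming the two-function shape (one function for leaves, one for combining the results from the two children). The names tfold and example are illustrative, and an assignment may ask for a different variant.

```haskell
data Tree a = Lf a | Br (Tree a) (Tree a)

-- Replace every Lf with `leaf` and every Br with `branch`.
tfold :: (a -> b) -> (b -> b -> b) -> Tree a -> b
tfold leaf branch = go
  where
    go (Lf x)   = leaf x
    go (Br l r) = branch (go l) (go r)

-- The tree drawn above: <1 # <<2#3> # <4#5>>>
example :: Tree Int
example = Br (Lf 1) (Br (Br (Lf 2) (Lf 3)) (Br (Lf 4) (Lf 5)))
```

Here tfold id (+) example adds the leaves up to 15, while passing show for the leaves and a string-building function for the branches would rebuild a parenthesized rendering of the same shape.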
Your tree data type could have been defined differently, e.g. with possibly empty leaves, and with internal nodes also carrying the values. Then you'd have to provide a default value to be used instead of the empty leaf nodes, and a three-way combination operation. Still you'd have the fold defined by two functions corresponding to the two cases of the data type definition.
Another distinction to realize here is, what you fold, and how you fold it. I.e. you could fold your tree in a linear fashion, (1+(2+(3+(4+5)))) == ((1+) . (2+) . (3+) . (4+) . (5+)) 0, or you could fold a linear list in a tree-like fashion, ((1+2)+((3+4)+5)) == (((1+2)+(3+4))+5). It is all about how you parenthesize the resulting "expression". Of course in the classic take on folding the expression's structure follows that of the data structure being folded; but variations do exist. Note also, that the combining operation might not be strict, and the "result" type it consumes/produces might express compound (lists and such), as well as atomic (numbers and such), values.
(update 2019-01-26) This re-parenthesization is possible if the combining operation is associative, like +: (a1+a2)+a3 == a1+(a2+a3). A data type together with such associative operation and a "zero" element (a+0 == 0+a == a) is known as "Monoid", and the notion of folding "into" a Monoid is captured by the Foldable type class.
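As a small illustration of folding "into" a Monoid, here is a sketch of a Foldable instance for the same tree type, defined through foldMap. Sum comes from Data.Monoid; the helper names sumTree and leaves are made up for this example.

```haskell
import Data.Foldable (toList)
import Data.Monoid (Sum (..))

data Tree a = Lf a | Br (Tree a) (Tree a)

instance Foldable Tree where
  -- Map each leaf into the monoid and combine the two sides with <>.
  foldMap f (Lf x)   = f x
  foldMap f (Br l r) = foldMap f l <> foldMap f r

-- Fold into the (Sum, 0) monoid, i.e. add up all leaves.
sumTree :: Tree Int -> Int
sumTree = getSum . foldMap Sum

-- Fold into the list monoid, i.e. list leaves left to right.
leaves :: Tree a -> [a]
leaves = toList
```

Because <> is associative, the result does not depend on how the tree happens to be parenthesized, only on the left-to-right order of its leaves.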
A fold on a list is a reduction from a list into a single element. It takes a function and then applies that function to elements, two at a time, until it has only one element. For example:
Prelude> foldl1 (+) [3,5,6,7]
21
...is found by doing operations one-by-one:
3 + 5 == 8
8 + 6 == 14
14 + 7 == 21
A fold can be written
ourFold :: (a -> a -> a) -> [a] -> a
ourFold _ [a] = a -- pattern-match for a single-element list. Our work is done.
ourFold aFunction (x0:x1:xs) = ourFold aFunction ((aFunction x0 x1):xs)
A tree fold would do this, but move up or down the branches of the tree. To do this, it first needs to pattern-match to see whether you're operating on a Leaf or a Branch.
treeFold _ (Lf a) = Lf a -- You can't do much to a one-leaf tree
treeFold f (Br a b) = -- ...
The rest is left up to you, since it's homework. If you're stuck, try first thinking of what the type should be.
A fold is an operation which "compacts" a data structure into a single value using an operation. There are variations depending if you have a start value and execution order (e.g. for lists you have foldl, foldr, foldl1 and foldr1), so the correct implementation depends on your assignment.
I guess your tfold should simply replace all leaves with their values, and all branches with applications of the given operation. Draw an example tree with some numbers, and "collapse" it given an operation like (+). After this, it should be easy to write a function doing the same.

Haskell add value to tree

I'm trying to make a function which allows me to add a new value to a tree IF the value at the given path is equal to ND (no data). This was my first attempt.
It checks the value etc., but the problem is that I want to be able to print the modified tree with the new data. Can anyone give me any pointers? I have also tried making a second function that checks the path to see if it's OK to add data, but I'm lost as to how to print out the modified tree.
As iuliux points out, your problem is that you are treating your BTree as though it were a mutable structure. Remember functions in haskell take arguments and return a value. That is all. So when you "map over" a list, or traverse a tree your function needs to return a new tree.
The code you have is traversing the recursive tree and only returning the last leaf. Imagine for now that the leaf at the end of the path will always be ND. This is what you want:
add :: a -> Path -> Btree a -> Btree a
add da xs ND = Data da
add _ [] _ = error "You should make sure this doesn't happen or handle it"
add da (x:xs) (Branch st st2) =
  case x of
    L -> Branch (add da xs st) st2
    R -> Branch st (add da xs st2)
Notice how in your original code you discard the Branch you pattern match against, when what you need to do is return it "behind you" as it were.
Now, on to the issue of handling situations where the leaf you arrive at is not an ND constructor:
This type of problem is common in functional programming. How can you return your recursive data structure "as you go" when the final result depends on a leaf far down the tree?
One solution for the trickiest of cases is the Zipper, which is a data structure that lets you go up, down and sideways as you please. For your case that would be overkill.
I would suggest you change your function to the following:
add :: a -> Path -> Btree a -> Maybe (Btree a)
which means at each level you must return a Maybe (Btree a). Then use the Functor instance of Maybe in your recursive calls. Notice:
fmap (+1) (Just 2) == Just 3
fmap (+1) (Nothing) == Nothing
You should try to puzzle out the implementation for yourself!
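If you get stuck, here is one possible sketch. The question's own definitions are not shown, so Btree, Direction and Path below are assumptions reconstructed from the constructors used in the code above. This version also assumes the path must end exactly at an ND leaf; anything else yields Nothing.

```haskell
-- Assumed definitions (hypothetical, inferred from the code above).
data Btree a = ND | Data a | Branch (Btree a) (Btree a)
  deriving (Eq, Show)

data Direction = L | R deriving (Eq, Show)

type Path = [Direction]

add :: a -> Path -> Btree a -> Maybe (Btree a)
-- Path used up and we landed on an empty slot: fill it.
add da [] ND = Just (Data da)
-- Descend, and rebuild this Branch around the result only on success.
add da (L : xs) (Branch l r) = fmap (\l' -> Branch l' r) (add da xs l)
add da (R : xs) (Branch l r) = fmap (Branch l) (add da xs r)
-- Occupied leaf, path too short, or path too long.
add _ _ _ = Nothing
```

The fmap calls are what carry a Nothing from deep in the tree back up to the caller without any explicit checks at each level.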
I'm no expert in Haskell, but functional programming works only with functions, so almost everything is a function.
Now, your function takes some input and returns something without modifying the input. You have to keep the returned tree somewhere, and that will be your new tree, the one with the inserted element in it.
We really need to see the Path and Error data types to answer your question, but you can print out your trees using the IO monad:
main :: IO ()
main = do
  let b = Branch ND (Branch (Data 1) (Data 2))
  let b1 = add 10 [L] b -- actual call depends on definition of Path
  print b1

Resources