Haskell AST Annotation with Fix

Haskell AST Annotation with Fix - haskell

I am working on creating an AST in Haskell. I want to add different annotations, such as types and location information, so I ended up using fixplate. However, I can't find any examples online and am having some difficulty.
I've set up my AST as recommended by fixplate (some striped out):
data ProgramF a
= Unary a
Operator
| Number Int
| Let { bindings :: [(Identifier, a)]
, body :: a }
type Program = Mu ProgramF
Next to add a label I created another type, and a function to add labels based on a tree traversal.
type LabelProgram = Attr ProgramF PLabel
labelProgram :: Program -> LabelProgram
labelProgram =
annMap (PLabel . show . fst) . (snd . synthAccumL (\i x -> (i + 1, (i, x))) 0)
However, beyond this I am running into some issues. For example, I am trying to write a function that does some transformation on the AST. Because it requires a label to function, I've made the type LabelProgram -> Program, but I think I am doing something wrong here. Below is a snippet of part of the function (one of the simpler parts):
toANF :: LabelProgram -> Program
toANF (Fix (Ann label (Let {bindings, body}))) = Fix $ Let bindingANF nbody
where
bindingANF = map (\(i, e) -> (i, toANF e)) bindings
nbody = toANF body
I feel like I am working at the wrong level of abstraction here. Should I be explicitly matching against Fix Ann ... and returning Fix ... like this, or am I utilizing fixplate wrong?
Additionally, I am concerned about how to generalize functions. How can I make my functions work for Programs, LabelPrograms, and TypePrograms generically?

Edit: Add an example of a function for ProgramFs with generic annotations.
Yes, at least in the case of toANF, you're using it wrong.
In toANF, note that your Let bindingANF nbody and the companion definitions of bindingANF and nbody are just a reimplementation of fmap toANF for the specific constructor Let.
That is, if you derive a Functor instance for your ProgramF, then you can re-write your toANF snippet as:
toANF :: LabelProgram -> Program
toANF (Fix (Ann label l#(Let _ _))) = Fix (fmap toANF l)
If toANF is just stripping labels, then this definition works for all constructors and not just Let, so you can drop the pattern:
toANF :: LabelProgram -> Program
toANF (Fix (Ann label l)) = Fix (fmap toANF l)
and now, as per #Regis_Kuckaertz's comment, you've just re-implemented forget which is defined as:
forget = Fix . fmap forget . unAnn . unFix
With respect to writing functions that work generically on Program, LabelProgram, etc., I think it makes more sense to write functions generic in a (single) annotation:
foo :: Attr ProgramF a -> Attr ProgramF a
and, if you really need to apply them to the unannotated program, define:
type ProgramU = Attr ProgramF ()
where the "U" in ProgramU stands for "unit". Obviously, you can easily write translators to work with Programs as ProgramUs if really needed:
toU :: Functor f => Mu f -> Attr f ()
toU = synthetise (const ())
fromU :: Functor f => Attr f () -> Mu f
fromU = forget
mapU :: (Functor f) => (Attr f () -> Attr f ()) -> Mu f -> Mu f
mapU f = fromU . f . toU
foo' :: Mu ProgramF -> Mu ProgramF
foo' = mapU foo
As a concrete -- if stupid -- example, here's a function that separates Lets with multiple bindings into nested Lets with singleton bindings (and so breaks mutually recursive bindings in the Program language). It assumes that the annotation on a multi-binding Let will be copied to each of the resulting singleton Lets:
splitBindings :: Attr ProgramF a -> Attr ProgramF a
splitBindings (Fix (Ann a (Let (x:y:xs) e)))
= Fix (Ann a (Let [x] (splitBindings (Fix (Ann a (Let (y:xs) e))))))
splitBindings (Fix e) = Fix (fmap splitBindings e)
It can be applied to an example Program:
testprog :: Program
testprog = Fix $ Unary (Fix $ Let [(Identifier "x", Fix $ Number 1),
(Identifier "y", Fix $ Number 2)]
(Fix $ Unary (Fix $ Number 3) NegOp))
NegOp
like so:
> mapU splitBindings testprog
Fix (Unary (Fix (Let {bindings = [(Identifier "x",Fix (Number 1))],
body = Fix (Let {bindings = [(Identifier "y",Fix (Number 2))],
body = Fix (Unary (Fix (Number 3)) NegOp)})})) NegOp)
>
Here's my full working example:
{-# LANGUAGE DeriveFunctor #-}
{-# OPTIONS_GHC -Wall #-}
import Data.Generics.Fixplate
data Identifier = Identifier String deriving (Show)
data PLabel = PLabel deriving (Show)
data Operator = NegOp deriving (Show)
data ProgramF a
= Unary a
Operator
| Number Int
| Let { bindings :: [(Identifier, a)]
, body :: a }
deriving (Show, Functor)
instance ShowF ProgramF where showsPrecF = showsPrec
type Program = Mu ProgramF
type LabelProgram = Attr ProgramF PLabel
splitBindings :: Attr ProgramF a -> Attr ProgramF a
splitBindings (Fix (Ann a (Let (x:y:xs) e)))
= Fix (Ann a (Let [x] (splitBindings (Fix (Ann a (Let (y:xs) e))))))
splitBindings (Fix e) = Fix (fmap splitBindings e)
toU :: Functor f => Mu f -> Attr f ()
toU = synthetise (const ())
fromU :: Functor f => Attr f () -> Mu f
fromU = forget
mapU :: (Functor f) => (Attr f () -> Attr f ()) -> Mu f -> Mu f
mapU f = fromU . f . toU
testprog :: Program
testprog = Fix $ Unary (Fix $ Let [(Identifier "x", Fix $ Number 1),
(Identifier "y", Fix $ Number 2)]
(Fix $ Unary (Fix $ Number 3) NegOp))
NegOp
main :: IO ()
main = print $ mapU splitBindings testprog

Related

Defining a functor instance for this datatype

I'm attempting to define a functor instance for datatype Terms, which looks like:
data Terms a = Var a | App (Terms a) (Terms a) | Abs (Terms (Maybe a))
| Const Integer | Add (Terms a) (Terms a)
| IfZero (Terms a) (Terms a) (Terms a) | Y
deriving (Show,Eq)
However, I'm having trouble with the case for Abs.
My definition currently looks like this:
instance Functor Terms where
fmap f (Var x) = Var (f x)
fmap f (App t1 t2) = App (fmap f t1) (fmap f t2)
fmap f (Abs t) = Abs (fmap f t)
fmap f (Const n) = Const n
fmap f (Add t1 t2) = Add (fmap f t1) (fmap f t2)
fmap f (IfZero t1 t2 t3) = IfZero (fmap f t1) (fmap f t2) (fmap f t3)
fmap f Y = Y
I've tried several different things to get around the Maybe in order to apply the function to the term, but there's just something I'm not quite getting. I know the Maybe type is a functor on its own which means that fmap should work on it automatically, I'm just not sure how to work around the fact that it can't return as a Maybe.

The error comes from this line:
fmap f (Abs t) = Abs (fmap f t)
Abs contains a Maybe (Term a), and you want to get a Maybe (Term b), but f is of type a -> b. So, when you try to fmap it, you're passing a Term a to a function that takes in a. Clearly this doesn't work. Instead, make a Term a -> Term b with fmap f, then fmap that (creating a double fmap):
fmap f (Abs t) = Abs (fmap (fmap f) t)

Yes Maybe is a functor on its own and thus it has its own fmap:
fmap f (Abs t) = Abs (fmap₁ (fmap₂ f) t)
-- t :: Terms (Maybe a)
-- f :: a -> b
-- fmap₂ f :: Maybe a -> Maybe b
-- fmap₁ (fmap₂ f) :: Terms (Maybe a) -> Terms (Maybe b)
The subscript indices are not part of the code and are there for illustrative purposes only. Or we could use TypeApplications to distinguish them:
*Main> :set -XTypeApplications
*Main> fmap #Maybe (+1) $ Just 7
Just 8
*Main> fmap #Terms (+1) $ Abs $ Var $ Just 8
Abs (Var (Just 9))
but there's no need for them here and plain fmaps work as well -- Haskell knows which to apply, according to the argument's type:
*Main> fmap (+1) $ Just 7
Just 8
*Main> fmap (+1) $ Abs $ Var $ Just 8
Abs (Var (Just 9))

Another option is to move the composition structure into a datatype by turning Terms (Maybe a) into Scoped Terms a. Your Functor Terms instance stays the same.
Instead composition happens in Functor (Scoped exp), derived via the composition of exp and Maybe (Compose exp Maybe). This is how fmap = fmap . fmap is derived (modulo newtype wrapping).
{-# Language DerivingVia #-}
type Scoped :: (Type -> Type) -> (Type -> Type)
newtype Scoped exp a = Scoped (exp (Maybe a))
deriving (Functor, Applicative, Alternative, Foldable, Arbitrary1, ..)
via Compose exp Maybe
This is the approach taken by Edward Kmett's bound library.
Bound.Scope.Scope:
type Scope :: Type -> (Type -> Type) -> (Type -> Type)
newtype Scope bound exp free = Scope (exp (Var bound (exp free)))
Bound.Scope.Simple.Scope:
type Scope :: Type -> (Type -> Type) -> (Type -> Type)
newtype Scope bound exp free = Scope (exp (Var bound free))
where Var = Either:
type Var :: Type -> Type -> Type
data Var bound free
= B bound -- this is a bound variable
| F free -- this is a free variable

Is it possible to compare two trees with recursion schemes?

I have this AST
data ExprF r = Const Int | Add r r
type Expr = Fix ExprF
and I want to compare
x = Fix $ Add (Fix (Const 1)) (Fix (Const 1))
y = Fix $ Add (Fix (Const 1)) (Fix (Const 2))
But all recursion schemes functions seems to work only with single structure
Obviously I can use recursion
eq (Fix (Const x)) (Fix (Const y)) = x == y
eq (Fix (Add x1 y1)) (Fix (Add x2 y2)) = (eq x1 x2) && (eq y1 y2)
eq _ _ = False
But I hope it is possible to use some sort of zipfold function.

Recursion schemes that act on a single argument are enough, because we can return a function from a scheme application. In this case, we can return an Expr -> Bool function from a scheme application on Expr. For efficient equality checking we only need paramorphisms:
{-# language DeriveFunctor, LambdaCase #-}
newtype Fix f = Fix (f (Fix f))
data ExprF r = Const Int | Add r r deriving (Functor, Show)
type Expr = Fix ExprF
cata :: Functor f => (f a -> a) -> Fix f -> a
cata f = go where go (Fix ff) = f (go <$> ff)
para :: Functor f => (f (Fix f, a) -> a) -> Fix f -> a
para f (Fix ff) = f ((\x -> (x, para f x)) <$> ff)
eqExpr :: Expr -> Expr -> Bool
eqExpr = cata $ \case
Const i -> cata $ \case
Const i' -> i == i'
_ -> False
Add a b -> para $ \case
Add a' b' -> a (fst a') && b (fst b')
_ -> False
Of course, cata is trivially implementable in terms of para:
cata' :: Functor f => (f a -> a) -> Fix f -> a
cata' f = para (\ffa -> f (snd <$> ffa)
Technically, almost all useful functions are implementable using cata, but they aren't necessarily efficient. We can implement para using cata:
para' :: Functor f => (f (Fix f, a) -> a) -> Fix f -> a
para' f = snd . cata (\ffa -> (Fix (fst <$> ffa) , f ffa))
However, if we use para' in eqExpr we get quadratic complexity, since para' is always linear in the size of the input, while we can use para to peek at the topmost Expr values in constant time.

(This response uses the data-fix library because I couldn't get recursion-schemes to compile.)
We can model the diff of two trees as an anamorphism or unfolding of a "diff functor" that is based on the original functor.
Consider the following types
data DiffF func r = Diff (Fix func) (Fix func)
| Nodiff (func r)
deriving (Functor)
type ExprDiff = Fix (DiffF ExprF)
The idea is that ExprDiff will follow the "common structure" of the original Expr trees as long as it remains equal, but at the moment a difference is encountered, we switch to the Diff leaf, that stores the two subtrees that we found to be different.
The actual comparison function would be:
diffExpr :: Expr -> Expr -> ExprDiff
diffExpr e1 e2 = ana comparison (e1,e2)
where
comparison :: (Expr,Expr) -> DiffF ExprF (Expr,Expr)
comparison (Fix (Const i),Fix (Const i')) | i == i' =
Nodiff (Const i')
comparison (Fix (Add a1 a2),Fix (Add a1' a2')) =
Nodiff (Add (a1,a1') (a2,a2'))
comparison (something, otherthing) =
Diff something otherthing
The "seed" of the anamorphism is the pair of expressions we want to compare.
If we simply want a predicate Expr -> Expr -> Bool we can later use a catamorphism that detects the presence of Diff branches.

How to work with AST with Cofree annotation?

I have this simple Expr AST and I can easily convert it to String.
import Prelude hiding (Foldable)
import qualified Prelude
import Data.Foldable as F
import Data.Functor.Foldable
import Data.Monoid
import Control.Comonad.Cofree
data ExprF r = Const Int
| Add r r
deriving ( Show, Eq, Ord, Functor, Prelude.Foldable )
type Expr = Fix ExprF
testExpr = Fix $ Add (Fix (Const 1)) (Fix (Const 2))
convertToString :: Expr -> String
convertToString = cata $ \case
e#(Const x) -> show x
e#(Add x y) -> unwords [x, "+", y]
Now I want to add an additional data to it.
So I am trying to use Cofree
type LineNumber = Int
type Expr2 = Cofree ExprF LineNumber
I can convert Expr to Expr2
addLineNumbers :: Expr -> Expr2
addLineNumbers = cata $ \case
e#(Const _) -> 1 :< e
e -> 2 :< e
But I cannot figure out how to convert Expr2 to String
convertToString2 :: Expr2 -> String
convertToString2 = cata $ \case
e#(_ :< (Const x)) -> show x
e#(_ :< (Add x y)) -> unwords [x, "+", y]
Also, is Cofree the best way to solve this annotation problem?

An alternative way of annotating your syntax tree is to compose the annotation into the base functor.
-- constant functor
newtype K c a = K c
deriving (Eq, Ord, Show, Read, Functor, Foldable, Traversable)
-- functor product
data (f :*: g) a = (:*:) { left :: f a, right :: g a }
deriving (Eq, Ord, Show, Read, Functor, Foldable, Traversable)
We're going to use the functor product to attach an annotation (inside a K) to each layer of the tree.
type AnnExpr = Fix (K LineNumber :*: ExprF)
If you can generate annotations while only inspecting a single layer of the tree (that is, your annotation-generating code can be expressed as a natural transformation) then you can use the following bit of machinery to modify the functor while keeping the fixpoint structure in place:
hoistFix :: Functor f => (forall a. f a -> g a) -> Fix f -> Fix g
hoistFix f = Fix . f . fmap (hoistFix f) . unFix
This is of limited usefulness, though, as most interesting annotations such as type-checking require traversal of the syntax tree.
You can reuse the code to tear down an Expr by simply ignoring the annotations. Given an algebra for ExprF...
-- instructions for a stack machine
data Inst = PUSH Int | ADD
type Prog = [Inst]
compile_ :: ExprF Prog -> Prog
compile_ (Const x) = [PUSH x]
compile_ (Add x y) = x ++ y ++ [ADD]
... you can use it to tear down either an Expr or an AnnExpr:
compileE :: Expr -> Prog
compileE = cata compile_
compileA :: AnnExpr -> Prog
compileA = cata (compile_ . right)

Chaining values with catamorphisms

Suppose I have definitions as follows (where cata is the catamorphism):
type Algebra f a = f a -> a
newtype Fix f = Fx (f (Fix f))
unFix :: Fix f -> f (Fix f)
unFix (Fx x) = x
cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix
I was wondering if there would be some way to modify the definition of cata so that I could chain some object such as an int through it such that I could generate unique handles for things within the alg function, i.e. "a0", "a1", "a2", ..., etc.
Edit: To make this more clear, I'd like to be able to have some function cata' such that when I have something similar to the following definitions
data IntF a
= Const Int
| Add a a
instance Functor IntF where
fmap eval (Const i) = Const i
fmap eval (x `Add` y) = eval x `Add` eval y
alg :: Int -> Algebra IntF String
alg n (Const i) = "a" ++ show n
alg n (s1 `Add` s2) = s1 ++ " && " ++ s2
eval = cata' alg
addExpr = Fx $ (Fx $ Const 5) `Add` (Fx $ Const 4)
run = eval addExpr
then run evaluates to "a0 && a1" or something similar, i.e. the two constants don't get labeled the same thing.

Just sequence them as monads.
newtype Ctr a = Ctr { runCtr :: Int -> (a, Int) } -- is State Int
instance Functor Ctr
instance Applicative Ctr
instance Monad Ctr
type MAlgebra m f a = f (m a) -> m a
fresh :: Ctr Int
fresh = Ctr (\i -> (i, i+1))
data IntF a
= Val
| Add a a
malg :: IntF (Ctr String) -> Ctr String
malg Val = (\x -> "a" ++ show x) <$> fresh
malg (Add x y) = (\a b -> a ++ " && " ++ b) <$> x <*> y
go = cata malg

As I understand, you want something like
cata' :: Functor f => (Int -> f a -> a) -> Fix f -> a
so that you can operate both on f a and it's index.
If that's true, here's a possible solution.
Associated Int
First we define a new type which will represent Int-labelled functor:
{-# LANGUAGE DeriveFunctor #-}
data IntLabel f a = IntLabel Int (f a) deriving (Functor)
-- This acts pretty much like `zip`.
labelFix :: Functor f => [Int] -> Fix f -> Fix (IntLabel f)
labelFix (x:xs) (Fx f) = Fx . IntLabel x $ fmap (labelFix xs) f
Now we can define cata' using cata and labelFix:
cata' :: Functor f => (Int -> f a -> a) -> Fix f -> a
cata' alg = cata alg' . labelFix [1..]
where
alg' (IntLabel n f) = alg n f
NOTE: unique Ints are assigned to each layer, not each functor. E.g. for Fix [] each sublist of the outermost list will be labelled with 2.
Threading effects
A different approach to the problem would be to use cata to produce monadic value:
cata :: Functor f => (f (m a) -> m a) -> Fix f -> m a
This is just a specialized version of cata. With it we can define (almost) cat' as
cata'' :: Traversable f => (Int -> f a -> a) -> Fix f -> a
cata'' alg = flip evalState [1..] . cata alg'
where
alg' f = alg <$> newLabel <*> sequenceA f
newLabel :: State [a] a
newLabel = state (\(x:xs) -> (x, xs))
Note that Traversable instance now is needed in order to switch f (m a) to m (f a).
However, you might want to use just a bit more specialized cata:
cata :: (Functor f, MonadReader Int m) => (f (m a) -> m a) -> Fix f -> m a

freezing haskell STrefs

I would like to implement a Doubly Connected Edge List data structure for use in Haskell. This data structure is used to manage the topology of an arrangement of lines in a plane, and contains structures for faces, edges, and vertices.
It seems to me like a good interface to this data structure would be as a type Arrangement, with functions like
overlay :: Arrangement -> Arrangement -> Arrangement
but the usual implementation relies heavily on references (for example each face has references to the adjacent edges).
It seems to me that the ideal way for this to work would be similar to the way mutable and immutable arrays do: the internals of the Arrangement data structure are implemented as functional data structures, but the operations that mutate arrangements "unfreeze" them to create new mutable instances within a monad (ideally using COW magic to make things efficient).
So my questions are:
(1) is there a way to freeze and unfreeze a small heap like there is for arrays?
(2) if not, is there a better approach?

This might be what you are looking for. Loops should work fine. A simple example involving a loop appears first.
data List a t = Nil | Cons a t deriving (Show, Functor, Foldable, Traversable)
runTerm $ do
x <- newVar Nil
writeVar x (Cons 'a' (Var x)))
return $ Var x
And now, the code.
{-# LANGUAGE
Rank2Types
, StandaloneDeriving
, UndecidableInstances #-}
module Freeze
( Term (..)
, Fix (..)
, runTerm
, freeze
, thaw
, Var
, newVar
, writeVar
, readVar
, modifyVar
, modifyVar'
) where
import Control.Applicative
import Control.Monad.Fix
import Control.Monad.Trans.Class
import Control.Monad.Trans.Reader
import Control.Monad.ST
import Data.STRef
import Data.Traversable (Traversable, traverse)
data Term s f
= Var {-# UNPACK #-} !(Var s f)
| Val !(f (Term s f))
newtype Fix f = Fix { getFix :: f (Fix f) }
deriving instance Show (f (Fix f)) => Show (Fix f)
runTerm :: Traversable f => (forall s . ST s (Term s f)) -> Fix f
runTerm m = runST $ m >>= freeze
freeze :: Traversable f => Term s f -> ST s (Fix f)
freeze t = do
xs <- newSTRef Nil
f <- runReaderT (loop t) xs
readSTRef xs >>= mapM_' modifyToOnly
return f
where
loop (Val f) = Fix <$> traverse loop f
loop (Var (STRef ref)) = do
a <- lift $ readSTRef ref
case a of
Both _ f' ->
return f'
Only f -> mfix $ \ f' -> do
lift $ writeSTRef ref $! Both f f'
ask >>= lift . flip modifySTRef' (ref :|)
Fix <$> traverse loop f
thaw :: Traversable f => Fix f -> ST s (Term s f)
thaw = return . loop
where
loop = Val . fmap loop . getFix
newtype Var s f = STRef (STRef s (Many s f))
newVar :: f (Term s f) -> ST s (Var s f)
newVar = fmap STRef . newSTRef . Only
readVar :: Var s f -> ST s (f (Term s f))
readVar (STRef ref) = fst' <$> readSTRef ref
writeVar :: Var s f -> f (Term s f) -> ST s ()
writeVar (STRef ref) a = writeSTRef ref $! Only a
modifyVar :: Var s f -> (f (Term s f) -> f (Term s f)) -> ST s ()
modifyVar (STRef ref) f = modifySTRef' ref (Only . f . fst')
modifyVar' :: Var s f -> (f (Term s f) -> f (Term s f)) -> ST s ()
modifyVar' (STRef ref) f = modifySTRef' ref (\ a -> Only $! f (fst' a))
data Many s f
= Only (f (Term s f))
| Both (f (Term s f)) (Fix f)
fst' :: Many s f -> f (Term s f)
fst' (Only a) = a
fst' (Both a _) = a
modifyToOnly :: STRef s (Many s f) -> ST s ()
modifyToOnly ref = do
a <- readSTRef ref
case a of
Only _ -> return ()
Both f _ -> writeSTRef ref $! Only f
data List s a = Nil | {-# UNPACK #-} !(STRef s a) :| !(List s a)
mapM_' :: Monad m => (STRef s a -> m b) -> List s a -> m ()
mapM_' _ Nil = return ()
mapM_' k (x :| xs) = k x >> mapM_' k xs

Not that the safe versions of freeze and thaw make complete copies of the array, so aren't necessarily that efficient. Of course, making a complete copy of an array of refs is arguably an optimization over making a complete copy of a structure through walking it and recursively pulling things ou of MVars, etc.
Another approach to take would be something similar to that of Repa -- represent operations over your structure algebraically, and write a run function that optimizes, fuses, and then executes all in one pass. Arguably this is a more functional design. (You can use unsafe operations under the covers even, to make reification happen on-demand rather than explicitly).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Haskell AST Annotation with Fix - haskell

Related

Defining a functor instance for this datatype

Is it possible to compare two trees with recursion schemes?

How to work with AST with Cofree annotation?

Chaining values with catamorphisms

freezing haskell STrefs

Categories

Resources