I have to build a simple compiler for a simple language. I used Haskell's Alex and Happy to build the parser, and it already prints the correct ASTs, so the next step is to translate that data structure into another one that represents the program in three-address code.
I'm still a bit lost on how to do this, so, using Haskell data structures, how can I translate an AST into its three-address code form? I would really appreciate some help :)
Thanks in advance!
Your parser is largely irrelevant to this question, which seems to be: how do I translate one AST into another? To address this I will use a simplified language. Also, the code below is not intended to be simple, but rather easily extensible and maintainable.
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.State
import Control.Monad.Reader
import Control.Monad.Free
import Data.Functor.Foldable
import qualified Data.Set as S
data Ident = Ident Int deriving (Eq)
data ExpF a = IntLitF Int
| PlusF a a
| IntVarF Ident deriving (Eq, Functor)
data Exp = IntLit Int
| Plus Exp Exp
| IntVar Ident deriving (Eq)
data Cmd
= CmdAtrib Ident Exp
| CmdSeq Cmd Cmd
| CmdNone deriving (Eq)
data TAC_F r = Assign Ident (ExpF Ident) r deriving (Eq, Show, Functor)
type TAC = Free TAC_F ()
(=:) :: Ident -> ExpF Ident -> TAC
(=:) i e = Free (Assign i e (Pure ()))
Some of the above definitions might seem strange. ExpF and Exp are defined in the recursion-schemes style and use the recursion-schemes package (see its documentation for more info). TAC is defined in terms of Free because, as the name implies, you get monad syntax for free. The only thing that a >> b does for TAC is create the AST that contains a followed by b.
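For instance (a small illustration, not needed for the translator itself), chaining two assignments with >> simply nests the Free constructors:
-- Two assignments in sequence; (>>) nests them into one TAC program.
tacDemo :: TAC
tacDemo = (Ident 1 =: IntLitF 5) >> (Ident 2 =: IntLitF 6)
-- tacDemo == Free (Assign (Ident 1) (IntLitF 5)
--              (Free (Assign (Ident 2) (IntLitF 6) (Pure ()))))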
You need a way to generate fresh variables:
freshVar :: Monad m => StateT [Ident] m Ident
freshVar = do
s <- get
case s of
[] -> put [Ident (-1)] >> return (Ident (-1))
(Ident x:xs) -> put (Ident (x-1) : xs) >> return (Ident (x-1))
I use a list because it is simple, but you may want to attach more information to your identifiers, in which case you should use Data.Set.Set or Data.Map.Map. By convention, fresh (temporary) variables are negative while variables coming from the source program are positive. Not a very sophisticated method, but it works.
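A quick check of the convention (evalState comes from Control.Monad.State, replicateM from Control.Monad):
-- Three temporaries drawn from an empty supply are
-- Ident (-1), Ident (-2), Ident (-3), shown later as t1, t2, t3.
freshDemo :: [Ident]
freshDemo = evalState (replicateM 3 freshVar) []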
Now this is where the magic happens. Thanks to recursion schemes, recursion over the tree is very simple:
translateExp :: Exp -> State [Ident] (TAC, Ident)
translateExp = cata go where
go (PlusF a b) = do
(ae,av) <- a
(be,bv) <- b
t <- freshVar
return (ae >> be >> t =: PlusF av bv, t)
go (IntLitF i) = do
t <- freshVar
return (t =: IntLitF i, t)
go (IntVarF a) = return (return (), a)
translateCmd :: Cmd -> State [Ident] TAC
translateCmd (CmdAtrib ident exp) = do
(e,v) <- translateExp exp
return (e >> ident =: IntVarF v)
translateCmd (CmdSeq a b) = do
x <- translateCmd a
y <- translateCmd b
return (x >> y)
translateCmd CmdNone = return (return ())
Then an example:
test0 = CmdSeq (CmdAtrib (Ident 1) (IntLit 10 `Plus` IntVar (Ident 2)))
(CmdAtrib (Ident 3) (IntVar (Ident 1) `Plus` IntVar (Ident 1) `Plus` IntVar (Ident 2)))
> putStrLn $ showTAC $ fst $ runState (translateCmd test0) []
t1 =: 10
t2 =: t1 + v2
v1 =: t2
t3 =: v1 + v1
t4 =: t3 + v2
v3 =: t4
Note that variables bound by the LHS of CmdAtrib will never collide with those found in the RHS.
Boilerplate / show instances:
instance Show Ident where
show (Ident i) | i < 0 = "t" ++ show (abs i)
| otherwise = "v" ++ show i
instance Show a => Show (ExpF a) where
show (IntLitF i) = show i
show (PlusF a b) = show a ++ " + " ++ show b
show (IntVarF i) = show i
type instance Base Exp = ExpF
instance Foldable Exp where
project (IntLit i) = IntLitF i
project (Plus a b) = PlusF a b
project (IntVar b) = IntVarF b
instance Show Cmd where
show (CmdAtrib i e) = show i ++ " <- " ++ show e
show (CmdSeq a b) = show a ++ " ;\n " ++ show b
show (CmdNone) = ""
instance Show Exp where
show (IntLit i) = show i
show (Plus a b) = show a ++ " + " ++ show b
show (IntVar i) = show i
showTAC (Free (Assign i exp xs)) = show i ++ " =: " ++ show exp ++ "\n" ++ showTAC xs
showTAC (Pure a) = ""
Related
I wrote the pattern so that if you give it only a number, it returns the Value. I would simply add deriving (Show) to Val as well, but that doesn't work because of (Val -> Val) (that's what I understood from the error messages). Does anyone know what I could do?
import GHC.Show (Show)
type Var = String
-- Expressions of source code in the form of an abstract syntax tree
data Exp = Enum Int -- constant
|Evar Var -- variable
|Elet Var Exp Exp -- expr "let x = e1 in e2"
|Ecall Exp Exp -- Function call
deriving (Show)
-- returned values
data Val = Vnum Int -- Whole number
|Vprim (Val->Val) -- A primitive
mkPrim::(Int->Int->Int)->Val
mkPrim f = Vprim(\(Vnum x) -> Vprim (\(Vnum y) -> Vnum (f x y)))
-- Initial environment that contains all primitives
type Env = [(Var, Val)]
pervasive::Env
pervasive = [("+", mkPrim (+)), ("-", mkPrim (-)),("*", mkPrim (*)), ("/", mkPrim div)]
eval::Env->Exp->Val
eval pervasive (Enum n) = Vnum n
sampleExp = Elet "x" (Enum 3) (Ecall (Ecall (Evar "+") (Evar "x")) (Enum 4))
main = do print(eval pervasive (Enum 4))
Consider implementing a custom Show instance:
instance Show Val where
show (Vnum n) = show n
show (Vprim _) = "<prim>"
This will, however, make printed primitives indistinguishable. It might be convenient to equip primitives with names:
data Val = Vnum Int | Vprim String (Val -> Val)
instance Show Val where
show (Vnum n) = show n
show (Vprim name _) = name
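A sketch of how mkPrim and pervasive might then be adapted to thread the name through (the name chosen for a partially applied primitive is my own convention):
-- Each primitive carries its operator name so it can be shown.
mkPrim :: String -> (Int -> Int -> Int) -> Val
mkPrim name f =
  Vprim name (\(Vnum x) ->
    Vprim (name ++ " " ++ show x)          -- name of the partially applied prim
          (\(Vnum y) -> Vnum (f x y)))
pervasive :: Env
pervasive = [ ("+", mkPrim "+" (+)), ("-", mkPrim "-" (-))
            , ("*", mkPrim "*" (*)), ("/", mkPrim "/" div) ]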
When I want to read a string into type A, I write read str :: A. Now suppose I want a generic function that can read a string into one of several types, so I want to write something like read str :: A ||| B ||| C or something similar. The only thing I could think of is:
{-# LANGUAGE TypeOperators #-}
infixr 9 |||
data a ||| b = A a|B b deriving Show
-- OR THIS:
-- data a ||| b = N | A a (a ||| b) | B b (a ||| b) deriving (Data, Show)
instance (Read a, Read b) => Read (a ||| b) where
readPrec = parens $ do
a <- (A <$> readPrec) <|> (B <$> readPrec)
-- OR:
-- a <- (flip A N <$> readPrec) <|> (flip B N <$> readPrec)
return a
And if I want to read something:
> read "'a'"::Int|||Char|||String
B (A 'a')
But what do I do with such a weird type? I want to fold it down to Int or Char or String... or to something else that is "atomic" (scalar/simple). The final goal is to read strings like "1,'a'" into something list-like such as [D 1, D 'a']. The main constraint is that the structure is flexible, so the string can be "1, 'a'" or "'a', 1" or "\"xxx\", 1, 2, 'a'". I know how to read something separated by a delimiter, but that something should be passed as a type, not as a sum of types like C Char | I Int | S String | etc. Is this possible, or is there no way to accomplish it without a sum of types?
There’s no way to do this in general using read, because the same input string might parse correctly to more than one of the valid types. You could, however, do this with a function like Text.Read.readMaybe, which returns Nothing on ambiguous input. You might also return a tuple or list of the valid interpretations, or have a rule for which order to attempt to parse the types in, such as: attempt to parse each type in the order they were declared.
Here’s some example code, as proof of concept:
import Data.Maybe (catMaybes, fromJust, isJust, isNothing)
import qualified Text.Read
data AnyOf3 a b c = FirstOf3 a | SecondOf3 b | ThirdOf3 c
instance (Show a, Show b, Show c) => Show (AnyOf3 a b c) where
show (FirstOf3 x) = show x -- Can infer the type from the pattern guard.
show (SecondOf3 x) = show x
show (ThirdOf3 x) = show x
main :: IO ()
main =
(putStrLn . unwords . map show . catMaybes . map readDBS)
["True", "2", "\"foo\"", "bar"] >>
(putStrLn . unwords . map show . readIID) "100"
readMaybe' :: (Read a, Read b, Read c) => String -> Maybe (AnyOf3 a b c)
-- Based on the function from Text.Read
readMaybe' x | isJust a && isNothing b && isNothing c =
(Just . FirstOf3 . fromJust) a -- Can infer the type of a from this.
| isNothing a && isJust b && isNothing c =
(Just . SecondOf3 . fromJust) b -- Can infer the type of b from this.
| isNothing a && isNothing b && isJust c =
(Just . ThirdOf3 . fromJust) c -- Can infer the type of c from this.
| otherwise = Nothing
where a = Text.Read.readMaybe x
b = Text.Read.readMaybe x
c = Text.Read.readMaybe x
readDBS :: String -> Maybe (AnyOf3 Double Bool String)
readDBS = readMaybe'
readToList :: (Read a, Read b, Read c) => String -> [AnyOf3 a b c]
readToList x = repack FirstOf3 x ++ repack SecondOf3 x ++ repack ThirdOf3 x
where repack constructor y | isJust z = [(constructor . fromJust) z]
| otherwise = []
where z = Text.Read.readMaybe y
readIID :: String -> [AnyOf3 Int Integer Double]
readIID = readToList
The first output line echoes every input that parsed successfully, that is, the Boolean constant, the number and the quoted string, but not bar. The second output line echoes every possible interpretation of the input, that is, 100 as an Int, an Integer and a Double.
For something more complicated, you want to write a parser. Haskell has some very good libraries to build them out of combinators. You might look at one such as Parsec. But it’s still helpful to understand what goes on under the hood.
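As a rough illustration of the combinator style (assuming the parsec package is available; the parser names here are mine, and it reuses the AnyOf3 type from above), one might parse a comma-separated mix of values like this:
import Text.Parsec
import Text.Parsec.String (Parser)
-- Parse one field as an Int, a Char literal, or a String literal.
field :: Parser (AnyOf3 Int Char String)
field =  FirstOf3 . read <$> many1 digit
     <|> SecondOf3 <$> between (char '\'') (char '\'') anyChar
     <|> ThirdOf3  <$> between (char '"') (char '"') (many (noneOf "\""))
-- Parse a whole input such as "1, 'a', \"xxx\"".
fields :: Parser [AnyOf3 Int Char String]
fields = field `sepBy` (char ',' >> spaces)
-- ghci> parse fields "" "1, 'a', \"xxx\""
-- Right [1,'a',"xxx"]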
I'm trying to use CPS to simplify control-flow implementation in my Python interpreter. Specifically, when implementing return/break/continue, I have to store state and unwind manually, which is tedious. I've read that it's extraordinarily tricky to implement exception handling in this way. What I want is for each eval function to be able to direct control flow to either the next instruction, or to a different instruction entirely.
Some people with more experience than me suggested looking into CPS as a way to deal with this properly. I really like how it simplifies control flow in the interpreter, but I'm not sure how much I need to actually do in order to accomplish this.
Do I need to run a CPS transform on the AST? Should I lower this AST into a lower-level IR that is smaller and then transform that?
Do I need to update the evaluator to accept the success continuation everywhere? (I'm assuming so).
I think I generally understand the CPS transform: the goal is to thread the continuation through the entire AST, including all expressions.
I'm also a bit confused where the Cont monad fits in here, as the host language is Haskell.
Edit: here's a condensed version of the AST in question. It is a 1-1 mapping of Python statements, expressions, and built-in values.
data Statement
= Assignment Expression Expression
| Expression Expression
| Break
| While Expression [Statement]
data Expression
= Attribute Expression String
| Constant Value
data Value
= String String
| Int Integer
| None
To evaluate statements, I use eval:
eval (Assignment (Variable var) expr) = do
value <- evalExpr expr
updateSymbol var value
eval (Expression e) = do
_ <- evalExpr e
return ()
To evaluate expressions, I use evalExpr:
evalExpr (Attribute target name) = do
receiver <- evalExpr target
attribute <- getAttr name receiver
case attribute of
Just v -> return v
Nothing -> fail $ "No attribute " ++ name
evalExpr (Constant c) = return c
What motivated the whole thing was the shenanigans required for implementing break. The break definition is reasonable, but what it does to the while definition is a bit much:
eval (Break) = do
env <- get
when (loopLevel env <= 0) (fail "Can only break in a loop!")
put env { flow = Breaking }
eval (While condition block) = do
setup
loop
cleanup
where
setup = do
env <- get
let level = loopLevel env
put env { loopLevel = level + 1 }
loop = do
env <- get
result <- evalExpr condition
when (isTruthy result && flow env == Next) $ do
evalBlock block
-- Pretty ugly! Eat continue.
updatedEnv <- get
when (flow updatedEnv == Continuing) $ put updatedEnv { flow = Next }
loop
cleanup = do
env <- get
let level = loopLevel env
put env { loopLevel = level - 1 }
case flow env of
Breaking -> put env { flow = Next }
Continuing -> put env { flow = Next }
_ -> return ()
I am sure there are more simplifications that can be done here, but the core problem is one of stuffing state somewhere and manually winding out. I'm hoping that CPS will let me stuff book-keeping (like loop exit points) into state and just use those when I need them.
I dislike the split between statements and expressions and worry it might make the CPS transform more work.
This finally gave me a good excuse to try using ContT!
Here's one possible way of doing this: store (in a Reader wrapped in ContT) the continuation of exiting the current (innermost) loop:
newtype M r a = M{ unM :: ContT r (ReaderT (M r ()) (StateT (Map Id Value) IO)) a }
deriving ( Functor, Applicative, Monad
, MonadReader (M r ()), MonadCont, MonadState (Map Id Value)
, MonadIO
)
runM :: M a a -> IO a
runM m = evalStateT (runReaderT (runContT (unM m) return) (error "not in a loop")) M.empty
withBreakHere :: M r () -> M r ()
withBreakHere act = callCC $ \break -> local (const $ break ()) act
break :: M r ()
break = join ask
(I've also added IO for easy printing in my toy interpreter, and State (Map Id Value) for variables).
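If callCC looks mysterious, here is a tiny standalone illustration of the same early-exit trick, independent of the interpreter's monad stack:
import Control.Monad.Cont (callCC, runContT)
import Control.Monad.IO.Class (liftIO)
-- callCC captures "the rest of the computation"; invoking the captured
-- continuation jumps straight back out, skipping whatever follows it.
earlyExitDemo :: IO ()
earlyExitDemo = flip runContT return $
  callCC $ \exit -> do
    liftIO (putStrLn "before")
    exit ()                          -- plays the role of 'break'
    liftIO (putStrLn "never printed")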
Using this setup, you can write Break and While as:
eval Break = break
eval (While condition block) = withBreakHere $ fix $ \loop -> do
result <- evalExpr condition
unless (isTruthy result)
break
evalBlock block
loop
Here's the full code for reference:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Interp where
import Prelude hiding (break)
import Control.Applicative
import Control.Monad.Cont
import Control.Monad.State
import Control.Monad.Reader
import Data.Function
import Data.Map (Map)
import qualified Data.Map as M
import Data.Maybe
type Id = String
data Statement
= Print Expression
| Assign Id Expression
| Break
| While Expression [Statement]
| If Expression [Statement]
deriving Show
data Expression
= Var Id
| Constant Value
| Add Expression Expression
| Not Expression
deriving Show
data Value
= String String
| Int Integer
| None
deriving Show
data Env = Env{ loopLevel :: Int
, flow :: Flow
}
data Flow
= Breaking
| Continuing
| Next
deriving Eq
newtype M r a = M{ unM :: ContT r (ReaderT (M r ()) (StateT (Map Id Value) IO)) a }
deriving ( Functor, Applicative, Monad
, MonadReader (M r ()), MonadCont, MonadState (Map Id Value)
, MonadIO
)
runM :: M a a -> IO a
runM m = evalStateT (runReaderT (runContT (unM m) return) (error "not in a loop")) M.empty
withBreakHere :: M r () -> M r ()
withBreakHere act = callCC $ \break -> local (const $ break ()) act
break :: M r ()
break = join ask
evalExpr :: Expression -> M r Value
evalExpr (Constant val) = return val
evalExpr (Var v) = gets $ fromMaybe err . M.lookup v
where
err = error $ unwords ["Variable not in scope:", show v]
evalExpr (Add e1 e2) = do
Int val1 <- evalExpr e1
Int val2 <- evalExpr e2
return $ Int $ val1 + val2
evalExpr (Not e) = do
val <- evalExpr e
return $ if isTruthy val then None else Int 1
isTruthy (String s) = not $ null s
isTruthy (Int n) = n /= 0
isTruthy None = False
evalBlock = mapM_ eval
eval :: Statement -> M r ()
eval (Assign v e) = do
val <- evalExpr e
modify $ M.insert v val
eval (Print e) = do
val <- evalExpr e
liftIO $ print val
eval (If cond block) = do
val <- evalExpr cond
when (isTruthy val) $
evalBlock block
eval Break = break
eval (While condition block) = withBreakHere $ fix $ \loop -> do
result <- evalExpr condition
unless (isTruthy result)
break
evalBlock block
loop
and here's a neat test example:
prog = [ Assign "i" $ Constant $ Int 10
, While (Var "i") [ Print (Var "i")
, Assign "i" (Add (Var "i") (Constant $ Int (-1)))
, Assign "j" $ Constant $ Int 10
, While (Var "j") [ Print (Var "j")
, Assign "j" (Add (Var "j") (Constant $ Int (-1)))
, If (Not (Add (Var "j") (Constant $ Int (-4)))) [ Break ]
]
]
, Print $ Constant $ String "Done"
]
which is
i = 10
while i:
print i
i = i - 1
j = 10
while j:
print j
j = j - 1
if j == 4:
break
so it will print
10 10 9 8 7 6 5
9 10 9 8 7 6 5
8 10 9 8 7 6 5
...
1 10 9 8 7 6 5
How can one create LaTeX source for natural deduction proof trees (like those shown here) via Haskell, e.g. using HaTeX? I'd like to emulate LaTeX .sty files like bussproofs.sty or proof.sty.
I'm using your question as an excuse to improve and demo a Haskell
call-tracing library I'm working
on. In the context of
tracing, an obvious way to create a proof tree is to trace a type
checker and then format the trace as a natural-deduction proof. To
keep things simple my example logic is the simply-typed lambda
calculus
(STLC),
which corresponds to the implicational fragment of propositional
intuitionistic
logic.
I am using proofs.sty, but not via HaTeX or any other Haskell
Latex library. The Latex for proof trees is very simple and using a
Haskell Latex library would just complicate things.
I've written the proof-tree generation code twice:
in a self-contained way, by writing a type checker that also
returns a proof tree;
using my tracing library, by call-tracing a type checker and then
post processing the trace into a proof tree.
Since you didn't ask about call-tracing libraries, you may be less
interested in the call-trace based version, but I think it's
interesting to compare both versions.
Examples
Let's start with some output examples first, to see what all this gets us.
The first three examples are motivated
by an axiom system for implicational propositional
calculus;
the first two also happen to correspond to S and
K:
The first axiom, K, with proof terms:
The second axiom, S, with proof terms, but with premises in the
context, not lambda bound:
The fourth axiom, modus ponens, without proof terms:
The third axiom in that Wikipedia article (Peirce's law) is
non-constructive and so we can't prove it here.
For a different kind of example, here's a failed type check of the Y
combinator:
The arrows are meant to lead you to the error, which is marked with a
bang (!).
Code
Now I'll describe the code which generated those examples. The code is
from this
file
unless otherwise noted. I'm not including every line of code here;
see that link if you want something you can actually build using GHC
7.6.3.
Most of the code -- the grammar, parser, and pretty printer -- is the
same for both versions; only the type checkers and proof-tree
generators differ. All of the common code is in the file just
referenced.
STLC grammar
The STLC grammar in ASCII:
-- Terms
e ::= x | \x:T.e | e e
-- Types
T ::= A | T -> T
-- Contexts
C ::= . | C,x:T
And the corresponding Haskell:
type TmVar = String
type TyVar = String
data Tm = Lam TmVar Ty Tm
| TmVar TmVar
| Tm :#: Tm
deriving Show
data Ty = TyVar TyVar
| Ty :->: Ty
deriving (Eq , Show)
type Ctx = [(TmVar,Ty)]
Type checking + proof tree generation
Both versions implement the same abstract STLC type checker. In ASCII:
(x:T) in C
---------- Axiom
C |- x : T
C,x:T1 |- e : T2
--------------------- -> Introduction
C |- \x:T1.e : T1->T2
C |- e : T1 -> T2 C |- e1 : T1
--------------------------------- -> Elimination
C |- e e1 : T2
Version 1: self-contained with inline proof-tree generation
The full code for this version is
here.
The proof-tree generation happens in the type checker, but the actual
proof-tree generation code is factored out into addProof and
conclusion.
Type checking
-- The mode is 'True' if proof terms should be included.
data R = R { _ctx :: Ctx , _mode :: Bool }
type M a = Reader R a
extendCtx :: TmVar -> Ty -> M a -> M a
extendCtx x t = local extend where
extend r = r { _ctx = _ctx r ++ [(x,t)] }
-- These take the place of the inferred type when there is a type
-- error.
here , there :: String
here = "\\,!"
there = "\\,\\uparrow"
-- Return the inferred type---or error string if type inference
-- fails---and the latex proof-tree presentation of the inference.
--
-- This produces different output than 'infer' in the error case: here
-- all premises are always computed, whereas 'infer' stops at the
-- first failing premise.
inferProof :: Tm -> M (Either String Ty , String)
inferProof tm@(Lam x t e) = do
(et' , p) <- extendCtx x t . inferProof $ e
let et'' = (t :->:) <$> et'
addProof et'' [p] tm
inferProof tm@(TmVar x) = do
mt <- lookup x <$> asks _ctx
let et = maybe (Left here) Right mt
addProof et [] tm
inferProof tm@(e :#: e1) = do
(et , p) <- inferProof e
(et1 , p1) <- inferProof e1
case (et , et1) of
(Right t , Right t1) ->
case t of
t1' :->: t2 | t1' == t1 -> addProof (Right t2) [p , p1] tm
_ -> addProof (Left here) [p , p1] tm
_ -> addProof (Left there) [p , p1] tm
Proof tree generation
The addProof corresponds to proofTree in the other version:
-- Given the inferred type, the proof-trees for all premise inferences
-- (subcalls), and the input term, annotate the inferred type with a
-- result proof tree.
addProof :: Either String Ty -> [String] -> Tm -> M (Either String Ty , String)
addProof et premises tm = do
R { _mode , _ctx } <- ask
let (judgment , rule) = conclusion _mode _ctx tm et
let tex = "\\infer[ " ++ rule ++ " ]{ " ++
judgment ++ " }{ " ++
intercalate " & " premises ++ " }"
return (et , tex)
The code for conclusion is common to both versions:
conclusion :: Mode -> Ctx -> Tm -> Either String Ty -> (String , String)
conclusion mode ctx tm e = (judgment mode , rule tm)
where
rule (TmVar _) = "\\textsc{Axiom}"
rule (Lam {}) = "\\to \\text{I}"
rule (_ :#: _) = "\\to \\text{E}"
tyOrError = either id pp e
judgment True = pp ctx ++ " \\vdash " ++ pp tm ++ " : " ++ tyOrError
judgment False = ppCtxOnlyTypes ctx ++ " \\vdash " ++ tyOrError
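To see how this might be driven (the actual driver in the linked file may differ; demo and idTm are just illustrative names), one can run inferProof under a Reader environment:
-- Hypothetical driver: type-check \x:P.x in the empty context,
-- with proof terms enabled.
demo :: (Either String Ty, String)
demo = runReader (inferProof idTm) (R { _ctx = [], _mode = True })
  where idTm = Lam "x" (TyVar "P") (TmVar "x")
-- fst demo == Right (TyVar "P" :->: TyVar "P")
-- snd demo is the "\infer[...]{...}{...}" proof-tree string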
Version 2: via call-tracing, with proof-tree generation as post processing
Here the type checker is not even aware of proof-tree generation, and
adding call-tracing is just one line.
Type checking
type Mode = Bool
type Stream = LogStream (ProofTree Mode)
type M a = ErrorT String (ReaderT Ctx (Writer Stream)) a
type InferTy = Tm -> M Ty
infer , infer' :: InferTy
infer = simpleLogger (Proxy::Proxy "infer") ask (return ()) infer'
infer' (TmVar x) = maybe err pure . lookup x =<< ask where
err = throwError $ "Variable " ++ x ++ " not in context!"
infer' (Lam x t e) = (t :->:) <$> (local (++ [(x,t)]) . infer $ e)
infer' (e :#: e1) = do
t <- infer e
t1 <- infer e1
case t of
t1' :->: t2 | t1' == t1 -> pure t2
_ -> throwError $ "Can't apply " ++ show t ++ " to " ++ show t1 ++ "!"
The LogStream
type
and ProofTree
class
are from the library. The LogStream is the type of log events that
the "magic"
simpleLogger
logs. Note the line
infer = simpleLogger (Proxy::Proxy "infer") ask (return ()) infer'
which defines infer to be a logged version of infer', the actual
type checker. That's all you have to do to trace a monadic function!
I won't get into how simpleLogger actually works here, but the
result is that each call to infer gets logged, including the
context, arguments, and return value, and these data get grouped
together with all logged subcalls (here only to infer). It would be
easy to manually write such logging code for infer, but it's nice
that with the library you don't have to.
Proof-tree generation
To generate the Latex proof trees, we implement ProofTree to post
process infer's call trace. The library provides a proofTree
function that calls the ProofTree methods and assembles the proof
trees; we just need to specify how the conclusions of the typing
judgments will be formatted:
instance ProofTree Mode (Proxy (SimpleCall "infer" Ctx InferTy ())) where
callAndReturn mode t = conclusion mode ctx tm (Right ty)
where
(tm , ()) = _arg t
ty = _ret t
ctx = _before t
callAndError mode t = conclusion mode ctx tm (Left error)
where
(tm , ()) = _arg' t
how = _how t
ctx = _before' t
error = maybe "\\,!" (const "\\,\\uparrow") how
The pp calls are to a user defined pretty printer; obviously, the
library can't know how to pretty print your data types.
Because calls can be erroneous -- the library detects errors
-- we have to say how to format
successful and failing calls. Refer to the Y-combinator example above
for an example of a failing type check, corresponding to the
callAndError case here.
The library's proofTree
function
is quite simple: it builds a proofs.sty proof tree with the current
call as conclusion, and the subcalls as premises:
proofTree :: mode -> Ex2T (LogTree (ProofTree mode)) -> String
proofTree mode (Ex2T t@(CallAndReturn {})) =
"\\infer[ " ++ rule ++ " ]{ " ++ conclusion ++ " }{ " ++ intercalate " & " premises ++ " }"
where
(conclusion , rule) = callAndReturn mode t
premises = map (proofTree mode) (_children t)
proofTree mode (Ex2T t@(CallAndError {})) =
"\\infer[ " ++ rule ++ " ]{ " ++ conclusion ++ " }{ " ++ intercalate " & " premises ++ " }"
where
(conclusion , rule) = callAndError mode t
premises = map (proofTree mode)
(_children' t ++ maybe [] (:[]) (_how t))
I use proofs.sty in the library because it allows arbitrarily many
premises, although bussproofs.sty would work for this STLC example
since no rule has more than five premises (the limit for
bussproofs). Both Latex packages are described
here.
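For concreteness, a single Axiom node produced by either version comes out roughly as the following proofs.sty source:
\infer[ \textsc{Axiom} ]{ x {:} P \vdash x : P }{ }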
Pretty printing
Now we return to code that is common between both versions.
The pretty printer that defines the pp used above is rather long --
it handles precedence and associativity, and is written in a way that
should be extensible if more terms, e.g. products, are added to the
calculus -- but mostly straightforward. First, we set up a precedence
table and a precedence-and-associativity-aware parenthesizer:
-- Precedence: higher value means tighter binding.
type Prec = Double
between :: Prec -> Prec -> Prec
between x y = (x + y) / 2
lowest , highest , precLam , precApp , precArr :: Prec
highest = 1
lowest = 0
precLam = lowest
precApp = between precLam highest
precArr = lowest
-- Associativity: left, none, or right.
data Assoc = L | N | R deriving Eq
-- Wrap a pretty print when the context dictates.
wrap :: Pretty a => Assoc -> a -> a -> String
wrap side ctx x = if prec x `comp` prec ctx
then pp x
else parens . pp $ x
where
comp = if side == assoc x || assoc x == N
then (>=)
else (>)
parens s = "(" ++ s ++ ")"
And then we define the individual pretty printers:
class Pretty t where
pp :: t -> String
prec :: t -> Prec
prec _ = highest
assoc :: t -> Assoc
assoc _ = N
instance Pretty Ty where
pp (TyVar v) = v
pp t@(t1 :->: t2) = wrap L t t1 ++ " {\\to} " ++ wrap R t t2
prec (_ :->: _) = precArr
prec _ = highest
assoc (_ :->: _) = R
assoc _ = N
instance Pretty Tm where
pp (TmVar v) = v
pp (Lam x t e) = "\\lambda " ++ x ++ " {:} " ++ pp t ++ " . " ++ pp e
pp e@(e1 :#: e2) = wrap L e e1 ++ " " ++ wrap R e e2
prec (Lam {}) = precLam
prec (_ :#: _) = precApp
prec _ = highest
assoc (_ :#: _) = L
assoc _ = N
instance Pretty Ctx where
pp [] = "\\cdot"
pp ctx@(_:_) =
intercalate " , " [ x ++ " {:} " ++ pp t | (x,t) <- ctx ]
By adding a "mode" argument, it would be easy to use the same pretty
printer to print plain ASCII, which would be useful with other
call-trace post processors, such as the (unfinished) UnixTree
processor.
Parsing
A parser is not essential to the example, but of course I did not enter the
example input terms directly as Haskell Tms.
Recall the STLC grammar in ASCII:
-- Terms
e ::= x | \x:T.e | e e
-- Types
T ::= A | T -> T
-- Contexts
C ::= . | C,x:T
This grammar is ambiguous: both the term application e e
and function type T -> T have no associativity given by the
grammar. But in STLC term application is left associative and function
types are right associative, and so the corresponding disambiguated
grammar we actually parse is
-- Terms
e ::= e' | \x:T.e | e e'
e' ::= x | ( e )
-- Types
T ::= T' | T' -> T
T' ::= A | ( T )
-- Contexts
C ::= . | C,x:T
The parser is maybe too simple -- I'm not using a languageDef and
it's whitespace sensitive -- but it gets the job done:
type P a = Parsec String () a
parens :: P a -> P a
parens = Text.Parsec.between (char '(') (char ')')
tmVar , tyVar :: P String
tmVar = (:[]) <$> lower
tyVar = (:[]) <$> upper
tyAtom , arrs , ty :: P Ty
tyAtom = parens ty
<|> TyVar <$> tyVar
arrs = chainr1 tyAtom arrOp where
arrOp = string "->" *> pure (:->:)
ty = arrs
tmAtom , apps , lam , tm :: P Tm
tmAtom = parens tm
<|> TmVar <$> tmVar
apps = chainl1 tmAtom appOp where
appOp = pure (:#:)
lam = uncurry Lam <$> (char '\\' *> typing)
<*> (char '.' *> tm)
tm = apps <|> lam
typing :: P (TmVar , Ty)
typing = (,) <$> tmVar
<*> (char ':' *> ty)
ctx :: P Ctx
ctx = typing `sepBy` (char ',')
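A quick sanity check in GHCi (assuming Text.Parsec's parse and ParseError are in scope; parseTm is just a convenience wrapper):
parseTm :: String -> Either ParseError Tm
parseTm = parse tm ""
-- ghci> parseTm "\\x:P.x"
-- Right (Lam "x" (TyVar "P") (TmVar "x"))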
To clarify what the input terms look like, here are the examples from
the Makefile:
# OUTFILE CONTEXT TERM
./tm2latex.sh S.ctx 'x:P->Q->R,y:P->Q,z:P' 'xz(yz)'
./tm2latex.sh S.lam '' '\x:P->Q->R.\y:P->Q.\z:P.xz(yz)'
./tm2latex.sh S.err '' '\x:P->Q->R.\y:P->Q.\z:P.xzyz'
./tm2latex.sh K.ctx 'x:P,y:Q' 'x'
./tm2latex.sh K.lam '' '\x:P.\y:Q.x'
./tm2latex.sh I.ctx 'x:P' 'x'
./tm2latex.sh I.lam '' '\x:P.x'
./tm2latex.sh MP.ctx 'x:P,y:P->Q' 'yx'
./tm2latex.sh MP.lam '' '\x:P.\y:P->Q.yx'
./tm2latex.sh ZERO '' '\s:A->A.\z:A.z'
./tm2latex.sh SUCC '' '\n:(A->A)->(A->A).\s:A->A.\z:A.s(nsz)'
./tm2latex.sh ADD '' '\m:(A->A)->(A->A).\n:(A->A)->(A->A).\s:A->A.\z:A.ms(nsz)'
./tm2latex.sh MULT '' '\m:(A->A)->(A->A).\n:(A->A)->(A->A).\s:A->A.\z:A.m(ns)z'
./tm2latex.sh Y.err '' '\f:A->A.(\x:A.f(xx))(\x:A.f(xx))'
./tm2latex.sh Y.ctx 'a:A->(A->A),y:(A->A)->A' '\f:A->A.(\x:A.f(axx))(y(\x:A.f(axx)))'
Latex document generation
The ./tm2latex.sh script just calls pdflatex on the output of the
Haskell programs described above. The Haskell programs produce the proof
tree and then wrap it in a minimal Latex document:
unlines
[ "\\documentclass[10pt]{article}"
, "\\usepackage{proof}"
, "\\usepackage{amsmath}"
, "\\usepackage[landscape]{geometry}"
, "\\usepackage[cm]{fullpage}"
-- The most slender font I could find:
-- http://www.tug.dk/FontCatalogue/iwonalc/
, "\\usepackage[light,condensed,math]{iwona}"
, "\\usepackage[T1]{fontenc}"
, "\\begin{document}"
, "\\tiny"
, "\\[" ++ tex ++ "\\]"
, "\\end{document}"
]
As you can see, most of the Latex is devoted to making the proof trees
as small as possible; I plan to also write an ASCII proof tree post
processor, which may be more useful in practice when the examples are
larger.
Conclusion
As always, it takes a bit of code to write a parser, type checker, and
pretty printer. On top of that, adding proof-tree generation is
pretty simple in both versions. This is a fun toy example, but I
expect to do something similar in the context of a "real"
unification-based type checker for a dependently-typed language; there
I expect call tracing and proof-tree generation (in ASCII) to provide
significant help in debugging the type checker.
When I manually write a simple Show instance for the PhisicalCell datatype, the program consumes all available memory. When deriving Show instead, this doesn't happen. Why?
Here is a stripped-down version of the code I'm writing:
import Data.Array
type Dimensions = (Int, Int)
type Position = (Int, Int)
data PipeType = Vertical | Horizontal | UpLeft | UpRight | DownLeft | DownRight deriving (Show)
data PhisicalCell = AirCell
| PipeCell PipeType
| DeathCell
| RecipientCell Object
-- deriving (Show) SEE THE PROBLEM BELOW
data Object = Pipe { pipeType :: PipeType -- type of pipe
, position :: Position -- position of the pipe
, movable :: Bool -- whether it can be moved
}
| Bowl { position :: Position -- position of the top-left corner
, dimensions :: Dimensions -- dimensions (horizontal, vertical)
, waterMax :: Int -- maximum amount of water it can hold
, waterStart :: Int -- how much water it starts with
, hatch :: Maybe Position -- an optional outlet cell
, sourceIn :: [Position] -- positions where the water enters
, movable :: Bool -- whether it can be moved
}
| Death
deriving (Show)
data Level = Level Dimensions [Object]
type LevelTable = Array Dimensions PhisicalCell
-- HERE IS THE PROBLEM --
instance Show PhisicalCell where
show AirCell = " "
show (PipeCell _) = "P"
show DeathCell = "X"
show (RecipientCell _) = "U"
both :: (a -> b) -> (a,a) -> (b,b)
both f (a,b) = (f a, f b)
levelTable :: Level -> LevelTable
levelTable (Level dim _) = initial
where initial = array ((0,0), both (+1) dim) $
[((x,y), AirCell) | x <- [1..fst dim], y <- [1..snd dim] ]
++ [((x,y), DeathCell) | x <- [0..fst dim + 1], y <- [0, snd dim + 1]]
++ [((x,y), DeathCell) | x <- [0, fst dim + 1], y <- [0..snd dim + 1]]
main = print $ levelTable (Level (8,12) [])
The Show type class has mutually referencing default implementations:
class Show a where
-- | Convert a value to a readable 'String'.
--
-- 'showsPrec' should satisfy the law
-- ...
...
showsPrec _ x s = show x ++ s
show x = shows x ""
showList ls s = showList__ shows ls s
...
shows :: (Show a) => a -> ShowS
shows = showsPrec 0
So if you declare a Show instance without defining any of the methods,
instance Show A where
nextNewFunction :: Bla
...
GHC will happily compile all the default ones, so there won't be any errors. However, as soon as you try to use any of them, you're trapped in a loop as deadly as your Objects... and the mutual recursion will eventually blow the stack.
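A minimal standalone demonstration of that loop (Empty is just a dummy type for illustration):
data Empty = Empty
-- Compiles (with a warning about missing methods), but defines none of
-- Show's methods, so only the mutually recursive defaults remain.
instance Show Empty
main :: IO ()
main = putStrLn (show Empty)   -- loops: show -> showsPrec -> show -> ...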
Now, your code doesn't quite look as if you have such an empty instance Show declaration, but in fact you do: because of the wrong indentation, the show you define there is recognised as a new free top-level function that merely happens to have the same name as GHC.Show.show. You could add
show :: PhisicalCell -> String
to your file and get the same result as now.
The problem does indeed lie in the spacing that Sassa NF points out. When I indent the show, it works (and when I don't, I get the stack overflow). Without the indent, you're defining a top-level show function that is never used, and the Show instance of PhisicalCell is left without a show definition of its own, which causes the problem.
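For reference, the fixed instance, with the show equations indented so they belong to the instance:
instance Show PhisicalCell where
    show AirCell           = " "
    show (PipeCell _)      = "P"
    show DeathCell         = "X"
    show (RecipientCell _) = "U"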