How to get the declaration of a function using `reify`?

Function reify allows me to look up information about a given name. For a function the returned value is VarI:
data Info = ... | VarI Name Type (Maybe Dec) Fixity | ...
Here I can examine the function's type, and I'd also like to examine its declaration. However, in the 3rd argument to VarI I always see Nothing. Is there a way to get the function's declaration?

From the Template Haskell docs on the VarI constructor of Info:
A "value" variable (as opposed to a type variable, see TyVarI).
The Maybe Dec field contains Just the declaration which defined the variable -- including the RHS of the declaration -- or else Nothing, in the case where the RHS is unavailable to the compiler. At present, this value is always Nothing: returning the RHS has not yet been implemented because of lack of interest.
Looking at the GHC source mirror on GitHub, the string VarI appears only twice, both times in compiler/typecheck/TcSplice.lhs, in the implementation of the reifyThing function:
reifyThing :: TcTyThing -> TcM TH.Info
-- The only reason this is monadic is for error reporting,
-- which in turn is mainly for the case when TH can't express
-- some random GHC extension
reifyThing (AGlobal (AnId id))
  = do  { ty <- reifyType (idType id)
        ; fix <- reifyFixity (idName id)
        ; let v = reifyName id
        ; case idDetails id of
            ClassOpId cls -> return (TH.ClassOpI v ty (reifyName cls) fix)
            _             -> return (TH.VarI v ty Nothing fix)
        }
reifyThing (AGlobal (ATyCon tc)) = reifyTyCon tc
reifyThing (AGlobal (ADataCon dc))
  = do  { let name = dataConName dc
        ; ty <- reifyType (idType (dataConWrapId dc))
        ; fix <- reifyFixity name
        ; return (TH.DataConI (reifyName name) ty
                              (reifyName (dataConOrigTyCon dc)) fix)
        }
reifyThing (ATcId {tct_id = id})
  = do  { ty1 <- zonkTcType (idType id) -- Make use of all the info we have, even
                                        -- though it may be incomplete
        ; ty2 <- reifyType ty1
        ; fix <- reifyFixity (idName id)
        ; return (TH.VarI (reifyName id) ty2 Nothing fix) }
reifyThing (ATyVar tv tv1)
  = do  { ty1 <- zonkTcTyVar tv1
        ; ty2 <- reifyType ty1
        ; return (TH.TyVarI (reifyName tv) ty2) }
reifyThing thing = pprPanic "reifyThing" (pprTcTyThingCategory thing)
As the Template Haskell docs say, the value used for that field is always Nothing.
Digging deeper, this code was added in 2003, in what looks like a rewrite of the reify system. So there does appear to be little interest in getting it working, since that field has always had the value Nothing for more than 10 years. So I'm guessing that if you want the feature you will have to implement it yourself (or propose a good use case on the GHC development mailing list that would encourage someone else to do it).
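A minimal sketch of how to check this yourself. It assumes template-haskell 2.11 or later, where VarI has three fields because the Fixity field shown in the question was dropped from it. The name reifiedMap is made up for the illustration. The splice runs reify at compile time and records whether the Maybe Dec field is populated:

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Data.Maybe (isJust)
import Language.Haskell.TH

-- (base name of the reified variable, whether reify returned its declaration).
-- Hypothetical helper: the pair is computed at compile time by the splice.
reifiedMap :: (String, Bool)
reifiedMap =
  $( do info <- reify 'map
        case info of
          -- Assumes the three-field VarI of template-haskell >= 2.11.
          VarI name _ty mdec ->
            let n      = nameBase name
                hasDec = isJust mdec
            in  [| (n, hasDec) |]
          _ -> fail "expected a VarI" )
```

If the documentation quoted above still holds for your compiler, the second component should come out as False.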

Related

Debugging and understanding "tying the knot" in a monadic context

I'm trying to implement an interpreter for a programming language with lazy-binding in Haskell.
I'm using the tying-the-knot pattern to implement the evaluation of expressions. However I found it extremely hard to debug and to reason about. I spent at least 40 working on this. I learned a lot about laziness and tying-the-knot, but I haven't reached a solution yet and some behaviors still puzzle me.
Questions
Is there a sensible way to debug the knot and figure out what causes it to bottom?
The GHC stack trace (printed when using profiling options) shows which function inside the knot triggers a loop. But that's not helpful: I need to understand what makes the knot strict in the knot's definition, and I couldn't find a way to show this.
It's been really hard to understand why the knot bottoms, and I don't think it will be much easier the next time I have to debug something like this.
How should I tie the knot in a monadic context? I learned that a function like traverse is strict for most types and this causes the knot to bottom.
The only solution I can think of, is to remove the knot. That would increase the problem's complexity (every value would need to be re-computed every time), although this issue can be resolved by caching the value in a STRef: that's exactly what I would do in a strict language. I would prefer to avoid this solution and take advantage of Haskell's laziness, otherwise what's the point of it?
In the code I provide later in this post, why does evalSt e1 terminate, while evalSt e2 doesn't? I still can't understand what the difference is.
Language's AST
I tried to simplify my AST as much as possible, and this is the most minimal definition I could come up with:
data Expr = Int Int | Negate Expr | Id String | Obj (M.Map String Expr)
  deriving (Eq, Ord, Show)

pprint :: Expr -> String
pprint e = case e of
  Int i -> show i
  Negate i -> "(-" ++ pprint i ++ ")"
  Id i -> i
  Obj obj -> "{" ++ intercalate ", "
    [ k ++ ":" ++ pprint v | (k,v) <- M.toList obj ] ++ "}"
Example programs
Here are a couple of example expressions represented with the AST above:
-- expression: {a:{aa1:(-b), aa2:ab, ab:(-b)}, b:3}
-- should evaluate to: {a:{aa1:-3, aa2:-3, ab:-3 }, b:3}
e1 = Obj $ M.fromList [
  ("a", Obj $ M.fromList [
    ("aa1", Negate $ Id "b"),
    ("aa2", Id "ab"),
    ("ab", Negate $ Id "b")
  ]),
  ("b", Int 3)
  ]

-- expression: {a:{aa:(-ab), ab:b}, b:3}
-- should evaluate to: {a:{aa:-3, ab:3}, b:3}
e2 = Obj $ M.fromList [
  ("a", Obj $ M.fromList [
    ("aa", Negate $ Id "ab"),
    ("ab", Id "b")
  ]),
  ("b", Int 3)
  ]
Pure eval function
I have then defined a function to evaluate an expression. This is the most simple definition I could write:
type Scope = M.Map String Expr

eval :: Scope -> Expr -> Expr
eval scope expr = case expr of
  Int i -> Int i
  Id str -> case M.lookup str scope of
    Just e -> e
    Nothing -> error $ str ++ " not in scope"
  Negate aE -> do
    case (eval scope aE) of
      Int i -> Int $ -i
      _ -> error $ "Can only negate ints. Found: " ++ pprint aE
  Obj kvMap -> Obj $
    let resMap = fmap (eval (M.union resMap scope)) kvMap
    in resMap
Tying the knot
The most interesting part in the eval function is the tying the knot in the Obj kvMap case:
  let resMap = fmap (eval (M.union resMap scope)) kvMap
  in resMap
The idea is that in order to compute the expressions in kvMap, the identifiers need to be able to access both the values in scope and the results of the expressions in kvMap. The computed values are resMap, and to compute them we use the scope resMap ⋃ scope.
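The same trick in miniature, as a sketch that is not part of the interpreter (the name table is made up): a lazy Data.Map whose values are defined in terms of lookups in the map itself. It relies on Data.Map being lazy in its values, so each lookup only happens when the corresponding value is demanded.

```haskell
import qualified Data.Map as M

-- The values of "b" and "c" refer back to the very map being defined.
-- This is fine because only the keys are needed to build the map;
-- the values stay unevaluated until someone asks for them.
table :: M.Map String Int
table = M.fromList
  [ ("a", 3)
  , ("b", negate (table M.! "a"))
  , ("c", table M.! "b" + 10)
  ]
```

In GHCi, table M.! "c" evaluates to 7.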
It works!
This eval function works as expected:
GHCi> pprint $ eval M.empty e1
"{a:{aa1:-3, aa2:-3, ab:-3}, b:3}"
GHCi> pprint $ eval M.empty e2
"{a:{aa:-3, ab:3}, b:3}"
Monadic evaluation
The limitation of the eval function above, is that it's pure. In some cases I need to evaluate expressions in a monadic context. For instance I may need IO to offer non-pure functions to the guest language.
I've implemented dozens of versions of eval (both monadic, using RecursiveDo, and of various degrees of purity) in an attempt to understand the issues. I'm presenting the two most interesting ones:
Passing the scope through a State monad
evalSt' :: Expr -> State Scope Expr
evalSt' expr = do
  scope <- get
  case expr of
    Int i -> pure $ Int i
    Id str -> case M.lookup str scope of
      Just e -> pure e
      Nothing -> error $ str ++ " not in scope"
    Negate aE -> do
      a <- evalSt' aE
      case a of
        Int i -> pure $ Int $ -i
        _ -> error $ "Can only negate ints. Found: " ++ pprint aE
    Obj obj -> mdo
      put $ M.union newScope scope
      newScope <- traverse evalSt' obj
      put scope
      pure $ Obj newScope

evalSt scope expr = evalState (evalSt' expr) scope
This function is able to evaluate the program e1, but it bottoms (never returns) on e2:
GHCi> pprint $ evalSt M.empty e1
"{a:{aa1:-3, aa2:-3, ab:-3}, b:3}"
GHCi> pprint $ evalSt M.empty e2
"{a:{aa:
I still don't understand how it can compute e1, since it does contain Ids: isn't that program strict on the scope, and shouldn't it make evalSt bottom? Why doesn't it? And what's different in e2 that causes the function not to terminate?
Evaluating in the IO monad
evalM :: Scope -> Expr -> IO Expr
evalM scope expr = case expr of
  Int i -> pure $ Int i
  Id str -> case M.lookup str scope of
    Just e -> pure e
    Nothing -> error $ str ++ " not in scope"
  Negate aE -> do
    a <- evalM scope aE
    case a of
      Int i -> pure $ Int $ -i
      _ -> error $ "Can only negate ints. Found: " ++ pprint aE
  Obj kvMap -> mdo
    resMap <- traverse (evalM (M.union resMap scope)) kvMap
    pure $ Obj resMap
This function always bottoms (never returns) on every program that uses at least one Id node. Even just {a:1, b:a}.
Scroll back to the top for the questions :-)

How should I tie the knot in a monadic context?
Your pure evaluation function relies on there being no evaluation order in the semantics of Haskell, so that thunks get forced only when needed. In contrast, most effects are fundamentally ordered, so there is an incompatibility there.
Some monads are lazier than the others, and for those you can get some result out of making your evaluation function monadic, as you've seen with evalSt e1. The two most common lazy monads are Reader and lazy State (which is the one you get from Control.Monad.State, as opposed to Control.Monad.State.Strict).
But for other effects, such as IO, you must control evaluation order explicitly, and that means implementing the cache for lazy evaluation explicitly (via STRef for example), instead of implicitly relying on Haskell's own runtime.
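A rough sketch of what such an explicit cache could look like in IO. This is one possible design, not code from the question: it uses IORef rather than STRef because the evaluator lives in IO, and the names Cell, Pending, Blackhole, Done, force and evalIO are all invented here. Each object field gets a mutable cell that starts out as an unevaluated expression and is overwritten with its value the first time it is demanded.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.Map as M

data Expr = Int Int | Negate Expr | Id String | Obj (M.Map String Expr)
  deriving (Eq, Show)

-- Hypothetical cell states: not yet evaluated, being evaluated, evaluated.
data Cell = Pending Scope Expr | Blackhole | Done Expr

type Scope = M.Map String (IORef Cell)

-- Evaluate a cell at most once and cache the result in place.
force :: IORef Cell -> IO Expr
force ref = do
  cell <- readIORef ref
  case cell of
    Done v       -> pure v
    Blackhole    -> error "cyclic definition"
    Pending sc e -> do
      writeIORef ref Blackhole          -- detect self-dependency
      v <- evalIO sc e
      writeIORef ref (Done v)
      pure v

evalIO :: Scope -> Expr -> IO Expr
evalIO scope expr = case expr of
  Int i -> pure (Int i)
  Id str -> case M.lookup str scope of
    Just ref -> force ref
    Nothing  -> error (str ++ " not in scope")
  Negate aE -> do
    a <- evalIO scope aE
    case a of
      Int i -> pure (Int (negate i))
      _     -> error "Can only negate ints"
  Obj kvMap -> do
    -- Allocate all cells first so the fields can refer to one another.
    refs <- traverse (const (newIORef Blackhole)) kvMap
    let inner = M.union refs scope
    sequence_ (M.intersectionWith (\ref e -> writeIORef ref (Pending inner e)) refs kvMap)
    Obj <$> traverse force refs

-- The e2 program from the question.
e2 :: Expr
e2 = Obj $ M.fromList
  [ ("a", Obj $ M.fromList [("aa", Negate $ Id "ab"), ("ab", Id "b")])
  , ("b", Int 3)
  ]
```

Here the order of effects is fixed by force, not by the order in which traverse visits the fields, so evalIO M.empty e2 does not depend on "ab" being evaluated before "aa".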
In the code I provide later in this post, why does evalSt e1 terminate, while evalSt e2 doesn't? I still can't understand what the difference is.
To see what is going wrong, unfold traverse evalSt' obj where obj is {aa:(-ab), ab:b}.
traverse evalSt' obj
=
do
  x <- evalSt' (Negate (Id "ab"))
  y <- evalSt' (Id "b")
  pure [("aa", x), ("ab", y)]
=
do
  -- evalSt' (Negate (Id "ab"))
  scope1 <- get -- unused
  a <- evalSt' (Id "ab")
  x <- case a of
    Int i -> pure $ Int $ -i
    _ -> error ...
  -- evalSt' (Id "b")
  scope2 <- get
  y <- case M.lookup "b" scope2 of
    Just e -> pure e
    Nothing -> error ...
  pure [("aa", x), ("ab", y)]
1. We try to print the object e2, which ends up looking at the value of the "aa" field, which is x in the code above.
2. x comes from case a of ..., which needs a.
3. a comes from evalSt' (Id "ab"), which needs the field "ab", which is y (from the knot tying surrounding the traverse evalSt' obj we are looking at).
4. y comes from case M.lookup "b" scope2 of ..., which needs scope2.
5. scope2 comes from get, which gets the output state from the action preceding it, which is evaluating x.
6. We are already trying to evaluate x (from step 2). Hence there is an infinite loop.
This can be fixed by always restoring the state at the end of evalSt' (technically, you only need to do this for Id and Negate, but might as well do it always):
evalSt' e = do
  scope <- get
  v <- case e of ...
  put scope
  pure v
Or use Reader instead, which gives you the power to update state locally for subcomputations, which is exactly what you need here. You can use local to surround traverse evalSt' obj:
newScope <- local (const (newScope `M.union` scope)) (traverse evalSt' obj)
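Spelled out in full, the Reader variant could look roughly like the sketch below. It assumes the Reader monad from mtl (Control.Monad.Reader) and the RecursiveDo extension; evalR is a made-up name, and the error message in the Negate case is shortened so that the block does not need pprint.

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.Reader (Reader, ask, local, runReader)
import qualified Data.Map as M

data Expr = Int Int | Negate Expr | Id String | Obj (M.Map String Expr)
  deriving (Eq, Show)

type Scope = M.Map String Expr

evalR :: Expr -> Reader Scope Expr
evalR expr = do
  scope <- ask
  case expr of
    Int i -> pure (Int i)
    Id str -> case M.lookup str scope of
      Just e  -> pure e
      Nothing -> error (str ++ " not in scope")
    Negate aE -> do
      a <- evalR aE
      case a of
        Int i -> pure (Int (negate i))
        _     -> error "Can only negate ints"
    Obj obj -> mdo
      -- The extended scope is visible only inside the traversal;
      -- there is no state to restore afterwards.
      newScope <- local (const (M.union newScope scope)) (traverse evalR obj)
      pure (Obj newScope)

-- The e2 program from the question.
e2 :: Expr
e2 = Obj $ M.fromList
  [ ("a", Obj $ M.fromList [("aa", Negate $ Id "ab"), ("ab", Id "b")])
  , ("b", Int 3)
  ]
```

Because no field's evaluation can change the environment seen by its siblings, the dependency of y on "the state after x" from the walkthrough above disappears. Try runReader (evalR e2) M.empty in GHCi.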
Is there a sensible way to debug the knot and figure out what causes it to bottom?
I don't have a good answer to this. I'm not familiar with debugging tools in Haskell.
You cannot rely on stack traces because subexpressions may force each other in a rather chaotic order. And there is something interfering with print-debugging (Debug.Trace) that I don't understand. (I would add Debug.Trace.trace (pprint expr) $ do at the beginning of evalSt', but then the trace doesn't make sense to me because things that should be printed once are replicated many times.)

Understanding Purescript Eff Monad and do blocks

I'm trying to understand why the following does not work in Purescript. I have a feeling it can also be answered by the Haskell community, thus I've cross listed it.
The general gist is:
If I have a do block, can I not throw in a disposable value? In this instance, I'm trying to log something (similar to Haskell's print) in the middle of a series of monadic computations.
main = do
  a <- someAction1
  b <- someAction2
  _ <- log "here is a statement I want printed"
  someAction3 a b
Specifically, I have a function which takes the following (from the Halogen example template project)
data Query a = ToggleState a
eval :: Query ~> H.ComponentDSL State Query g
eval (ToggleState next) = do
  H.modify (\state -> state { isOn = not state.isOn })
  _ <- log "updating the state!"
  pure next
In my mind, this should work like in Haskell
barf :: IO Int
barf = do
  _ <- print "Here I am!"
  return 5

main :: IO ()
main = do
  a <- barf
  _ <- print $ "the value is: " ++ (show a)
  print "done"
Specifically, the error that I get is type mismatch of the monads
Could not match type Eff with type Free while trying to match type Eff ( "console" :: CONSOLE | t6 ) with type Free (HalogenFP t0 { "isOn" :: t1 | t2 } t3 t4) ... etc...
I know PureScript makes me declare the "things I'm touching in the monad" (i.e. forall eff. Eff ( a :: SOMEVAR, b :: SOMEOTHERVAR | eff ) Unit), but I'm not sure how to do that in this case...

If you're working with version 0.12.0 of halogen you should be able to use fromEff from https://pursuit.purescript.org/packages/purescript-aff-free/3.0.0/docs/Control.Monad.Aff.Free#v:fromEff like so:
data Query a = ToggleState a
eval :: Query ~> H.ComponentDSL State Query g
eval (ToggleState next) = do
  H.modify (\state -> state { isOn = not state.isOn })
  _ <- H.fromEff (log "updating the state!")
  pure next
This is going to get a lot nicer in upcoming versions of halogen (>= 0.13) in which liftEff should be enough.
The reason you can't just use log right away is that H.ComponentDSL is not a type synonym for Eff but for Free, so you can't simply mix Eff and ComponentDSL actions.
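The same mismatch can be reproduced in Haskell, the other community this question was addressed to. The sketch below is an analogy, not Halogen code: StateT Bool IO stands in for ComponentDSL, putStrLn for log, and liftIO plays the role of fromEff or liftEff. The name toggle is made up.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.State (StateT, execStateT, modify)

toggle :: StateT Bool IO ()
toggle = do
  modify not
  -- putStrLn "..." on its own has type IO (), not StateT Bool IO (),
  -- so it has to be lifted into the surrounding monad first.
  liftIO (putStrLn "updating the state!")
```

Dropping the liftIO gives a type error of the same flavour as the one in the question: the do block expects StateT actions and is handed an IO one.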

Error in the haskell program

I have written the following Haskell program, which does some basic load, read and increment operations. I am getting a type error. Can someone please tell me why the type error occurs and how I can resolve it?
module ExampleProblem (Value,read',load,incr) where
newtype Value a = Value Int deriving (Eq,Read,Show)
read':: Value Int -> Int
read' (Value a) = a
load:: Int -> Value Int
load a = Value a
incr:: Value Int -> Value Int
incr (Value a) = Value (a+1)
main = do
(Value ab) <- (load 42)
if (read'( Value ab) /= 42)
then show "Failure to load"
else do
Value b <- incr( Value ab)
Value c <- incr( Value b)
if ((Value c) == Value 44)
then show "Example finished"
else show "error"
return
The error i get is:
Couldn't match expected type `Int' with actual type `Value t0'
In the pattern: Value ab
In a stmt of a 'do' expression: (Value ab) <- (load 42)
In the expression:
do { (Value ab) <- (load 42);
if (read' (Value ab) /= 42) then
show "Failure to load"
else
do { Value b <- incr (Value ab);
.... } }
And when I put the main function in a separate file, I got a scope error, although I was importing the module ExampleProblem:
Not in scope: data constructor `Value'

It seems you are confused about how to use do-notation. Do-notation is used for composing actions in a monad. In the case of main, that monad will be IO so I'll be sticking with IO to keep things simple.
There are three types of statements you can use in do-notation:
x <- foo binds the result of running the action foo to the pattern x. foo must have type IO Something, and x will then have the corresponding type Something.
let x = foo binds a value without anything special happening. This is the same as = at the top level, except that any preceding bindings are in scope.
foo runs an action of type IO Something. If it's the last statement in the do-block, this becomes the result of the block, otherwise the result is ignored.
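A small illustration of the three forms side by side (demo is a made-up name, unrelated to the code in the question):

```haskell
demo :: IO Int
demo = do
  line <- pure "41"              -- x <- foo: run an action, bind its result
  let n = read line + 1          -- let x = foo: bind a pure value
  putStrLn ("n is " ++ show n)   -- foo: run an action, ignore its result
  pure n                         -- last statement: its result is the block's result
```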
The main problem in your code is that you're using x <- foo statements with things that aren't IO actions. That won't work. Instead, use the let x = foo form.
Secondly, show is also not an IO action. It's just a function that converts stuff to String. You probably meant to use putStrLn which will print a string to standard output.
Thirdly, return is not a statement like in C or Java. It's a function which, given a value, produces an action that does nothing and returns the value. It is often used as the last thing in a do-block when you want it to return a pure value. Here it is unnecessary.
And finally, if you want to run this code, your module must be called Main and it must export the main function. The easiest way of doing this is just to remove the module ... where line, and the name will default to Main. You typically only want this line in the modules of your project which don't contain main.
main = do
  let Value ab = load 42
  if read' (Value ab) /= 42
    then putStrLn "Failure to load"
    else do
      let Value b = incr (Value ab)
      let Value c = incr (Value b)
      if Value c == Value 44
        then putStrLn "Example finished"
        else putStrLn "error"
This should work, but you're needlessly wrapping and unwrapping your values in the Value type. Perhaps you intended something like this:
main = do
  let ab = load 42
  if read' ab /= 42
    then putStrLn "Failure to load"
    else do
      let b = incr ab
      let c = incr b
      if c == Value 44
        then putStrLn "Example finished"
        else putStrLn "error"

I'll start with your second question.
newtype Value a = Value Int deriving (Eq,Read,Show)
This actually creates two Values:
the Value in newtype Value a is a type constructor
the Value in Value Int is a data constructor (usually just call it a constructor)
They are different things!
module ExampleProblem (Value,read',load,incr) where
Here Value means the type constructor. To export the data constructor as well, you need
module ExampleProblem (Value(Value),read',load,incr) where
Now, about your first question.
main must have type IO something
You've set main to be a do block, so that means
everything to the right of an <- must have type IO somethingOrOther
The line with the error message is
(Value ab) <- (load 42)
load 42 has type Value Int, clearly nothing to do with IO, so you get an error.
So how do you fix this?
if a line of code in a do block is not an IO statement, it must be a let statement [1]
Other errors you need to fix:
return doesn't do what it does in every other language.
In particular, it always takes a value to return.
We should have called it something else. Sorry about that.
Imagine it's called pure instead. Anyway, you don't need it here.
to print a String to the screen, you should use putStrLn, not show.
Ghci prints the return value of expressions you give it, but this is a program in its own right,
so you have to do the IO yourself.
then and else need to be indented further than the if (I think this rule is changing, but I don't think it has yet)
So we end up with
main = do
  let Value ab = load 42
  if read' (Value ab) /= 42
    then putStrLn "Failure to load"
    else do
      let Value b = incr (Value ab)
      let Value c = incr (Value b)
      if Value c == Value 44
        then putStrLn "Example finished"
        else putStrLn "error"
Footnotes:
[1] Not strictly true: do blocks can be used for things other than IO. But you'll learn about that later.

One additional remark: You should try to do as much work outside the main-IO thing as possible. In your case it's easy: You have a calculation which takes no argument and produces a String (which should be printed out). If you had different "outcomes", you could use e.g. Either, but here we are fine with just String as return type.
As hammar pointed out, it does not make much sense to pattern match and reconstruct Value all the way; just use the values as they are, and use pattern matching only if you need to access the number inside.
The type Value doesn't need to be polymorphic if it always wraps just an Int, so I dropped the a. Else you have something called a "phantom type", which is possible and sometimes even useful, but definitely not here (of course if you want to be able to wrap arbitrary types, you can write data Value a = Value a).
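For the curious, a small sketch of a case where a phantom parameter does earn its keep. The names Quantity, Meters, Feet, add and total are invented here and have nothing to do with the code in the question. The parameter never appears on the right-hand side, but it stops values with different tags from being mixed:

```haskell
-- Empty types used only as compile-time tags.
data Meters
data Feet

-- 'unit' is a phantom parameter: it does not occur in the representation.
newtype Quantity unit = Quantity Int deriving (Eq, Show)

-- Both arguments must carry the same tag.
add :: Quantity u -> Quantity u -> Quantity u
add (Quantity a) (Quantity b) = Quantity (a + b)

total :: Quantity Meters
total = add (Quantity 40) (Quantity 2)

-- add total (Quantity 1 :: Quantity Feet) is rejected by the type checker.
```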
So here is a version which does only the bare minimum in IO, and keeps everything else pure (and hence flexible, testable etc):
data Value = Value Int deriving (Eq,Read,Show)

read':: Value -> Int
read' (Value a) = a

load:: Int -> Value
load a = Value a

incr:: Value -> Value
incr (Value a) = Value (a+1)

main = putStrLn calc

calc :: String
calc = let ab = load 42
       in if read' ab /= 42 then "Failure to load" else increaseTwice ab

increaseTwice :: Value -> String
increaseTwice v = let b = incr v
                      c = incr b
                  in if c == Value 44 then "Example finished" else "error"
I can't use Haskell here, so I hope this works...

User state in Parsec

I'm parsing an expression using Parsec and I want to keep track of variables in these expressions using the user state in Parsec. Unfortunately I don't really get how to do it.
Given the following code:
import Data.Set as Set
inp = "$x = $y + $z"
data Var = Var String
var = do char '$'
         n <- many1 letter
         let v = Var n
         -- I want to modify the set of variables here
         return v
parseAssignment = ... -- parses the above assignment
run = case runIdentity $ runParserT parseAssignment Set.empty "" inp of
  Left err -> ...
  Right _ -> ...
So, the u in ParsecT s u m a would be Set.Set. But how would I integrate the state update into var?
I tried something like modify $ Set.insert v, but this doesn't work, since Set.Set is not a state monad.

Unfortunately, Yuras' suggestion of updateParserState is not optimal (you'd use that function if you're looking to modify Parsec's internal state as well); instead you should pass a function that works over your custom user state (i.e. of type u -> u) to modifyState, such as in this example:
expr = do
  x <- identifier
  modifyState (+1)
  -- ^ in this example, our type u is Int
  return (Id x)
or use any combination of the getState and putState functions. For your case, you'd do something like:
modifyState (Set.insert v)
See this link for more info.
For a more tutorial-like introduction to working with user state in Parsec, this document, though old, should be relevant.
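Put together for the input in the question, it could look like the sketch below. The assignment grammar is guessed from the single sample string ("$x = $y + $z"), and the names Parser, lexeme and assignment are made up here. Var derives Ord because Set.insert needs it.

```haskell
import qualified Data.Set as Set
import Text.Parsec

newtype Var = Var String deriving (Eq, Ord, Show)

-- The user state u is the set of variables seen so far.
type Parser = Parsec String (Set.Set Var)

-- Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

var :: Parser Var
var = do
  _ <- char '$'
  n <- many1 letter
  let v = Var n
  modifyState (Set.insert v)   -- record the variable in the user state
  return v

-- Hypothetical grammar: $lhs = $v1 + $v2 + ...
-- Returns the final user state, i.e. every variable encountered.
assignment :: Parser (Set.Set Var)
assignment = do
  _ <- lexeme var
  _ <- lexeme (char '=')
  _ <- lexeme var `sepBy1` lexeme (char '+')
  eof
  getState
```

With this, runParser assignment Set.empty "" "$x = $y + $z" yields a Right containing the three variables x, y and z.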

You can use updateParserState

Code generation for compiler in Haskell

I am writing a compiler for a small imperative language. The target language is Java bytecode, and the compiler is implemented in Haskell.
I've written a frontend for the language - i.e I have a lexer, parser and typechecker. I'm having trouble figuring out how to do code generation.
I keep a data structure representing the stack of local variables. I can query this structure with the name of a local variable and get its position in the stack. This data structure is passed around as I walk the syntax tree, and variables are popped and pushed as I enter and exit new scopes.
What I'm having trouble figuring out is how to emit the bytecode. Emitting strings at terminals and concatenating them at higher levels seems like a poor solution, both clarity- and performance-wise.
tl;dr How do I emit bytecode while walking the syntax tree?

My first project in Haskell a few months back was to write a C compiler, and what resulted was a fairly naive approach to code generation, which I'll walk through here. Please do not take this as an example of good design for a code generator, but rather view it as a quick and dirty (and ultimately naive) way to get something that works fairly quickly with decent performance.
I began by defining an intermediate representation LIR (Lower Intermediate Representation) which closely corresponded to my instruction set (x86_64 in my case):
data LIRInst = LIRRegAssignInst LIRReg LIRExpr
             | LIRRegOffAssignInst LIRReg LIRReg LIRSize LIROperand
             | LIRStoreInst LIRMemAddr LIROperand
             | LIRLoadInst LIRReg LIRMemAddr
             | LIREnterInst LIRInt
             | LIRJumpLabelInst LIRLabel
             | LIRIfInst LIRRelExpr LIRLabel LIRLabel -- false, then true
             | LIRCallInst LIRLabel LIRLabel -- method label, return label
             | LIRCalloutInst String
             | LIRRetInst [LIRLabel] String -- list of successors, and the name of the method returning from
             | LIRLabelInst LIRLabel
             deriving (Show, Eq, Typeable)
Next up came a monad that would handle interleaving state throughout the translation (I was blissfully unaware of our friend, the State monad, at the time):
newtype LIRTranslator a = LIRTranslator
  { runLIR :: Namespace -> (a, Namespace) }

instance Monad LIRTranslator where
  return a = LIRTranslator (\s -> (a, s))
  m >>= f = LIRTranslator (\s ->
    let (a, s') = runLIR m s
    in runLIR (f a) s')
along with the state that would be 'threaded' through the various translation phases:
data Namespace = Namespace
  { temp :: Int                          -- id's for new temporaries
  , labels :: Int                        -- id's for new labels
  , scope :: [(LIRLabel, LIRLabel)]      -- current program scope
  , encMethod :: String                  -- current enclosing method
  , blockindex :: [Int]                  -- index into the SymbolTree
  , successorMap :: Map.Map String [LIRLabel]
  , ivarStack :: [(LIRReg, [CFGInst])]   -- stack of ivars (see motioned code)
  }
For convenience, I also specified a series of translator monadic functions, for example:
-- |Increment our translator's label counter
incLabel :: LIRTranslator Int
incLabel = LIRTranslator (\ns@(Namespace{ labels = l }) -> (l, ns{ labels = (l+1) }))
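For comparison, a sketch of the same counter written against the State monad from mtl, which the answer says it had not discovered at the time. Namespace is cut down to two fields here for brevity, and Translator and freshLabels are made-up names.

```haskell
import Control.Monad (replicateM)
import Control.Monad.State (State, evalState, state)

-- Trimmed-down stand-in for the answer's Namespace record.
data Namespace = Namespace
  { temp   :: Int   -- id's for new temporaries
  , labels :: Int   -- id's for new labels
  } deriving (Show, Eq)

type Translator = State Namespace

-- Return the current label id and bump the counter.
incLabel :: Translator Int
incLabel = state (\ns -> (labels ns, ns { labels = labels ns + 1 }))

-- Allocate n consecutive label ids.
freshLabels :: Int -> Translator [Int]
freshLabels n = replicateM n incLabel
```

For example, evalState (freshLabels 3) (Namespace 0 0) gives [0,1,2].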
I then proceeded to recursively pattern-match my AST, fragment-by-fragment, resulting in many functions of the form:
translateBlock :: SymbolTree -> ASTBlock -> LIRTranslator [LIRInst]
translateBlock st (DecafBlock _ [] _) = withBlock (return [])
translateBlock st block =
  withBlock (do b <- getBlock
                let st' = select b st
                declarations <- mapM (translateVarDeclaration st') (blockVars block)
                statements <- mapM (translateStm st') (blockStms block)
                return (concat declarations ++ concat statements))
(for translating a block of the target language's code) or
-- | Given a SymbolTree, Translate a single DecafMethodStm into [LIRInst]
translateStm st (DecafMethodStm mc _) =
  do (instructions, operand) <- translateMethodCall st mc
     final <- motionCode instructions
     return final
(for translating a method call) or
translateMethodPrologue :: SymbolTree -> DecafMethod -> LIRTranslator [LIRInst]
translateMethodPrologue st (DecafMethod _ ident args _ _) =
  do let numRegVars = min (length args) 6
         regvars = map genRegVar (zip [LRDI, LRSI, LRDX, LRCX, LR8, LR9] args)
     stackvars <- mapM genStackVar (zip [1..] (drop numRegVars args))
     return (regvars ++ stackvars)
  where
    genRegVar (reg, arg) =
      LIRRegAssignInst (symVar arg st) (LIROperExpr $ LIRRegOperand reg)
    genStackVar (index, arg) =
      do let mem = LIRMemAddr LRBP Nothing ((index + 1) * 8) qword -- ^ [rbp] = old rbp; [rbp + 8] = ret address; [rbp + 16] = first stack param
         return $ LIRLoadInst (symVar arg st) mem
for an example of actually generating some LIR code. Hopefully these three examples will give you a good starting point; ultimately, you'll want to go slowly, focusing on one fragment (or intermediate type) within your AST at a time.

If you haven't done this before, you can do it in small passes:
1) for every statement, produce some bytecode (without properly addressed memory locations)
2) after that is done, if you have looping, gotos, etc., put in the real addresses (you know them now that you have it all laid out)
3) replace the memory fetches/stores with the correct locations
4) dump it out to a JAR file
Note that this is very simplified and doesn't try to do any performance optimisation. It will give you a functional program which will execute. This also assumes you know the codes for the JVM (which is where I am presuming you are going to execute it.)
To start, just have a subset of the language which does sequential arithmetic statements. This will allow you to figure out how to map variable memory locations to statements via the parse tree. Next add some looping to get jumps to work. Similarly add conditionals. Finally, you can add the final parts of your language.
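Passes 1 and 2 can be sketched as follows. The instruction set is a toy one invented for the illustration (it is not real JVM bytecode, and every instruction is assumed to occupy one slot): the first pass emits instructions that mention labels by name, the second computes where each label ended up and replaces the names with addresses.

```haskell
import qualified Data.Map as M

-- Pass 1 output: jumps refer to symbolic labels (hypothetical instruction set).
data SInstr = SPush Int | SAdd | SLabel String | SGoto String
  deriving (Eq, Show)

-- Pass 2 output: labels are gone, jumps carry absolute positions.
data RInstr = RPush Int | RAdd | RGoto Int
  deriving (Eq, Show)

-- Position of each label, counting only real instructions.
labelAddrs :: [SInstr] -> M.Map String Int
labelAddrs = go 0
  where
    go _ []                = M.empty
    go n (SLabel l : rest) = M.insert l n (go n rest)
    go n (_ : rest)        = go (n + 1) rest

-- Replace label names with addresses; fail on an undefined label.
resolve :: [SInstr] -> Either String [RInstr]
resolve prog = concat <$> mapM step prog
  where
    addrs = labelAddrs prog
    step (SPush n)  = Right [RPush n]
    step SAdd       = Right [RAdd]
    step (SLabel _) = Right []
    step (SGoto l)  =
      maybe (Left ("undefined label: " ++ l)) (\a -> Right [RGoto a]) (M.lookup l addrs)
```

For example, resolve [SGoto "end", SPush 1, SLabel "end", SAdd] gives Right [RGoto 2, RPush 1, RAdd]. Forward jumps work because all label positions are known before any jump is rewritten.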
