The ANTLR4 book includes this example, in LabeledExpr.g4
https://github.com/jszheng/py3antlr4book/blob/master/04-Calc/LabeledExpr.g4
with these lines:
expr: expr op=('*'|'/') expr # MulDiv
| expr op=('+'|'-') expr # AddSub
... ;
but I can't find documentation on the purpose and meaning of op=('*'|'/').
Would it be equivalent to the following?
expr: expr opmult expr # MulDiv
| expr opplus expr # AddSub
... ;
opmult : ('*'|'/') ;
opplus : ('+'|'-') ;
It’s “similar”, but not the same.
The op=('*'|'/') just means that, if the '*' or '/' token is encountered, it will be available through an op property on the context. A label like this can be attached to any part of a parser rule; it simply lets you name parts of your rule so that your context object is easier to work with in your code.
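For example, with the Python 3 target that the linked book uses, the labeled token shows up as an op attribute on the MulDiv context, so a visitor method can dispatch on it directly. A sketch, assuming the book's MUL lexer rule and the generated LabeledExprParser, with the method living in a visitor subclass:

# Inside a LabeledExprVisitor subclass; ctx.op is the labeled token.
def visitMulDiv(self, ctx):
    left = self.visit(ctx.expr(0))            # evaluate left operand
    right = self.visit(ctx.expr(1))           # evaluate right operand
    if ctx.op.type == LabeledExprParser.MUL:  # which operator was it?
        return left * right
    return left / right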
The separate-rule version
expr: expr opmult expr # MulDiv
| expr opplus expr # AddSub
... ;
opmult : ('*'|'/') ;
opplus : ('+'|'-') ;
will parse the same input, but it will put the '*' or '/' token into a separate child context object.
Look at your generated code each way and the difference will be obvious. (Maybe even try the first rule as just expr ('*'|'/') expr. And think about the code you’d need to determine which operator was used, compared to the code when you have the op=.)
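To make that concrete, here is roughly how you would read the operator from a Python visitor in each case (label and rule names taken from the grammars above; getChild/getText are standard parse-tree methods):

op = ctx.op.text                 # op=('*'|'/'): the token is a named attribute
op = ctx.opmult().getText()      # separate opmult rule: ask the child context
op = ctx.getChild(1).getText()   # no label at all: fish it out by position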
Related
I am working with ANTLR4 at the moment, and I'm confused by one example. I have to calculate the value of expressions in prefix notation, which uses the following notation:
ADD expr expr OR
SUB expr expr OR
MUL expr expr OR
DIV expr expr OR
Integer OR
Double
(also every expression needs to have ';' at the end of it).
I have written the grammar and regular expressions for this, but one of my professor's test examples is ADD 1 2 SUB 1;, which shouldn't even belong to this grammar, right? The SUB operation doesn't have two expressions on its right side. I would be grateful if someone could confirm this for me.
PS: I didn't post the code because it works for the other examples; just this one reports the error "mismatched input on SUB ';'".
If your expr rule is
expr : 'ADD' expr expr
| 'SUB' expr expr
| 'MUL' expr expr
| 'DIV' expr expr
| Integer
| Double
;
then yes, ADD 1 2 SUB 1 would not match it.
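That also explains the error in your PS. Since every expression must be terminated by ';', your start rule presumably looks something like

prog : expr ';' ;

(the rule name is a guess). The parser matches ADD 1 2 as a complete expr and then expects ';', but finds SUB instead, hence the "mismatched input" at SUB. It never even gets as far as noticing that SUB is missing a second operand.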
I've specified precedence and associativity like this:
expr
: LB expr RB
| <assoc=right> (SUB | NOT) expr
| expr op=(MULTI | DIV | MOD | AND) expr
| expr op=(ADD | SUB | OR) expr
| expr comparator expr
| expr op=(ANDTHEN | ORELSE) expr
| INTLIT
;
But it also accepts ( 1 and 2 ). I want an expression to be either integer-only (i.e., using just + - * /) or boolean-only (AND, OR). How can I do that?
That's not a precedence issue, it's a type issue and should thus be handled by the type checker.
You might be tempted to separate your grammar into rules such as integerExpression and booleanExpression, and it's certainly possible to create a grammar that rejects 1 and 2 this way. But this approach makes your grammar needlessly complicated and will reach its limits once your language becomes even slightly more powerful. When you introduce variables, for example, you'd want to allow a and b if and only if a and b are both Boolean variables, but that's not something you can tell just by looking at the expression. So in that scenario (and many others), you'll need Java (or whichever language you're using) code to check the types anyway.
So in conclusion, you should leave your grammar as-is and reject 1 and 2 in the type checker.
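As a minimal sketch of what that type checker can look like, written here in Python over a made-up AST (in an ANTLR project you would more likely implement it as a visitor over the parse tree, and since the grammar above only has INTLIT literals, boolean values would come from your comparison operators):

from dataclasses import dataclass

@dataclass
class IntLit:           # hypothetical AST node for integer literals
    value: int

@dataclass
class BinOp:            # hypothetical AST node for binary operators
    op: str
    left: object
    right: object

ARITH = {'+', '-', '*', '/', '%'}   # operators on integers
LOGIC = {'and', 'or'}               # operators on booleans

def check(node):
    """Return the type of an expression ('int' or 'bool') or raise."""
    if isinstance(node, IntLit):
        return 'int'
    lt, rt = check(node.left), check(node.right)
    if node.op in ARITH and lt == rt == 'int':
        return 'int'
    if node.op in LOGIC and lt == rt == 'bool':
        return 'bool'
    raise TypeError(f"cannot apply '{node.op}' to {lt} and {rt}")

print(check(BinOp('+', IntLit(1), IntLit(2))))   # int
check(BinOp('and', IntLit(1), IntLit(2)))        # raises TypeError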
I'm taking a compilers class and decided to do it in Haskell, but I'm having a hard time setting up the AST. My issue is that I have an Atom type and an Expr type, and one case of Expr can be an Atom, but when an Expr is immediately an Atom there's a problem. Here is the example:
data Atom              -- cannot be reduced farther
  = Const Int          -- int is value
  | Var String         -- string is name
  deriving (Show)      -- So we can print it

data Expr              -- add, input and the like
  = Add Expr Expr      -- add is two exprs
  | USub Expr          -- negation
  | Input              -- call to input
  | Atomic Atom        -- or an atomic
  deriving (Show)      -- So we can print it

data Statement
  = Print Expr
  | Discard Expr
  | Assign String Expr
  deriving (Show)      -- So we can print it
main = do
  let test5 = Print (Const 2)
  putStrLn $ show test5
The compiler fails on the Print (Const 2) because it expected an Expr. Is there a fix for this, and is there better vocabulary for expressing this problem?
Const 2 is an Atom, but Print takes an Expr as an argument. Luckily, every Atom can be made into an Expr with the Atomic constructor. So:
main = do
  let test5 = Print (Atomic (Const 2))
  print test5
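If the explicit wrapping gets noisy, you can also define small helpers (the names here are made up):

constE :: Int -> Expr
constE = Atomic . Const

varE :: String -> Expr
varE = Atomic . Var

so the example becomes Print (constE 2).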
I'm currently trying to build a lambda calculus solver, and I'm having a slight problem with constructing the AST. A lambda calculus term is inductively defined as:
1) A variable
2) A lambda, a variable, a dot, and a lambda expression.
3) A bracket, a lambda expression, a lambda expression and a bracket.
What I would like to do (and at first tried) is this:
data Expr =
    Variable
  | Abstract Variable Expr
  | Application Expr Expr
Now obviously this doesn't work, since Variable is not a type, and Abstract Variable Expr expects types. So my hacky solution to this is to have:
type Variable = String

data Expr =
    Atomic Variable
  | Abstract Variable Expr
  | Application Expr Expr
Now this is really annoying, since I don't like having Atomic Variable on its own while Abstract takes a string rather than an Expr. Is there any way I can make this more elegant and do it like the first solution?
Your first solution is simply an erroneous definition without meaning. Variable is not a type there; it's a nullary value constructor. You can't refer to Variable in a type definition, just as you can't refer to any other value, like True, False or 100.
The second solution is in fact the direct translation of something we could write in BNF:
var ::= <string>
term ::= λ <var>. <term> | <term> <term> | <var>
And thus there is nothing wrong with it.
What you really want is some type like
data Expr
  = Atomic Variable
  | Abstract Expr Expr
  | Application Expr Expr
but with the first Expr of the Abstract constructor constrained to be only Atomic. There is no straightforward way to do this in plain Haskell, because a value of a given type can be created by any of that type's constructors. So the usual approach is to make a separate data type, or a type alias for an existing type (like your Variable type alias), and move the common logic into it. Your solution with Variable looks perfectly fine to me.
But you can use more advanced Haskell features to achieve your goal in a different way. Take inspiration from the glambda package, which uses a GADT to create a typed lambda calculus. Also see this answer: https://stackoverflow.com/a/39931015/2900502
Here is a solution that achieves your minimal goal (if you only want to constrain the first argument of Abstract):
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
data AtomicT
data AbstractT
data ApplicationT
data Expr :: * -> * where
  Atomic      :: String -> Expr AtomicT
  Abstract    :: Expr AtomicT -> Expr a -> Expr AbstractT
  Application :: Expr a -> Expr b -> Expr ApplicationT
And the next example works fine:
ex1 :: Expr AbstractT
ex1 = Abstract (Atomic "x") (Atomic "x")
But this example won't compile, because of a type mismatch: ex1 has type Expr AbstractT, while Abstract requires its first argument to have type Expr AtomicT:
ex2 :: Expr AbstractT
ex2 = Abstract ex1 ex1
While writing an interpreter in Haskell for a separate, simple programming language, I find myself butting my head against a wall as I learn how typing works in Haskell.
I have two custom data types
data Expr
  = Var Var
  | NumE Int
  | NilE
  | ConsE Expr Expr
  | Plus Expr Expr
  | Minus Expr Expr
  | Times Expr Expr
  | Div Expr Expr
  | Equal Expr Expr
  | Less Expr Expr
  | Greater Expr Expr
  | Not Expr
  | Isnum Expr
  | And Expr Expr
  | Or Expr Expr
  | Head Expr
  | Tail Expr
  | Call String
  deriving (Show, Read)

data Val = Num Int | Nil | Cons Val Val
  deriving (Eq, Show, Read)
and I'm starting to write the cases for interpreting these options, with the function interpret_expr
interpret_expr :: Vars -> Expr -> Val
interpret_expr vars@(Vars a b c d) (NumE integer) = integer
but this COMPLAINS that it couldn't match expected type 'Val' with actual type 'Int' in the expression 'integer'. But say I change it to something silly like
interpret_expr :: Vars -> Expr -> Val
interpret_expr vars@(Vars a b c d) (NumE 'a') = 'a'
it then complains at 'a' that it can't match expected type 'Int' with actual type 'Char'. NOW IT WANTS AN INT?????? I really don't know what to say, I really thought it would be as simple as providing NumE with a variable it could figure is an integer. What am I doing wrong?
In the first case you are returning an Int from a function you declared to return a Val. From your definition of Val it looks like you probably want to return Num integer here.
In the second case the problem is in the pattern matching. (NumE 'a') is an error because NumE is defined as NumE Int, so it must be followed by an Int, not a Char.
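Concretely, the first clause should read (keeping your as-pattern; Vars is assumed from the rest of your code):

interpret_expr :: Vars -> Expr -> Val
interpret_expr vars@(Vars a b c d) (NumE integer) = Num integer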