Prefix notation in ANTLR4 - antlr4

I am working with ANTLR4 at the moment, and one example confuses me. I have to calculate the value of expressions in prefix notation, which has the following form:
ADD expr expr OR
SUB expr expr OR
MUL expr expr OR
DIV expr expr OR
Integer OR
Double
(also every expression needs to have ';' at the end of it).
I have written a grammar and regular expressions for this, but my professor's test example is ADD 1 2 SUB 1;, which shouldn't even belong to this grammar, right? The SUB operation does not have two expressions to its right. I would be grateful if someone could confirm this for me.
PS. I didn't post the code because it works for the other examples; only this one reports the error "mismatched input on SUB ';'"

If your expr rule is
expr : 'ADD' expr expr
     | 'SUB' expr expr
     | 'MUL' expr expr
     | 'DIV' expr expr
     | Integer
     | Double
     ;
then yes, ADD 1 2 SUB 1 would not match it: ADD 1 2 is already a complete expression, and the trailing SUB 1 has only one operand.
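To see why the input is rejected without running ANTLR, here is a minimal hand-written sketch of the same prefix grammar in Haskell. It is not what ANTLR generates; the names tokenize, evalPrefix and evalStatement are made up for this illustration, and every number is read as a Double for simplicity.

```haskell
import Text.Read (readMaybe)

-- Split the input on whitespace, treating ';' as its own token.
tokenize :: String -> [String]
tokenize = words . concatMap (\c -> if c == ';' then " ; " else [c])

-- Evaluate one prefix expression from the front of the token list.
-- Returns the value together with the tokens that were not consumed.
evalPrefix :: [String] -> Either String (Double, [String])
evalPrefix [] = Left "unexpected end of input"
evalPrefix (tok : rest) = case tok of
  "ADD" -> binary (+)
  "SUB" -> binary (-)
  "MUL" -> binary (*)
  "DIV" -> binary (/)
  _     -> case readMaybe tok of
             Just n  -> Right (n, rest)
             Nothing -> Left ("unexpected token " ++ show tok)
  where
    -- Every operator needs exactly two operand expressions after it.
    binary op = do
      (x, afterX) <- evalPrefix rest
      (y, afterY) <- evalPrefix afterX
      Right (op x y, afterY)

-- A statement is exactly one expression followed by ';'.
evalStatement :: String -> Either String Double
evalStatement src = do
  (value, leftover) <- evalPrefix (tokenize src)
  case leftover of
    [";"] -> Right value
    _     -> Left ("expected ';' but found " ++ show leftover)
```

With this sketch, evalStatement "ADD 1 SUB 5 2;" gives Right 4.0, while evalStatement "ADD 1 2 SUB 1;" gives a Left: ADD has consumed 1 and 2, and the tokens SUB 1 ; are left over where a single ';' was expected.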

Related

Trying to add deriving(Show, Read) to an expression tree

I am fairly new to Haskell so it is probably something simple that I am missing but I have an expression tree that looks like this:
data Expression = Lit Float
                | Add Expression Expression
                | Mul Expression Expression
                | Sub Expression Expression
                | Div Expression Expression
And that code works perfectly fine but then when I try to add a deriving(Show, Read) so that Haskell automatically writes code to read and write elements of this type it throws an error.
This is what I am trying to do.
Lit Float deriving(Show, Read)
I get an error that reads error: parse error on input '|', and now the Add Expression Expression line doesn't work. Could someone point out to me what the error is here?
The deriving clause has to go after the complete type definition:
data Expression = Lit Float
                | Add Expression Expression
                | Mul Expression Expression
                | Sub Expression Expression
                | Div Expression Expression
                deriving (Read, Show)
In what you were trying, presumably
data Expression = Lit Float deriving (Read, Show)
                | Add Expression Expression
                | Mul Expression Expression
                | Sub Expression Expression
                | Div Expression Expression
Haskell comes to the deriving clause and assumes the type definition has finished and something else is coming after. And then the | character makes no sense.
You derive instances - or indeed write your own instances - for a type, not for individual constructors for that type.
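As a quick check that the derived instances cover the whole type, the following sketch shows a value and reads it back. The names example and roundTrip are only for this illustration.

```haskell
data Expression = Lit Float
                | Add Expression Expression
                | Mul Expression Expression
                | Sub Expression Expression
                | Div Expression Expression
                deriving (Show, Read)

-- A sample tree: 1 + (2 * 3)
example :: Expression
example = Add (Lit 1) (Mul (Lit 2) (Lit 3))

-- The derived Read instance parses exactly what the derived Show prints.
roundTrip :: Expression -> Expression
roundTrip = read . show
```

Here show example is "Add (Lit 1.0) (Mul (Lit 2.0) (Lit 3.0))", and read accepts that same string for any of the five constructors because the instances belong to the type as a whole.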

Undocumented operator in ANTLR4 grammar: expr op=('*'|'/') expr

The ANTLR4 book includes this example, in LabeledExpr.g4
https://github.com/jszheng/py3antlr4book/blob/master/04-Calc/LabeledExpr.g4
with these lines:
expr: expr op=('*'|'/') expr # MulDiv
| expr op=('+'|'-') expr # AddSub
... ;
but I can't find documentation explaining the purpose and meaning of op=('*'|'/').
Would that be equivalent to the following?
expr: expr opmult expr # MulDiv
| expr opplus expr # AddSub
... ;
opmult : ('*'|'/') ;
opplus : ('+'|'-') ;
It’s “similar”, but not the same.
The op=('*'|'/') just means that, if the '*' or '/' tokens are encountered, they will be available through an op property. This can be used for any part of a parser rule and just allows you to name parts of your rule so that your context object will be easier to work with in your code.
The
expr: expr opmult expr # MulDiv
| expr opplus expr # AddSub
... ;
opmult : ('*'|'/') ;
opplus : ('+'|'-') ;
will parse the same input, but will put the '*' or '/' token into a separate context object.
Look at your generated code each way and the difference will be obvious. (Maybe even try the first rule as just expr ('*'|'/') expr, and think about the code you’d need to determine which operator was used, compared to the code when you have the op=.)

How to separate precedence and expression in ANTLR4

I've specified precedence and associativity like this:
expr
: LB expr RB
| <assoc=right> (SUB | NOT) expr
| expr op=(MULTI | DIV | MOD | AND) expr
| expr op=(ADD | SUB | OR) expr
| expr comparator expr
| expr op=(ANDTHEN | ORELSE) expr
| INTLIT
;
But it also works on ( 1 and 2 ). I want to represent the expression only for integer (i.e., only work on + - * /) or boolean (AND OR). How can I do that?
That's not a precedence issue, it's a type issue and should thus be handled by the type checker.
You might be tempted to separate your grammar into rules such as integerExpression and booleanExpression, and it's certainly possible to create a grammar that rejects 1 and 2 this way. But this approach makes your grammar needlessly complicated and will reach its limits once your language becomes even slightly more powerful. When you introduce variables, for example, you'd want to allow a and b if and only if a and b are both Boolean variables, but that's not something you can tell just by looking at the expression. So in that scenario (and many others), you'll need Java (or whichever language you're using) code to check the types anyway.
So in conclusion, you should leave your grammar as-is and reject 1 and 2 in the type checker.
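To make the idea of a separate type-checking pass concrete, here is a minimal sketch in Haskell rather than Java. The AST below is hypothetical and much smaller than what the grammar above would produce: Arith stands for the arithmetic operators, Logic for AND/OR, Compare for the comparators, and BoolLit is an assumption, since the original grammar only has INTLIT.

```haskell
data Type = IntT | BoolT
  deriving (Eq, Show)

-- Hypothetical, simplified AST; operator details are omitted on purpose.
data Expr
  = IntLit Int
  | BoolLit Bool
  | Arith Expr Expr    -- e.g. + - * /
  | Logic Expr Expr    -- e.g. and, or
  | Compare Expr Expr  -- e.g. <, >
  deriving Show

-- Compute the type of an expression, or explain why it is ill-typed.
typeOf :: Expr -> Either String Type
typeOf (IntLit _)    = Right IntT
typeOf (BoolLit _)   = Right BoolT
typeOf (Arith l r)   = operands IntT IntT "arithmetic" l r
typeOf (Logic l r)   = operands BoolT BoolT "logical" l r
typeOf (Compare l r) = operands IntT BoolT "comparison" l r

-- Both operands must have type `want`; the result then has type `result`.
operands :: Type -> Type -> String -> Expr -> Expr -> Either String Type
operands want result what l r = do
  lt <- typeOf l
  rt <- typeOf r
  if lt == want && rt == want
    then Right result
    else Left (what ++ " operator expects " ++ show want
               ++ " operands but got " ++ show lt ++ " and " ++ show rt)
```

The grammar still accepts 1 and 2, but typeOf (Logic (IntLit 1) (IntLit 2)) returns a Left with an error message, while typeOf (Logic (Compare (IntLit 1) (IntLit 2)) (BoolLit True)) returns Right BoolT. Variables would fit into the same function by passing an environment of declared types.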

How to translate logical notation to Haskell syntax

I've recently picked up Haskell at uni and I'm working my way through a set of exercises, here's a snippet of one that I can't make sense of:
"Consider the following grammar for a simple, prefix calculator language:"
num ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
int ::= num | num int
expr ::= int | - expr | + expr expr | * expr expr
I'm confused as to how to translate this into Haskell syntax (I'm a complete beginner in both Haskell and functional programming, please be gentle)
I suspect that num, int and expr are all, supposedly, types/values that can be declared using data or type and that they impose constraints on the calculator. However I can't make sense of either: How do I declare type or data(not a variable) for fixed values, namely 0-9? Also, how can I put symbols like - or + in a declaration?
Don't confuse a string in the grammar for the AST that represents it. Compare the string
"+ + 3 4 5"
which is a string in the grammar you've been given with
Plus (Plus (Literal 3) (Literal 4)) (Literal 5)
which would be a sensible Haskell value for the AST that String could get parsed to.
How do I declare type or data(not a variable) for fixed values, namely 0-9?
You can define a type, like
data Digit = Zero | One | Two | Three | Four | Five | Six | Seven | Eight | Nine deriving (Eq, Show)
This represents the num in your problem. Obviously we cannot use 0, 1, 2, 3, ... since they are already interpreted as numbers in Haskell.
Then, you can define
data Number = Single Digit | Many Digit Number deriving (Eq, Show)
which is equivalent to int in your problem. This type represents one (Single ...) or more (Many ...) digits, which together make up one decimal number. For example, with these data types the number 361 would be Many Three (Many Six (Single One)).
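To connect this representation back to ordinary numbers, here is a small sketch that converts a Number to an Integer. The function numberValue is made up for this illustration, and Enum is added to the deriving clause of Digit (an addition to the declaration above) so that fromEnum gives each digit its numeric value.

```haskell
data Digit = Zero | One | Two | Three | Four | Five | Six | Seven | Eight | Nine
  deriving (Eq, Show, Enum)  -- Enum is an addition: fromEnum Zero == 0, ..., fromEnum Nine == 9

data Number = Single Digit | Many Digit Number
  deriving (Eq, Show)

-- Digits are stored most significant first, so accumulate from the left.
numberValue :: Number -> Integer
numberValue = go 0
  where
    go acc (Single d)  = step acc d
    go acc (Many d ds) = go (step acc d) ds
    step acc d = acc * 10 + toInteger (fromEnum d)
```

For the example above, numberValue (Many Three (Many Six (Single One))) evaluates to 361.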
Also, how can I put symbols like - or + in a declaration?
There is no way to put those symbols in type or data declarations. You can, however, use names for the operations, like Neg, Sum and Mul. Since - takes a single expr in the grammar of your problem, the expr rule would translate to
data Expr = Lit Number
          | Neg Expr
          | Sum Expr Expr
          | Mul Expr Expr
          deriving (Eq, Show)
If we had the string "+ - 25 13", which represents an expression in the prefix calculator language of your problem (assuming whitespace separates the tokens), it would be parsed to Sum (Neg (Lit (Many Two (Single Five)))) (Lit (Many One (Single Three))).
If it is just an exercise about modeling data (without code), the answer consists of adding constructor names to your grammar (and changing literal numbers to names). Something like
data Num = Zero | One | Two | Three | Four | Five
         | Six | Seven | Eight | Nine
data Int = Single Num | Multiple Num Int
data Exp = ExpInt Int | ExpMinus Exp | ExpPlus Exp Exp
         | ExpMul Exp Exp
From that, you can write all sorts of code to parse and evaluate expressions.
Years ago, I got clever, and I declared my AST type an instance of Num, Eq and Ord, then defined the mathematical and comparison operators for AST expressions, so that expr1 + expr2 would yield a valid AST. Using sevenj’s declarations, this would be written like (+) x y = Sum x y, where the right-hand side is the constructor of an AST expression. For brevity, one = Lit One and two = Lit Two. Then you might write one + one == two and the operators would generate your AST with the correct precedence. Between that and abuse of the let { ... } in ... syntax to allow for arbitrary indentation, I had a way to write ASTs that was almost just the toy imperative language itself, with some boilerplate above, below and on the left.
The TA grading my assignment, though, was not amused, and wrote, “This is not Haskell!”
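For the curious, a minimal sketch of that trick might look like the following. It is not the code from the assignment: the AST is a simplified, hypothetical one with Integer literals, Neg mirrors the unary - expr of the grammar in the question, and only the Num instance is shown (the comparison operators are left out).

```haskell
-- Hypothetical, simplified AST with plain Integer literals.
data Expr
  = Lit Integer
  | Neg Expr
  | Sum Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- The arithmetic operators build syntax trees instead of computing numbers.
instance Num Expr where
  (+)         = Sum
  (*)         = Mul
  negate      = Neg
  fromInteger = Lit
  abs _       = error "abs is not represented in this AST"
  signum _    = error "signum is not represented in this AST"

-- Ordinary Haskell operator precedence decides the shape of the tree.
sample :: Expr
sample = 1 + 2 * 3
```

Here sample is Sum (Lit 1) (Mul (Lit 2) (Lit 3)). Because (-) is not defined explicitly, the class default x - y = x + negate y applies, so 1 - 2 :: Expr becomes Sum (Lit 1) (Neg (Lit 2)).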

Haskell AST with recursive types

I'm currently trying to build a lambda calculus solver, and I'm having a slight problem with constructing the AST. A lambda calculus term is inductively defined as:
1) A variable
2) A lambda, a variable, a dot, and a lambda expression.
3) A bracket, a lambda expression, a lambda expression and a bracket.
What I would like to do (and at first tried) is this:
data Expr =
    Variable
  | Abstract Variable Expr
  | Application Expr Expr
Now obviously this doesn't work, since Variable is not a type, and Abstract Variable Expr expects types. So my hacky solution to this is to have:
type Variable = String
data Expr =
    Atomic Variable
  | Abstract Variable Expr
  | Application Expr Expr
Now this is really annoying since I don't like the Atomic Variable on its own, but Abstract taking a string rather than an expr. Is there any way I can make this more elegant, and do it like the first solution?
Your first solution is just an erroneous definition without meaning. Variable is not a type there, it's a nullary value constructor. You can't refer to Variable in a type definition much like you can't refer to any value, like True, False or 100.
The second solution is in fact the direct translation of something we could write in BNF:
var ::= <string>
term ::= λ <var>. <term> | <term> <term> | <var>
And thus there is nothing wrong with it.
What you want exactly is to have some type like
data Expr
  = Atomic Variable
  | Abstract Expr Expr
  | Application Expr Expr
but constrain the first Expr in the Abstract constructor to be only Atomic. There is no straightforward way to do this in Haskell because a value of a type can be created by any constructor of that type. So the only approach is to make a separate data type, or a type alias for an existing type (like your Variable type alias), and move all common logic into it. Your solution with Variable seems perfectly fine to me.
However, you can use more advanced Haskell features to achieve your goal in a different way. You can take inspiration from the glambda package, which uses GADTs to create a typed lambda calculus. Also see this answer: https://stackoverflow.com/a/39931015/2900502
I can come up with the following solution to achieve your minimal goal (if you only want to constrain the first argument of Abstract):
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
data AtomicT
data AbstractT
data ApplicationT
data Expr :: * -> * where
    Atomic      :: String -> Expr AtomicT
    Abstract    :: Expr AtomicT -> Expr a -> Expr AbstractT
    Application :: Expr a -> Expr b -> Expr ApplicationT
And the following example works fine:
ex1 :: Expr AbstractT
ex1 = Abstract (Atomic "x") (Atomic "x")
But this example won't compile because of type mismatch:
ex2 :: Expr AbstractT
ex2 = Abstract ex1 ex1
