I've specified precedence and associativity like this:
expr
: LB expr RB
| <assoc=right> (SUB | NOT) expr
| expr op=(MULTI | DIV | MOD | AND) expr
| expr op=(ADD | SUB | OR) expr
| expr comparator expr
| expr op=(ANDTHEN | ORELSE) expr
| INTLIT
;
But it also accepts ( 1 and 2 ). I want an expression to be either integer-only (using only + - * /) or boolean-only (using AND and OR). How can I do that?
That's not a precedence issue, it's a type issue and should thus be handled by the type checker.
You might be tempted to separate your grammar into rules such as integerExpression and booleanExpression, and it's certainly possible to create a grammar that rejects 1 and 2 this way. But this approach makes your grammar needlessly complicated and will reach its limits once your language becomes even slightly more powerful. When you introduce variables, for example, you'd want to allow a and b if and only if a and b are both Boolean variables, but that's not something you can tell just by looking at the expression. So in that scenario (and many others), you'll need Java (or whichever language you're using) code to check the types anyway.
So in conclusion, you should leave your grammar as-is and reject 1 and 2 in the type checker.
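To make the idea concrete, here is a minimal sketch of such a check, written in Haskell rather than in an ANTLR target language. The AST, the Type type and the names typeOf and binary are all hypothetical, and only a few of the grammar's operators are modelled; a real checker would walk the parse tree that ANTLR produces.

```haskell
-- Hypothetical AST for a fragment of the grammar above; all names are illustrative.
data Type = TInt | TBool deriving (Eq, Show)

data Expr
  = IntLit Int
  | Add Expr Expr   -- expr ADD expr
  | And Expr Expr   -- expr AND expr
  | Less Expr Expr  -- one comparator; it yields a boolean
  deriving Show

-- Return the type of an expression, or a message describing the first error.
typeOf :: Expr -> Either String Type
typeOf (IntLit _) = Right TInt
typeOf (Add l r)  = binary "+" TInt TInt l r
typeOf (And l r)  = binary "and" TBool TBool l r
typeOf (Less l r) = binary "<" TInt TBool l r

-- Check that both operands have the expected type, then yield the result type.
binary :: String -> Type -> Type -> Expr -> Expr -> Either String Type
binary name operand result l r = do
  tl <- typeOf l
  tr <- typeOf r
  if tl == operand && tr == operand
    then Right result
    else Left ("operands of '" ++ name ++ "' must have type " ++ show operand)
```

With this, typeOf (And (IntLit 1) (IntLit 2)) is a Left value carrying an error message, while the grammar itself stays unchanged.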
Related
I am working with ANTLR4 at the moment, and I am confused by one example. I have to calculate the value of expressions in prefix notation, which implies the following grammar:
ADD expr expr OR
SUB expr expr OR
MUL expr expr OR
DIV expr expr OR
Integer OR
Double
(also every expression needs to have ';' at the end of it).
I have written the grammar and regular expressions for this, but my professor's test example is ADD 1 2 SUB 1;, which shouldn't even belong to this grammar, right? The SUB operation doesn't have two expressions to its right. I would be grateful if someone could confirm this for me.
PS. I didn't post the code because it works for the other examples; only this one reports the error "mismatched input on SUB ';'"
If your expr rule is
expr : 'ADD' expr expr
| 'SUB' expr expr
| 'MUL' expr expr
| 'DIV' expr expr
| Integer
| Double
;
then yes, ADD 1 2 SUB 1 would not match it.
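To see why, here is a small hand-written recursive-descent sketch of the same rule in Haskell. It is not what ANTLR generates; the names tokenize, parseExpr and parseStatement are made up for illustration, and Double literals are left out for brevity.

```haskell
import Data.Char (isDigit)

-- Illustrative AST; doubles are omitted to keep the sketch short.
data Expr = Num Int | Bin String Expr Expr deriving (Eq, Show)

-- Split on whitespace, treating ';' as a token of its own.
tokenize :: String -> [String]
tokenize = words . concatMap (\c -> if c == ';' then " ; " else [c])

-- Parse one expr from the front of the token list, returning the leftover tokens.
parseExpr :: [String] -> Maybe (Expr, [String])
parseExpr (op : ts)
  | op `elem` ["ADD", "SUB", "MUL", "DIV"] = do
      (l, ts1) <- parseExpr ts
      (r, ts2) <- parseExpr ts1
      Just (Bin op l r, ts2)
  | not (null op) && all isDigit op = Just (Num (read op), ts)
parseExpr _ = Nothing

-- A statement is exactly one expr followed by ';' and nothing else.
parseStatement :: String -> Maybe Expr
parseStatement src = case parseExpr (tokenize src) of
  Just (e, [";"]) -> Just e
  _               -> Nothing
```

For ADD 1 2 SUB 1; the first expr is complete after ADD 1 2, so the leftover tokens are SUB 1 ; instead of a lone ;, and parseStatement returns Nothing. An input such as ADD 1 SUB 2 1; parses fine, because the SUB expression is the second operand of ADD.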
I am trying to build an interpreter for a C-like language in Haskell. I have so far written and combined small monadic parsers following this paper, hence so far I can generate an AST representation of a program. I defined the abstract syntax as follows:
data LangType = TypeReal | TypeInt | TypeBool | TypeString deriving (Show)
type Id = String
data AddOp = Plus | Minus | Or deriving (Show)
data RelOp = LT | GT | LTE | GTE | NEq | Eq deriving (Show)
data MultOp = Mult | Div | And deriving (Show)
data UnOp = UnMinus | UnNot deriving (Show)
data BinOp = Rel RelOp | Mul MultOp | Add AddOp deriving (Show)
data AST = Program [Statement] deriving (Show)
data Block = StatsBlock [Statement] deriving (Show)
data Statement = VariableDecl Id LangType Expression
               | Assignment Id Expression
               | PrintStatement Expression
               | IfStatement Expression Block Block
               | WhileStatement Expression Block
               | ReturnStatement Expression
               | FunctionDecl Id LangType FormalParams Block
               | BlockStatement Block
               deriving (Show)
data Expression = RealLiteral Double
                | IntLiteral Int
                | BoolLiteral Bool
                | StringLiteral String
                | Unary UnOp Expression
                | Binary BinOp Expression Expression
                | FuncCall Id [Expression]
                | Var Id
                deriving (Show)
data FormalParams = IdentifierType [(Id, LangType)] deriving (Show)
I have yet to type-check my AST and build the interpreter to evaluate expressions and execute statements. My questions are the following:
Does the abstract syntax make sense/can it be improved? In particular, I've been running into a recurring problem. In the EBNF of this language I'm trying to interpret, a WhileStatement consists of an Expression (which I have no problem with) and a Block, which in the EBNF happens to be a Statement just like WhileStatement, and so I cannot refer to Block from my WhileStatement. I've worked around this by defining a separate data type Block (as is shown in the above code), but am not sure if this is the best way. I'm finding defining data types quite confusing.
Since I have to type-check my AST and evaluate/execute, do I implement these separately or can I define some function which does them both at the same time?
Any general tips on how I should go about type-checking and interpreting the language would also be greatly appreciated. Since the language has variable and function declarations, I am thinking of implementing some sort of symbol table, although again I am struggling with defining the type for this. So far I've tried
import qualified Data.Map as M
data Value = RealLit Double | IntLit Int | BoolLit Bool | StringLit String | Func [FormalParams] String
deriving (Show)
type TermEnv = M.Map String Value
but I'm unsure whether I should be using my LangType from before.
Addressing your question in the comments about how to proceed with type checking and evaluation.
If you don't have to do inference or polymorphism, type checking is pretty simple. Also type checking and evaluation mirror each other pretty closely in these conditions.
Begin by defining a monad with the features you need. For a type checker, you will need
A type environment, i.e. a Reader (Map Id LangType) component, to keep track of the types of local variables.
An error ability, e.g. Except String.
So you could define a monad like
type TypeEnv = Map.Map Id LangType
type TC = ReaderT TypeEnv (Except String)
And then your typechecker function would look like:
typeCheck :: AST -> TC ()
(We return () because there is nothing interesting to be gained from the typechecking process besides knowing whether the program passed.)
This will be largely structurally inductive, e.g.
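typeCheck (Program stmts) = -- typeCheckStmt each statement*
typeCheckStmt :: Statement -> TC ()
typeCheckStmt (VariableDecl v ty defn) = ...
typeCheckStmt (Assignment v exp) = do
    mt <- asks (Map.lookup v)
    -- Report an undeclared variable as an error instead of relying on a failing pattern match
    t <- maybe (throwError ("Undeclared variable: " ++ v)) pure mt
    t' <- typeCheckExp exp
    -- Comparing types with /= requires LangType to derive Eq
    when (t /= t') $ throwError "Types do not match"
...
-- Return the type of a composite expression to use elsewhere
typeCheckExp :: Expression -> TC LangType
...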
There will be a bit of finesse required to make sure that variable declarations in a list of statements can be seen by later statements in the same list. I will leave that as a puzzle. (Hint: see the local function to provide an updated environment within a scope.)
Evaluation is a similar story. You're correct that you need a type of run-time values. Without some cleverness that you are probably not ready for (and is of questionable utility even if you were) there is not really a way to use LangType in Value, so you're on the right track.
You will need a monad that supports keeping track of the values of variables and the ability to do whatever else your language needs. To start I recommend
type Eval = StateT (Map Id Value) IO
and proceed structurally as before. There will again be some finesse required when handling variable scopes and shadowing, and you may need to change the environment type or mess with your Value type a bit to accommodate these subtleties, but thinking through these problems is important. Start simple; don't try to implement typechecking and evaluation for your whole language at once.
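As one possible starting point, here is a sketch of an evaluator for expressions only. It deliberately swaps the StateT-over-IO monad suggested above for an explicit environment argument and Either String, so that it depends on nothing beyond Data.Map. The trimmed-down Expression and BinOp types, the Value constructors, and the names evalExp and applyOp are assumptions made for this example, not the question's full definitions.

```haskell
import qualified Data.Map as Map

type Id = String

-- Trimmed-down stand-ins for the question's types, just enough for the sketch.
data BinOp = Plus | Less | And deriving Show

data Expression
  = IntLiteral Int
  | BoolLiteral Bool
  | Binary BinOp Expression Expression
  | Var Id
  deriving Show

-- Run-time values, kept separate from LangType.
data Value = IntVal Int | BoolVal Bool deriving (Eq, Show)

type Env = Map.Map Id Value

-- Evaluate an expression against the current variable bindings.
evalExp :: Env -> Expression -> Either String Value
evalExp _   (IntLiteral n)  = Right (IntVal n)
evalExp _   (BoolLiteral b) = Right (BoolVal b)
evalExp env (Var v) =
  maybe (Left ("unbound variable: " ++ v)) Right (Map.lookup v env)
evalExp env (Binary op l r) = do
  lv <- evalExp env l
  rv <- evalExp env r
  applyOp op lv rv

-- Apply an operator to two already-evaluated operands.
applyOp :: BinOp -> Value -> Value -> Either String Value
applyOp Plus (IntVal a)  (IntVal b)  = Right (IntVal (a + b))
applyOp Less (IntVal a)  (IntVal b)  = Right (BoolVal (a < b))
applyOp And  (BoolVal a) (BoolVal b) = Right (BoolVal (a && b))
applyOp op   l           r           =
  Left ("cannot apply " ++ show op ++ " to " ++ show l ++ " and " ++ show r)
```

The last applyOp clause would never fire for a program that passed the type checker. Statements, which update the environment, are where a state monad starts to pay off.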
I've recently picked up Haskell at uni and I'm working my way through a set of exercises, here's a snippet of one that I can't make sense of:
"Consider the following grammar for a simple, prefix calculator language:"
num ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
int ::= num | num int
expr ::= int | - expr | + expr expr | * expr expr
I'm confused as to how to translate this into Haskell syntax (I'm a complete beginner in both Haskell and functional programming, please be gentle).
I suspect that num, int and expr are all, supposedly, types/values that can be declared using data or type and that they impose constraints on the calculator. However I can't make sense of either: How do I declare type or data(not a variable) for fixed values, namely 0-9? Also, how can I put symbols like - or + in a declaration?
Don't confuse a string in the grammar for the AST that represents it. Compare the string
"+ + 3 4 5"
which is a string in the grammar you've been given with
Plus (Plus (Literal 3) (Literal 4)) (Literal 5)
which would be a sensible Haskell value for the AST that String could get parsed to.
How do I declare type or data(not a variable) for fixed values, namely 0-9?
You can define a type, like
data Digit = Zero | One | Two | Three | Four | Five | Six | Seven | Eight | Nine deriving (Eq, Show)
This represents the num in your problem. Obviously we cannot use 0, 1, 2, 3, ... since they are already interpreted as numbers in Haskell.
Then, you can define
data Number = Single Digit | Many Digit Number deriving (Eq, Show)
which is equivalent to int in your problem. This type represents one (Single ...) or more (Many ...) digits, which together make up one decimal number. For example, with these data types the number 361 would be Many Three (Many Six (Single One)).
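As a quick check of that encoding, the following sketch converts a Number back into an ordinary Int. The helper names digits and numberToInt are made up here, and the sketch assumes Enum is added to the deriving list of Digit so that fromEnum maps Zero..Nine to 0..9.

```haskell
-- Same types as above, with Enum added to Digit (an assumption of this sketch).
data Digit = Zero | One | Two | Three | Four | Five | Six | Seven | Eight | Nine
  deriving (Eq, Show, Enum)

data Number = Single Digit | Many Digit Number deriving (Eq, Show)

-- List the digits of a Number, most significant first.
digits :: Number -> [Digit]
digits (Single d)  = [d]
digits (Many d ds) = d : digits ds

-- Fold the digits into an Int: each step shifts the accumulator one decimal place.
numberToInt :: Number -> Int
numberToInt = foldl (\acc d -> acc * 10 + fromEnum d) 0 . digits
```

Here numberToInt (Many Three (Many Six (Single One))) evaluates to 361.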
Also, how can I put symbols like - or + in a declaration?
There is no way to put those symbols in type or data declarations. You can, however, use names for the operations, like Sum, Neg and Mul. Note that - takes a single operand in the grammar (- expr), so its constructor takes one argument. The expr of the grammar of your problem would then translate to
data Expr = Lit Number
          | Neg Expr
          | Sum Expr Expr
          | Mul Expr Expr
          deriving (Eq, Show)
If we had the string "+ - 25 13", which represents an expression in the prefix calculator language of your problem, it would be parsed to Sum (Neg (Lit (Many Two (Single Five)))) (Lit (Many One (Single Three))).
If it is just an exercise about modeling data (without code), the answer consists of adding constructor names to your grammar (and changing the literal digits to names). Something like
import Prelude hiding (Int, Num)  -- the names below would otherwise clash with the Prelude's
data Num = Zero | One | Two | Three | Four | Five
         | Six | Seven | Eight | Nine
data Int = Single Num | Multiple Num Int
data Exp = ExpInt Int | ExpMinus Exp | ExpPlus Exp Exp
         | ExpMul Exp Exp
From that, you can write all sorts of code to parse and evaluate expressions.
Years ago, I got clever, and I declared my AST type an instance of Num, Eq and Ord, then defined the mathematical and comparison operators for AST expressions, so that expr1 + expr2 would yield a valid AST. Using sevenj’s declarations, this would be written like (+) x y = Sum x y, where the right-hand side is the constructor of an AST expression. For brevity, one = Lit One and two = Lit Two. Then you might write one + one == two and the operators would generate your AST with the correct precedence. Between that and abuse of the let { ... } in ... syntax to allow for arbitrary indentation, I had a way to write ASTs that was almost just the toy imperative language itself, with some boilerplate above, below and on the left.
The TA grading my assignment, though, was not amused, and wrote, “This is not Haskell!”
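For the curious, the trick looks roughly like the sketch below. It uses plain Integer literals instead of the Digit-based Number type so that fromInteger has something sensible to build. The Expr type here is a simplified stand-in and not the original assignment code.

```haskell
-- Illustrative AST over plain Integer literals rather than the Digit-based Number.
data Expr
  = Lit Integer
  | Neg Expr
  | Sum Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- Arithmetic operators build AST nodes instead of computing numbers.
instance Num Expr where
  (+)         = Sum
  (*)         = Mul
  negate      = Neg
  fromInteger = Lit
  abs         = error "abs: not part of this toy language"
  signum      = error "signum: not part of this toy language"

-- Haskell's own operator precedence shapes the tree: * binds tighter than +.
example :: Expr
example = 1 + 2 * 3
```

Here example is Sum (Lit 1) (Mul (Lit 2) (Lit 3)): the numeric literals go through fromInteger, and the usual precedence of * over + decides the nesting.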
I'm currently trying to build a lambda calculus solver, and I'm having a slight problem with constructing the AST. A lambda calculus term is inductively defined as:
1) A variable
2) A lambda, a variable, a dot, and a lambda expression.
3) A bracket, a lambda expression, a lambda expression and a bracket.
What I would like to do (and at first tried) is this:
data Expr =
    Variable
  | Abstract Variable Expr
  | Application Expr Expr
Now obviously this doesn't work, since Variable is not a type, and Abstract Variable Expr expects types. So my hacky solution to this is to have:
type Variable = String
data Expr =
    Atomic Variable
  | Abstract Variable Expr
  | Application Expr Expr
Now this is really annoying since I don't like the Atomic Variable on its own, but Abstract taking a string rather than an expr. Is there any way I can make this more elegant, and do it like the first solution?
Your first solution is just an erroneous definition without meaning. Variable is not a type there, it's a nullary value constructor. You can't refer to Variable in a type definition much like you can't refer to any value, like True, False or 100.
The second solution is in fact the direct translation of something we could write in BNF:
var ::= <string>
term ::= λ <var>. <term> | <term> <term> | <var>
And thus there is nothing wrong with it.
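One practical benefit of that shape is that functions over terms can use the bound name directly, without first unwrapping an Expr. A small sketch, where freeVars and pretty are illustrative helpers that are not part of the question:

```haskell
import Data.List (union)

type Variable = String

data Expr
  = Atomic Variable
  | Abstract Variable Expr
  | Application Expr Expr
  deriving (Eq, Show)

-- Variables that occur in a term without an enclosing binder.
freeVars :: Expr -> [Variable]
freeVars (Atomic v)        = [v]
freeVars (Abstract v body) = filter (/= v) (freeVars body)
freeVars (Application f a) = freeVars f `union` freeVars a

-- Render a term, writing the lambda as a backslash and bracketing applications.
pretty :: Expr -> String
pretty (Atomic v)        = v
pretty (Abstract v body) = "\\" ++ v ++ "." ++ pretty body
pretty (Application f a) = "(" ++ pretty f ++ " " ++ pretty a ++ ")"
```

In the Abstract cases, v is already a Variable. If the binder were an Expr, every such function would need an extra case for binders that are not variables.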
What you really want is a type like
data Expr
    = Atomic Variable
    | Abstract Expr Expr
    | Application Expr Expr
but with the first Expr in the Abstract constructor constrained to be Atomic only. There is no straightforward way to do this in Haskell, because a value of a type can be built with any constructor of that type. So the only approach is to make a separate data type, or a type alias for an existing type (like your Variable type alias), and move all the common logic into it. Your solution with Variable seems perfectly fine to me.
However, you can use some more advanced Haskell features to achieve your goal in a different way. You can take inspiration from the glambda package, which uses GADTs to create a typed lambda calculus. Also see this answer: https://stackoverflow.com/a/39931015/2900502
I can come up with the following solution to achieve your minimal goal (if you only want to constrain the first argument of Abstract):
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
data AtomicT
data AbstractT
data ApplicationT
data Expr :: * -> * where
    Atomic      :: String -> Expr AtomicT
    Abstract    :: Expr AtomicT -> Expr a -> Expr AbstractT
    Application :: Expr a -> Expr b -> Expr ApplicationT
The following example compiles fine:
ex1 :: Expr AbstractT
ex1 = Abstract (Atomic "x") (Atomic "x")
But this example won't compile because of a type mismatch (the first argument of Abstract has type Expr AbstractT, not Expr AtomicT):
ex2 :: Expr AbstractT
ex2 = Abstract ex1 ex1
I am trying to teach myself F# by porting some Haskell code.
Specifically, I am trying to port the Countdown Problem shown here
The Haskell Code is listed here
I am trying to create the following Haskell types in F#:
data Op = Add | Sub | Mul | Div
data Expr = Val Int | App Op Expr Expr
In F# I think Op type is defined as follows:
type Op = | Add | Sub | Mul | Div
I am having issues with the Expr type.
How does one create a recursive type? From this SO question it looks like one can not create the Expr type in F#.
Also, what is the F# equivalent of the 'App' constructor, which applies the Op type to the Expr type?
If it is not possible to directly port this code, could someone suggest an alternative data structure.
It's not a problem to define recursive types like this; what you can't do is create higher-kinded types, which are parameterized over type constructors (and which are not needed for this example). With any union type definition, you need to separate the constructor name from the constructor parameters with the keyword "of", and the parameters themselves should take the form of a tuple type (i.e. they should be separated by asterisks):
type Op = Add | Sub | Mul | Div
type Expr = Val of int | App of Op * Expr * Expr
@kvb posted the right answer.
See also "F# forward type declarations" for how to do things when you do need mutually recursive types.