Cannot generate Xtext artifacts for grammar which uses parentheses and cross-references at once - dsl

I'm trying to generate a DSL from this grammar:
grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
Program:
"print" expression=Expression "where" constant=Constant |
"print" expression=Expression;
Expression:
Add;
Add returns Expression:
Primary({Add.expression1=current} "+" expression2=Primary)*;
Primary returns Expression:
ExpressionParentheses | Number | ConstUsage;
Number returns Expression:
value=INT;
Constant:
name=ID "=" number=Number;
ConstUsage returns Expression:
name=[Constant];
ExpressionParentheses returns Expression:
"(" Add ")";
But generating Xtext artifacts in Eclipse always produces an error. The error occurs whenever I use both ExpressionParentheses and ConstUsage in the Primary rule. When I put only one of them there, either ConstUsage or ExpressionParentheses, everything works fine. What could be the problem with my grammar?

The following grammar works fine:
Program:
"print" expression=Expression ("where" constant=Constant)?
;
Expression:
Add;
Add returns Expression:
Primary({Add.expression1=current} "+" expression2=Primary)*;
Primary returns Expression:
Number | ConstUsage | "("Add")";
Number returns Expression:
value=INT;
Constant:
name=ID "=" number=Number;
ConstUsage returns Expression:
name=[Constant];

Related

Solving this Shift/Reduce Conflict in Happy/Bison

I am making a simple parser in Happy (Bison equivalent for Haskell) and I stumbled upon a shift/reduce conflict in these rules:
ClassBlock :
"{" ClassAttributes ClassConstructor ClassFunctions "}" {ClassBlock $2 $3 $4}
ClassAttributes :
{- empty -} { ClassAttributesEmpty }
| ClassAttributes ClassAttribute {ClassAttributes $1 $2}
ClassAttribute :
"[+]" Variable {ClassAttributePublic $2 }
| "[-]" Variable {ClassAttributePrivate $2 }
ClassFunctions :
{- empty -} { ClassFunctionsEmpty }
| ClassFunctions ClassFunction {ClassFunctions $1 $2}
ClassFunction :
"[+]" Function {ClassFunctionPublic $2}
| "[-]" Function {ClassFunctionPrivate $2}
ClassConstructor :
{- empty -} { ClassConstructorEmpty }
| TypeFuncParams var_identifier Params Block {ClassConstructor $1 $2 $3 $4}
TypeFuncParams :
Primitive ClosingBracketsNoIdentifier { TypeFuncParamsPrimitive $1 $2}
| class_identifier ClosingBracketsNoIdentifier { TypeFuncParamsClassId $1 $2}
| ListType {TypeFuncParamsList $1}
The info file states the shift/reduce conflict:
ClassBlock -> "{" ClassAttributes . ClassConstructor ClassFunctions "}" (rule 52)
ClassAttributes -> ClassAttributes . ClassAttribute (rule 54)
"[+]" shift, and enter state 85
(reduce using rule 61)
"[-]" shift, and enter state 86
(reduce using rule 61)
Rule 61 is this one:
ClassConstructor :
{- empty -} { ClassConstructorEmpty }
I am not really sure how to solve this problem. I tried using precedence rules to silence the warning, but it didn't work out as I expected.
Below is a simplified grammar which exhibits the same problem.
To construct it, I removed all actions and the prefix "Class" from all nonterminal names.
I also simplified most of the rules. I did this as an illustration of how you can construct a minimal, complete, verifiable example, as suggested by the StackOverflow guidelines, which makes it easier to focus on the problem while still permitting an actual trial. (I used bison, not happy, but the syntax is very similar.)
Block : "{" Attributes Constructor Functions "}"
Attributes : {- empty -} | Attributes Attribute
Constructor: {- empty -} | "constructor"
Functions : {- empty -} | Functions Function
Attribute : "[+]" "attribute"
Function : "[+]" "function"
Now, let's play parser, and suppose that we've (somehow) identified a prefix which could match Attributes. (Attributes can match the empty string, so we could be right at the beginning of the input.) And suppose the next token is [+].
At this point, we cannot tell if the [+] will later turn out to be the beginning of an Attribute or if it is the start of a Function which follows an empty Constructor. However, we need to know that in order to continue the parse.
If we've finished with the Attributes and are about to start on the Functions, this is the moment where we have to reduce the empty nonterminal Constructor. Unless we do that now, we cannot then go on to recognize a Function. On the other hand, if we haven't seen the last Attribute but we do reduce a Constructor, then the parse will eventually fail because the next Attribute cannot follow the Constructor we just reduced.
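The dilemma can be made concrete with a small Python sketch. It is not part of the original answer, and the helper name `needed_action` is hypothetical. It compares two inputs for the simplified grammar that agree on the lookahead token `[+]` but require different parser actions, which is the situation a parser with a single token of lookahead cannot resolve.

```python
# Two blocks for the simplified grammar. In both, the token after "{" is
# "[+]", so a parser with one token of lookahead sees the same thing.
ATTRIBUTE_FIRST = ["{", "[+]", "attribute", "}"]
FUNCTION_FIRST = ["{", "[+]", "function", "}"]


def needed_action(tokens, pos):
    """Return the action the parser must take when tokens[pos] is "[+]".

    Hypothetical helper: it cheats by peeking at tokens[pos + 1], i.e. it
    uses two tokens of lookahead, which a one-token-lookahead parser lacks.
    """
    if tokens[pos] != "[+]":
        raise ValueError("expected '[+]' at position %d" % pos)
    if tokens[pos + 1] == "attribute":
        return "shift"  # we are still inside Attributes
    return "reduce empty Constructor"  # Attributes are over, Functions begin


# Same lookahead token, different required actions: hence the conflict.
print(ATTRIBUTE_FIRST[1] == FUNCTION_FIRST[1])  # True
print(needed_action(ATTRIBUTE_FIRST, 1))        # shift
print(needed_action(FUNCTION_FIRST, 1))         # reduce empty Constructor
```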
In cases like this, it is often useful to remove the empty production by factoring the options into the places where the non-terminal is used:
Block : "{" Attributes "constructor" Functions "}"
| "{" Attributes Functions "}"
Attributes : {- empty -} | Attributes Attribute
Functions : {- empty -} | Functions Function
Attribute : "[+]" "attribute"
Function : "[+]" "function"
But just removing Constructor isn't enough here. In order to start parsing the list of Functions, we first need to reduce an empty Functions to provide the base case of the Functions recursion, so we still need to guess where the Functions start in order to find the correct parse. And if we wrote the two lists as right-recursions instead of left-recursions, we'd instead need an empty Attributes to terminate the Attributes recursion.
What we could do, in this particular case, is use a cunning combination of left- and right-recursion:
Block : "{" Attributes "constructor" Functions "}"
| "{" Attributes Functions "}"
Attributes : {- empty -} | Attributes Attribute
Functions : {- empty -} | Function Functions
Attribute : "[+]" "attribute"
Function : "[+]" "function"
By making the first list left-recursive and the second list right-recursive, we avoid the need to reduce an empty non-terminal between the two lists. That, in turn, allows the parser to decide whether a phrase was an Attribute or a Function after it has seen the phrase, at which point it is no longer necessary to consult an oracle.
However, that solution is not very pretty for a number of reasons, not the least of which is that it only works for the concatenation of two optional lists. If we wanted to add another list of a different kind of item which could also start with the [+] token, a different solution would be needed.
The simplest one, which is used by a number of languages, is to allow the programmer to intermingle the various list elements. You might consider that bad style, but it is not always necessary to castigate bad style by making it a syntax error.
A simple solution would be:
Block : "{" Things "}"
Things : {- empty -} | Things Attribute | Things Function | Things Constructor
Attribute : "[+]" "attribute"
Constructor: "constructor"
Function : "[+]" "function"
but that doesn't limit a Block to at most one Constructor, which seems to be a syntactic requirement. However, as long as Constructor cannot start with a [+], you could implement the "at most one Constructor" limitation with:
Block : "{" Things Constructor Things "}"
| "{" Things "}"
Things : {- empty -} | Things Attribute | Things Function
Attribute : "[+]" "attribute"
Constructor: "constructor"
Function : "[+]" "function"
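As a rough illustration of what this last grammar accepts, here is a hand-written Python recognizer. It is only a sketch under the same simplifications as the grammar above (the tokens `attribute`, `function`, and `constructor` stand in for the real phrases), and the function name `parse_block` is made up for this example.

```python
def parse_block(tokens):
    """Recognize "{" Things [Constructor Things] "}" over a list of tokens.

    Attributes and functions may be intermingled freely; at most one
    constructor is allowed. Returns how many items of each kind were seen.
    """
    if len(tokens) < 2 or tokens[0] != "{" or tokens[-1] != "}":
        raise ValueError("block must be wrapped in braces")
    body = tokens[1:-1]
    counts = {"attribute": 0, "constructor": 0, "function": 0}
    i = 0
    while i < len(body):
        if body[i] == "constructor":
            if counts["constructor"] == 1:
                raise ValueError("at most one constructor is allowed")
            counts["constructor"] += 1
            i += 1
        elif (body[i] == "[+]" and i + 1 < len(body)
              and body[i + 1] in ("attribute", "function")):
            # The decision is made only after the whole phrase has been seen.
            counts[body[i + 1]] += 1
            i += 2
        else:
            raise ValueError("unexpected token %r" % body[i])
    return counts


print(parse_block(["{", "[+]", "function", "[+]", "attribute",
                   "constructor", "[+]", "function", "}"]))
# {'attribute': 1, 'constructor': 1, 'function': 2}
```

Because each item is classified after its second token has been read, the recognizer never has to guess where one list ends and the next begins.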

Range Specification in Xtext

I am new to Xtext and want to define a language element for specifying ranges of values. Examples: [1-2] or ]0.1-0.3[
I have the following rule for this purpose:
Range returns Expression:
Atomic (leftBracket=('[' | ']') left=Atomic '-' right=Atomic rightBracket=('[' | ']'))*;
Atomic here refers basically to the primitive float and int types. I have two problems:
I get the warning "The assigned value of feature 'leftBracket' will possibly override itself because it is used inside of a loop" and the same for rightBracket. What does this mean in this context?
The expression works only in standalone manner (in one row), but not in connection with the rest of the language elements. E.g. in connection with the element right before:
Comparison returns Expression:
Range ({Comparison.left=current} op=(">="|"<="|">"|"<"|"=>"|"<=>"|"xor"|"=") right=Range)*;
That is, if such an operator precedes the Range element in my input in the second Eclipse window, I get the error "No viable alternative at input".
Any ideas? Thanks for any hints and advice!
Some more information:
I took this example and changed it: https://github.com/LorenzoBettini/packtpub-xtext-book-examples/blob/master/org.example.expressions/src/org/example/expressions/Expressions.xtext
Full code:
grammar org.example.expressions.Expressions with org.eclipse.xtext.common.Terminals
generate expressions "http://www.example.org/expressions/Expressions"
ExpressionsModel:
expressions+=Expression*;
Expression: Or;
Or returns Expression:
And ({Or.left=current} "||" right=And)*
;
And returns Expression:
Equality ({And.left=current} "&&" right=Equality)*
;
Equality returns Expression:
Comparison (
{Equality.left=current} op=("==")
right=Comparison
)*
;
Comparison returns Expression:
Range ({Comparison.left=current} op=(">="|"<="|">"|"<"|"=>"|"<=>"|"xor"|"=") right=Range)*
;
Range returns Expression:
Primary (leftBracket=('[' | ']') left=Primary '-' right=Primary rightBracket=('[' | ']'))*
;
Primary returns Expression:
'(' Expression ')' |
{Not} "!" expression=Primary |
Atomic
;
Atomic returns Expression:
{IntConstant} value=INT |
{StringConstant} value=STRING |
{BoolConstant} value=('true'|'false')
;
Example where it fails: (1 = [1-2]). However, [1-2] on a line of its own works fine.
I cannot really follow you, but your grammar looks strange to me.
Model:
(expressions+=Comparison ";")*;
Comparison returns Expression:
Range ({Comparison.left=current} op=(">=" | "<=" | ">" | "<" | "=>" | "<=>" | "xor" | "=") right=Range)*;
Range:
(leftBracket=('[' | ']') left=Atomic '-' right=Atomic rightBracket=('[' | ']'))
|
Atomic;
Atomic:
value=INT;
works fine with
[1-2];
]3-5[;
[1-4[ < ]1-6];
6;
1 < 2;
So can you give some more context?
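The shape of the Range rule in this answer (either one bracketed pair of bounds or a plain Atomic, with no loop around the bracket assignments) can be mimicked in a few lines of Python. This is only an illustrative sketch: the function name `parse_range` and the dictionary keys are chosen to mirror the grammar's feature names, and it handles integers only, like the Atomic rule above.

```python
import re

# One bracket, an integer, a dash, an integer, one bracket.
RANGE = re.compile(r"([\[\]])\s*(\d+)\s*-\s*(\d+)\s*([\[\]])")


def parse_range(text):
    """Parse either a bracketed range such as "]3-5[" or a plain integer."""
    text = text.strip()
    match = RANGE.fullmatch(text)
    if match:
        left_bracket, left, right, right_bracket = match.groups()
        # Each feature is set exactly once per range, so nothing is overwritten.
        return {
            "leftBracket": left_bracket,
            "left": int(left),
            "right": int(right),
            "rightBracket": right_bracket,
        }
    if text.isdecimal():
        return {"value": int(text)}
    raise ValueError("not a range or an integer: %r" % text)


print(parse_range("]3-5["))
# {'leftBracket': ']', 'left': 3, 'right': 5, 'rightBracket': '['}
print(parse_range("6"))
# {'value': 6}
```

In the question's version the bracket assignments sit inside a `(...)*` loop, so a single-valued feature such as leftBracket could be assigned once per iteration and only the last value would survive, which is most likely what the warning refers to.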

Dynamic operator precedence and associativity in ANTLR4?

I've been working on an ANTLR4 grammar for Z Notation (ISO UTF version), and the specification calls for a lex phase and then a two-phase parse.
You first lex the input into a bunch of NAME (or DECORWORD) tokens, then you parse the resulting tokens against the operatorTemplate rules in the spec's parser grammar and replace the appropriate tokens, and finally you parse the modified token stream to get the AST.
I have the above working, but I can't figure out how to set the precedence and associativity of the parser rules dynamically, so the parse trees are wrong.
The operator syntax looks like this (the numbers are precedences):
-generic 5 rightassoc (_ → _)
-function 65 rightassoc (_ ◁ _)
I don't see any API to set the associativity on a rule, so I tried semantic predicates, something like:
expression
: {ZSupport.isLeftAssociative()}? expression I expression
| <assoc=right> expression I expression
;
or
expression
: {ZSupport.isLeftAssociative()}? expression I expression
| <assoc=right> {ZSupport.isRightAssociative()}? expression I expression
;
but then I get "The following sets of rules are mutually left-recursive [expression]".
Can this be done?
I was able to accomplish this by moving the semantic predicate:
expression
: expression {ZSupport.isLeftAssociative()}? I expression
| <assoc=right> expression I expression
;
I was under the impression that this wasn't going to work based on this discussion:
https://stackoverflow.com/a/23677069/7711235
...but it does seem to work correctly in all my test cases...
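For comparison, here is how run-time precedence and associativity are commonly handled in a hand-written parser, using precedence climbing driven by an operator table. This is a different technique from the ANTLR4 predicate approach above, shown only as a sketch. It assumes that a larger number binds more tightly and that operands are single tokens; the table name `OPERATORS` and the function `parse_expression` are invented for the example.

```python
# Operator table that can be filled at run time, e.g. from operator templates.
# Values are (precedence, associativity); the entries mirror the two examples
# given in the question.
OPERATORS = {
    "→": (5, "right"),
    "◁": (65, "right"),
}


def parse_expression(tokens, pos=0, min_prec=0):
    """Precedence climbing over a flat token list.

    Returns (tree, next_pos). A tree is either an operand string or a
    tuple (operator, left_tree, right_tree).
    """
    if pos >= len(tokens):
        raise ValueError("operand expected at end of input")
    left = tokens[pos]  # operands are single tokens in this sketch
    pos += 1
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        op = tokens[pos]
        prec, assoc = OPERATORS[op]
        if prec < min_prec:
            break
        # For a left-associative operator, the right operand must not absorb
        # another operator of the same precedence.
        next_min = prec + 1 if assoc == "left" else prec
        right, pos = parse_expression(tokens, pos + 1, next_min)
        left = (op, left, right)
    return left, pos


print(parse_expression(["a", "→", "b", "→", "c"])[0])
# ('→', 'a', ('→', 'b', 'c'))
print(parse_expression(["a", "◁", "b", "→", "c"])[0])
# ('→', ('◁', 'a', 'b'), 'c')
```

Adding or changing an operator only requires updating the table, so the grammar itself stays fixed while precedence and associativity are decided from data.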

Express a rule with ANTLR4

I must define a rule which expresses the following statement: {x in y | x > 0}.
For the first part of that comprehension, "x in y", I have the subrule:
FIRSTPART: Name 'in' Name ;
where Name can be anything.
My problem is that I do not want greedy behaviour: the rule should parse up to the "|" sign and then stop. Since I am new to ANTLR4, I do not know how to achieve that.
Normally, the lexer/parser rules should represent the allowable syntax of the source input stream.
The evaluation (and consequences) of how the source matches any rule or subrule is a matter of semantics -- whether the input matches a particular subrule and whether that should control how the rule is finally evaluated.
Normally, semantics are implemented as part of the tree-walker analysis. You can use alternative labels (#inExpr, etc.) to create easily distinguishable tree nodes for analysis purposes:
comprehension : LBrace expression property? RBrace ;
expression : ....
| Name In Name #inExpr
| Name BinOp Name #binExpr
| ....
;
property : Provided expression ;
BinOp : GT | LT | GTE | .... ;
Provided : '|' ;
In : 'in' ;

How can I write a grammar that matches "x by y by z of a"?

I'm designing a low-punctuation language in which I want to support the declaration of arrays using the following syntax:
512 by 512 of 255 // a 512x512 array filled with 255
100 of 0 // a 100-element array filled with 0
expr1 by expr2 by expr3 ... by exprN of exprFill
These array declarations are just one kind of expression among many.
I'm having a hard time figuring out how to write the grammar rules. I've simplified my grammar down to the simplest thing that reproduces my trouble:
grammar Dimensions;
program
: expression EOF
;
expression
: expression (BY expression)* OF expression
| INT
;
BY : 'by';
OF : 'of';
INT : [0-9]+;
WHITESPACE : [ \t\n\r]+ -> skip;
When I feed in 10 of 1, I get the parse I want.
When I feed in 20 by 10 of 1, the middle expression non-terminal slurps up the 10 of 1, leaving nothing to match the rule's OF expression.
And I get the following error:
line 2:0 mismatched input '<EOF>' expecting 'of'
The parse I'd like to see is
(program (expression (expression 20) by (expression 10) of (expression 1)) <EOF>)
Is there a way I can reformulate my grammar to achieve this? I feel that what I need is right-association across both BY and OF, but I don't know how to express this across two operators.
After some non-intellectual experimentation, I came up with some productions that seem to generate my desired parse:
expression
:<assoc=right> expression (BY expression)+ OF expression
|<assoc=right> expression OF expression
| INT
;
I don't know if there's a way I can express it with just one production.
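For what it's worth, the intended shape can also be written down as a single recursive rule in a hand-written parser. The Python sketch below is not ANTLR and makes one simplifying assumption that the grammar above does not: each dimension is a plain integer, and only the fill value may itself be a nested array expression. The names `parse` and `parse_int` are made up for the example.

```python
def parse_int(tokens, pos):
    """Return the integer at tokens[pos], or raise if there is none."""
    if pos >= len(tokens) or not tokens[pos].isdecimal():
        raise SyntaxError("integer expected at token %d" % pos)
    return int(tokens[pos])


def parse(tokens, pos=0):
    """expression := INT ('by' INT)* ('of' expression)?

    An 'of' part is required whenever at least one 'by' was seen.
    Returns (tree, next_pos); a tree is an int or ("array", dims, fill).
    """
    dims = [parse_int(tokens, pos)]
    pos += 1
    while pos < len(tokens) and tokens[pos] == "by":
        dims.append(parse_int(tokens, pos + 1))
        pos += 2
    if pos < len(tokens) and tokens[pos] == "of":
        # Recursing on the fill value makes 'of' associate to the right.
        fill, pos = parse(tokens, pos + 1)
        return ("array", dims, fill), pos
    if len(dims) > 1:
        raise SyntaxError("expected 'of' after the dimensions")
    return dims[0], pos


print(parse("20 by 10 of 1".split())[0])   # ('array', [20, 10], 1)
print(parse("10 of 1".split())[0])         # ('array', [10], 1)
print(parse("2 of 3 of 0".split())[0])     # ('array', [2], ('array', [3], 0))
```

Collecting all the dimensions before looking for `of` is what keeps the middle operand from swallowing `10 of 1`.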
