Why does float('1.5') gives 1.5 as output as expected but int('1.5') gives a value error?
Shouldn't python automatically convert the string into float and then into integer.
Because 1.5 isn't a valid integer literal which is required by the int() function.
From the docs:
If x is not a number or if base is given, then x must be a string,
bytes, or bytearray instance representing an integer literal in radix
base.
Whereas integer literals are defined as follows:
integer ::= decinteger | bininteger | octinteger | hexinteger
decinteger ::= nonzerodigit (["_"] digit)* | "0"+ (["_"] "0")*
bininteger ::= "0" ("b" | "B") (["_"] bindigit)+
octinteger ::= "0" ("o" | "O") (["_"] octdigit)+
hexinteger ::= "0" ("x" | "X") (["_"] hexdigit)+
nonzerodigit ::= "1"..."9"
digit ::= "0"..."9"
bindigit ::= "0" | "1"
octdigit ::= "0"..."7"
hexdigit ::= digit | "a"..."f" | "A"..."F"
Source: https://docs.python.org/3/reference/lexical_analysis.html#integers
Related
I created EBNF for expressions below
<expression> ::= <or_operand> [ "or" <or_operand> ]
<or_operand> ::= <and_operand> [ "and" <and_operand> ]
<and_operand> ::= <equality_operand> [ ( "=" | "!=" ) <equality_operand> ]
<equality_operand> ::= <simple_expression> [ <relational_operator> <simple_expression> ]
<relational_op> ::= "<" | ">" | "<=" | ">="
<simple_expression> ::= <term> [ ( "+" | "-" ) <term> ]
<term> ::= <factor> [ ( "*" | "/" ) <factor> ]
<factor> ::= <literal>
| "(" <expression> ")"
| "not" <factor>
| ( "+" | "-" ) <factor>
<literal> ::= <boolean_literal> | <number>
<boolean_literal> ::= "true" | "false"
<number> ::= <digit> [ <digit> ]
<digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
My problem lies within the factor section
<factor> ::= <literal>
| "(" <expression> ")"
| "not" <factor>
| ( "+" | "-" ) <factor>
You can see that I included three unary operators not, -, and +. They work for specific types like not only applies to boolean values only and +/- applies to numbers only.
I don't know how to handle cases when not is mixed with +/- like not +7, - not true, etc. for example. Is there any way I can modify the grammar so not can never be mixed with +/-?
Will it suffice?
<factor> ::= <literal>
| "(" <expression> ")"
| ( "not" | ( "+" | "-" ) ) <factor>
Or maybe it's parser's job to solve this issue?
This is quite easy to solve. You have two different styles of expressions each with their own syntax, so you just do not mix them, and keep their syntax rules separated.
A boolean expression can only occur in certain places, such as an assignment or some kind of choice statement. A numerical expression can only occur in a comparison or an assignment. This is not something that is handled at the semantic level, and if one looks at the grammar for many languages, this is how it is solved.
So you have for numeric expressions:
<simple_expression> ::= <term> [ ( "+" | "-" ) <term> ]
<term> ::= <factor> [ ( "*" | "/" ) <factor> ]
<factor> ::= <number>
| "(" <expression> ")"
| ( "+" | "-" ) <factor>
<number> ::= <digit> [ <digit> ]
<digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
This is now self contained. We can now build this into boolean expressions:
<boolean_expression> ::= "not" <boolean_expression>
| <logical_expression>
<logical_expression> ::= <or_operand> [ "or" <or_operand> ]
<or_operand> ::= <and_operand> [ "and" <and_operand> ]
<and_operand> ::= <equality_operand> [ ( "=" | "!=" ) <equality_operand> ]
<equality_operand> ::= <simple_expression> [ <relational_operator> <simple_expression> ]
| <boolean_literal>
<relational_op> ::= "<" | ">" | "<=" | ">="
<boolean_literal> ::= "true" | "false"
Notice that I permitted the equality comparison of boolean literals, however if you did not want to permit this you could change the rules to only permit them for an and operand.
Now we can use these in another rule, such as assignment:
<assignment> ::= <variable> ":=" ( <simple_expression> | <boolean_expression> )
All is done.
I have successfully split my expressions into arithmetic and boolean expressions like this:
/* entry point */
parse: formula EOF;
formula : (expr|boolExpr);
/* boolean expressions : take math expr and use boolean operators on them */
boolExpr
: bool
| l=expr operator=(GT|LT|GEQ|LEQ) r=expr
| l=boolExpr operator=(OR|AND) r=boolExpr
| l=expr (not='!')? EQUALS r=expr
| l=expr BETWEEN low=expr AND high=expr
| l=expr IS (NOT)? NULL
| l=atom LIKE regexp=string
| l=atom ('IN'|'in') '(' string (',' string)* ')'
| '(' boolExpr ')'
;
/* arithmetic expressions */
expr
: atom
| (PLUS|MINUS) expr
| l=expr operator=(MULT|DIV) r=expr
| l=expr operator=(PLUS|MINUS) r=expr
| function=IDENTIFIER '(' (expr ( ',' expr )* ) ? ')'
| '(' expr ')'
;
atom
: number
| variable
| string
;
But now I have HUGE performance problems. Some formulas I try to parse are utterly slow, to the point that it has become unbearable: more than an hour (I stopped at that point) to parse this:
-4.77+[V1]*-0.0071+[V1]*[V1]*0+[V2]*-0.0194+[V2]*[V2]*0+[V3]*-0.00447932+[V3]*[V3]*-0.0017+[V4]*-0.00003298+[V4]*[V4]*0.0017+[V5]*-0.0035+[V5]*[V5]*0+[V6]*-4.19793004+[V6]*[V6]*1.5962+[V7]*12.51966636+[V7]*[V7]*-5.7058+[V8]*-19.06596752+[V8]*[V8]*28.6281+[V9]*9.47136506+[V9]*[V9]*-33.0993+[V10]*0.001+[V10]*[V10]*0+[V11]*-0.15397774+[V11]*[V11]*-0.0021+[V12]*-0.027+[V12]*[V12]*0+[V13]*-2.02963068+[V13]*[V13]*0.1683+[V14]*24.6268688+[V14]*[V14]*-5.1685+[V15]*-6.17590512+[V15]*[V15]*1.2936+[V16]*2.03846688+[V16]*[V16]*-0.1427+[V17]*9.02302288+[V17]*[V17]*-1.8223+[V18]*1.7471106+[V18]*[V18]*-0.1255+[V19]*-30.00770912+[V19]*[V19]*6.7738
Do you have any idea on what the problem is?
The parsing stops when the parser enters the formula grammar rule.
edit Original problem here:
My grammar allows this:
// ( 1 LESS_EQUALS 2 )
1 <= 2
But the way I expressed it in my G4 file makes it also accept this:
// ( ( 1 LESS_EQUALS 2 ) LESS_EQUALS 3 )
1 <= 2 <= 3
Which I don't want.
My grammar contains this:
expr
: atom # atomArithmeticExpr
| (PLUS|MINUS) expr # plusMinusExpr
| l=expr operator=('*'|'/') r=expr # multdivArithmeticExpr
| l=expr operator=('+'|'-') r=expr # addsubtArithmeticExpr
| l=expr operator=('>'|'<'|'>='|'<=') r=expr # comparisonExpr
[...]
How can I tell Antlr that this is not acceptable?
Just split root into two. Either rename root 'expr' into 'rootexpr', or vice versa.
rootExpr
: atom # atomArithmeticExpr
| (PLUS|MINUS) expr # plusMinusExpr
| l=expr operator=('*'|'/') r=expr # multdivArithmeticExpr
| l=expr operator=('+'|'-') r=expr # addsubtArithmeticExpr
| l=expr operator=('>'|'<'|'>='|'<=') r=expr # comparisonExpr
EDIT: You cannot have cyclic reference => expr node in expr rule.
I am using xtext to define a grammar.
I have a problem with the runtime syntax evaluation.
the rule is signatureDeclaration. Here the full grammar:
// automatically generated by Xtext
grammar org.xtext.alloy.Alloy with org.eclipse.xtext.common.Terminals
import "http://fr.cuauh.als/1.0"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
//specification ::= [module] open* paragraph*
//ok
Specification returns Specification:
{Specification}
(module=Module)?
(opens+=Library (opens+=Library)*)?
(paragraphs+=Paragraph (paragraphs+=Paragraph)*)?;
//module ::= "module" name [ "[" ["exactly"] name ("," ["exactly"] num)* "]" ]
//module ::= "module" name? [ "[" ["exactly"] name ("," ExactlyNum )* "]" ]
//ok
Module returns Module:
{Module}
'module' (name=IDName)? ('['(exactly?='exactly')? extensionName=[IDref] (nums+=ExactlyNums ( "," nums+=ExactlyNums)*)?']')?
;
IDName returns IDName:
name=ID
;
terminal ID : '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'/')*;
//open ::= ["private"] "open" name [ "[" ref,+ "]" ] [ "as" name ]
//open ::= ["private"] "open" path [ "[" ref,+ "]" ] [ "as" name ]
//ok
Library returns Library:
{Library}
(private?='private')? 'open' path=EString ('['references+=Reference (',' references+=Reference)*']')? ('as' alias=Alias)?
;
//a path
//ok
//terminal PATH returns ecore::EString :
// ('a'..'z'|'A'..'Z'|'_'|'.')+('/'('a'..'z'|'A'..'Z'|'_'|'.')+)*
//;
//paragraph ::= factDecl | assertDecl | funDecl | cmdDecl | enumDecl | sigDecl
//paragraph ::= factDecl | assertDecl | funDecl | predDecl | cmdDecl | enumDecl | sigDecl
//ok
Paragraph returns Paragraph:
FactDeclaration | AssertDeclaration | FunctionDeclaration | PredicatDeclaration | CmdDeclaration | EnumerationDeclaration | SignatureDeclaration;
//cmdDecl ::= [name ":"] ("run"|"check") (name|block) scope
//cmdDecl ::= [name ":"] command (ref|block) scope ["expect (0|1)"]
//ok
CmdDeclaration returns CmdDeclaration:
(name=IDName ':')? operation=cmdOp referenceOrBlock=ReferenceOrBlock (scope=Scope)? (expect?='expect' expectValue=EInt)?;
//ok
ReferenceOrBlock returns ReferenceOrBlock:
BlockExpr | ReferenceName;
//sigDecl ::= sigQual* "sig" name,+ [sigExt] "{" decl,* "}" [block]
//sigDecl ::= ["private"] ["abstract"] [quant] "sig" name [sigExt] "{" relDecl,* "}" [block]
//ok
SignatureDeclaration returns SignatureDeclaration:
{SignatureDeclaration}
(isPrivate?='private')? (isAbstract?='abstract')? (quantifier=SignatureQuantifier)? 'sig' name=IDName (extension=SignatureExtension)? '{'
(relations+=RelationDeclaration ( ',' =>relations+=RelationDeclaration)* )?
'}'
(block=Block)?;
//ok
SignatureExtension returns SignatureExtension:
SignatureinInheritance | SignatureInclusion;
TypeScopeTarget returns TypeScopeTarget:
Int0 | Seq | ReferenceName;
//name ::= ("this" | ID) ["/" ID]*
//ok do not need to be part of the concrete syntax
//Name returns Name:
// thisOrId=ThisOrID ('/'ids+=ID0( "/" ids+=ID0)*)?;
//ok do not need to be part of the concrete syntax
//ThisOrID returns ThisOrID:
// ID0 | This;
EBoolean returns ecore::EBoolean:
'true' | 'false';
//["exactly"] num
//ok
ExactlyNums returns ExactlyNums:
exactly?='exactly' num=Number;
//ok do not need to be part of the concrete syntax
// IDName | IDref;
IDref returns IDref:
namedElement=[IDName]
;
This returns This:
{This}
'this'
;
EString returns ecore::EString:
STRING | ID;
//ok
EInt returns ecore::EInt:
'-'? INT;
Alias returns Alias:
{Alias}
name=IDName;
//factDecl ::= "fact" [name] block
//ok
FactDeclaration returns FactDeclaration:
'fact' (name=IDName)? block=Block;
//assertDecl ::= "assert" [name] block
//ok
AssertDeclaration returns AssertDeclaration:
'assert' (name=IDName)? block=Block ;
//funDecl ::= ["private"] "fun" [ref "."] name "(" decl,* ")" ":" expr block
//funDecl ::= ["private"] "fun" [ref "."] name "[" decl,* "]" ":" expr block
//funDecl ::= ["private"] "fun" [ref "."] name ":" expr block
//
//funDecl ::= ["private"] "fun" [ref "."] name ["(" paramDecl,* ")" | "[" paramDecl,* "]" ] ":" expr block
//ok
FunctionDeclaration returns FunctionDeclaration:
(private?='private')? 'fun' (reference=[Reference] ".")? name=IDName
('(' (parameters+=ParameterDeclaration ( "," parameters+=ParameterDeclaration)*)? ')'|
'[' (parameters+=ParameterDeclaration ( "," parameters+=ParameterDeclaration)*)? ']')?
':' ^returns=Expression
block=Block;
//funDecl ::= ["private"] "pred" [ref "."] name "(" decl,* ")" block
//funDecl ::= ["private"] "pred" [ref "."] name "[" decl,* "]" block
//funDecl ::= ["private"] "pred" [ref "."] name block
//
//predDecl ::= ["private"] "pred" [ref "."] name ["(" paramDecl,* ")" | "[" paramDecl,* "]" ] ":" block
//ok
PredicatDeclaration returns PredicatDeclaration:
(private?='private')? 'pred' (reference=[Reference|EString] ".")? name=IDName
('(' (parameters+=ParameterDeclaration ( "," parameters+=ParameterDeclaration)*)? ')'|
'[' (parameters+=ParameterDeclaration ( "," parameters+=ParameterDeclaration)*)? ']')?
block=Block;
//enumDecl ::= "enum" name "{" name ("," name)* "}"
//enumDecl ::= "enum" name "{" enumEl ("," enumEl)* "}"
//ok
EnumerationDeclaration returns EnumerationDeclaration:
'enum' name=IDName '{' enumeration+=EnumerationElement ( "," enumeration+=EnumerationElement)* '}';
//ok
EnumerationElement returns EnumerationElement:
{EnumerationElement}
name=IDName;
//"lone" | "one" | "some"
//ok
enum SignatureQuantifier returns SignatureQuantifier:
lone = 'lone' | one = 'one' | some = 'some';
//decl ::= ["private"] ["disj"] name,+ ":" ["disj"] expr
//decl ::= ["private"] ["disj"] name,+ ":" ["disj"] expr
//ok
RelationDeclaration returns RelationDeclaration:
(isPrivate?='private')? (varsAreDisjoint?='disj')? names+=VarDecl (',' names+=VarDecl)* ':' (expressionIsDisjoint?='disj')? expression=Expression;
//sigExt ::= "extends" ref
//ok
SignatureinInheritance returns SignatureinInheritance:
'extends' extends=Reference;
//sigExt ::= "in" ref ["+" ref]*
//ok
SignatureInclusion returns SignatureInclusion:
'in' includes+=Reference ( "+" includes+=Reference)* ;
//ok
VarDecl returns VarDecl:
{VarDecl}
name=IDName;
//decl ::= ["private"] ["disj"] name,+ ":" ["disj"] expr
//ok
ParameterDeclaration returns ParameterDeclaration:
(isPrivate?='private')? (varsAreDisjoint?='disj')? names+=VarDecl ( "," names+=VarDecl)* ':' (expressionIsDisjoint?='disj')? expression=Expression;
//("run"|"check")
//ok
enum cmdOp returns cmdOp:
run = 'run' | check = 'check';
//expr ::=
//1) "let" letDecl,+ blockOrBar
//2) | quant decl,+ blockOrBar
//3) | unOp expr
//4) | expr binOp expr
//5) | expr arrowOp expr
//6) | expr ["!"|"not"] compareOp expr
//7) | expr ("=>"|"implies") expr "else" expr
//8) | expr "[" expr,* "]"
//9) | number
//10) | "-" number
//11) | "none"
//12) | "iden"
//13) | "univ"
//14) | "Int"
//15) | "seq/Int"
//16) | "(" expr ")"
//17) | ["#"] Name
//18) | block
//19) | "{" decl,+ blockOrBar "}"
// expr ::= leftPart [rightPart]
Expression returns Expression:
lhs=NonLeftRecursiveExpression (=>parts=NaryPart)?;
//4) | expr binOp expr
//5) | expr arrowOp expr
//6) | expr ["!"|"not"] compareOp expr
//7) | expr ("=>"|"implies") expr "else" expr
//8) | expr "[" expr,* "]"
//ok
NaryPart returns NaryPart:
BinaryOrElsePart | CallPart;
//4) | expr binOp expr
//5) | expr arrowOp expr
//6) | expr ["!"|"not"] compareOp expr
//7) | expr ("=>"|"implies") expr "else" expr
//
//7) | expr ("=>"|"implies") expr "else" expr
//4)5)6) | expr binaryOperator expr*
//ok
BinaryOrElsePart returns BinaryOrElsePart:
=>('=>'|'implies') rhs=Expression (=>'else' else=Expression)? |
operator=BinaryOperator rhs=Expression ;
//8) | expr "[" expr,* "]"
//it is just the right part
//ok
CallPart returns CallPart:
{CallPart}
'['(params+=Expression ( "," params+=Expression)*)?']';
//1) "let" letDecl,+ blockOrBar
//2) | quant decl,+ blockOrBar
//19) | "{" decl,+ blockOrBar "}"
//18) | block
// | terminalExpression
NonLeftRecursiveExpression returns NonLeftRecursiveExpression:
LetExpression | QuantifiedExpression | CurlyBracketsExpression | BlockExpr | TerminalExpression;
//1) "let" letDecl,+ blockOrBar
//ok
LetExpression returns LetExpression:
'let' letDeclarations+=LetDeclaration ( "," letDeclarations+=LetDeclaration)* blockOrBar=BlockOrBar;
//2) | quant decl,+ blockOrBar
//ok
QuantifiedExpression returns QuantifiedExpression:
quantifier=QuantifiedExpressionQuantifier varDeclaration+=VarDeclaration ( "," varDeclaration+=VarDeclaration)* blockOrBar=BlockOrBar;
//
//binOp ::= "||" | "or" | "&&" | "and" | "&" | "<=>" | "iff" | "=>" | "implies" | "+" | "-" | "++" | "<:" | ":>" | "." | "<<" | ">>" | ">>>"
//compareOp ::= "=" | "in" | "<" | ">" | "=<" | ">="
//arrowOp ::= ["some"|"one"|"lone"|"set"] "->" ["some"|"one"|"lone"|"set"]
//ok
BinaryOperator returns BinaryOperator:
RelationalOperator | CompareOperator | ArrowOperator;
//ok
RelationalOperator returns RelationalOperator:
operator=RelationalOp;
//binOp ::= "||" | "or" | "&&" | "and" | "&" | "<=>" | "iff" | "=>" | "implies" | "+" | "-" | "++" | "<:" | ":>" | "." | "<<" | ">>" | ">>>"
//ok
enum RelationalOp returns RelationalOp:
or = '||' | and = '&&' | union = '+' | intersection = '&' | difference = '-' | equivalence = '<=>' | override = '++'
| domain = '<:' | range = ':>' | join = '.' ; // | lshift = '<<' | rshift = '>>' | rrshift = '>>>';
//["!"|"not"] compareOp
//ok
CompareOperator returns CompareOperator:
(negated?='!' | negated?='not')? operator=CompareOp;
//compareOp ::= "=" | "in" | "<" | ">" | "=<" | ">="
enum CompareOp returns CompareOp:
equal = '=' | inclusion = 'in' | lesser = '<' | greater = '>' | lesserOrEq = '<=' | greaterOrEq = '>=';
//arrowOp ::= ["some"|"one"|"lone"|"set"] "->" ["some"|"one"|"lone"|"set"]
//ok
ArrowOperator returns ArrowOperator:
{ArrowOperator}
(leftQuantifier=ArrowQuantifier)? '->' (=>rightQuantifier=ArrowQuantifier)?;
//"some"|"one"|"lone"|"set"
//ok
enum ArrowQuantifier returns ArrowQuantifier:
lone = 'lone' | one = 'one' | some = 'some' | set = 'set' ;
//19) | "{" decl,+ blockOrBar "}"
//ok
CurlyBracketsExpression returns CurlyBracketsExpression:
'{' varDeclarations+=VarDeclaration ( "," varDeclarations+=VarDeclaration)* blockOrBar=BlockOrBar '}';
//blockOrBar ::= block
//blockOrBar ::= "|" expr
//ok
BlockOrBar returns BlockOrBar:
BlockExpr | Bar;
//blockOrBar ::= "|" expr
//ok
Bar returns Bar:
'|' expression=Expression;
//block ::= "{" expr* "}"
//ok
BlockExpr returns BlockExpr:
{BlockExpr}
'{' (expressions+=Expression ( "," expressions+=Expression)*)?'}';
//3) unOp expr
// | finalExpression
TerminalExpression returns TerminalExpression:
UnaryExpr | finalExpression ;
//3) unOp expr
//ok
UnaryExpr returns UnaryExpr:
unOp=UnaryOperator expression=TerminalExpression;
//unOp ::= "!" | "not" | "no" | "some" | "lone" | "one" | "set" | "seq" | "#" | "~" | "*" | "^"
//unOp ::= "!" | "not" |"#" | "~" | "*" | "^"
//ok
enum UnaryOperator returns UnaryOperator:
not = 'not' | card = '#' | transpose = '~' | reflexiveClosure = '*' | closure = '^' | not2 = '!';
//16) | "(" expr ")"
//9) | number
//10) | "-" number
//17) | ["#"] Name
//11) | "none"
//12) | "iden"
//13) | "univ"
//14) | "Int"
//15) | "seq/Int"
//16) | "(" expr ")"
//10) | ["-"] number
//17) | "#" Name
//17) | reference
//12)13) | constante
finalExpression returns TerminalExpression:
BracketExpression | NumberExpression | NotExpandedExpression | ReferenceExpression | ConstantExpression;
//16) | "(" expr ")"
//ok
BracketExpression returns BracketExpression:
'('expression=Expression')';
//9) | number
//10) | "-" number
//ok
Number returns Number:
NumberExpression;
//ok
NumberExpression returns NumberExpression:
value=EInt;
//17) | ["#"] Name
//17) | "#" Name
//ok
NotExpandedExpression returns NotExpandedExpression:
'#' name=[IDref];
//ok
ReferenceExpression returns ReferenceExpression:
reference=Reference;
//ref ::= name | "univ" | "Int" | "seq/Int"
//ok
Reference returns Reference:
ReferenceName | ConstanteReference;
//ok
ConstanteReference returns ConstanteReference:
cst=constanteRef;
//13) | "univ"
//14) | "Int"
//15) | "seq/Int"
//ok
enum constanteRef returns constanteRef:
int = 'Int' | seqint = 'seq/Int' | univ = 'univ';
//ok
ReferenceName returns ReferenceName:
name=IDref;
//ok
ConstantExpression returns ConstantExpression:
constante=Constant;
//11) | "none"
//12) | "iden"
//ok
enum Constant returns Constant:
none = 'none' | iden = 'iden';
//ok
Block returns Block:
BlockExpr;
//letDecl ::= name "=" expr
//ok
LetDeclaration returns LetDeclaration:
varName=VarDecl '=' expression=Expression;
//quant ::= "all" | "no" | "some" | "lone" | "one" | "sum"
//quant ::= "all" | "no" | "some" | "lone" | "one" | "null"
//ok
enum QuantifiedExpressionQuantifier returns QuantifiedExpressionQuantifier:
no = 'no' | one = 'one' | lone = 'lone' | some = 'some' | all = 'all' | null = 'null';
//decl ::= ["private"] ["disj"] name,+ ":" ["disj"] expr
//ok
VarDeclaration returns VarDeclaration:
(isPrivate?='private')? (varsAreDisjoint?='disj')? names+=VarDecl ( "," names+=VarDecl)* ':' (expressionIsDisjoint?='disj')? expression=Expression;
//scope ::= "for" number ["expect" (0|1)]
//scope ::= "for" number "but" typescope,+ ["expect" (0|1)]
//scope ::= "for" typescope,+ ["expect" (0|1)]
//scope ::= ["expect" (0|1)]
//
//scope ::= "for" [number] ["but"] typescope,*
//ok
Scope returns Scope:
{Scope}
'for' (number=Number)? (but?='but')? (typeScope+=TypeScope ( "," typeScope+=TypeScope)*)?;
//typescope ::= ["exactly"] number [name|"int"|"seq"]
//typescope ::= ExactlyNumber target
TypeScope returns TypeScope:
num=ExactlyNums target=[TypeScopeTarget];
//[name|"int"|"seq"]
//[Validname|"int"|"seq"]
//ok
TypeScopeTarget_Impl returns TypeScopeTarget:
{TypeScopeTarget}
;
Int0 returns Int:
{Int}
'Int'
;
Seq returns Seq:
{Seq}
'Seq'
;
But at runtime of the editor, I have the following error on the little exemple :
sig A {}
sig B {
a : A
}
Multiple markers at this line (line a : A)
- no viable alternative at input 'A'
- missing EOF at '->'
- missing '}' at 'a'
the rule works for the first one, but not the second one. It is at is expect there is no relationDeclartion between the brackets.
I think it is due to the declaration of the rule relationDeclaration call with the form :
(relations+=RelationDeclaration ( ',' relations+=RelationDeclaration)*)?
I cannot get what is wrong.
What did I miss?
What can I do to make it work?
Thanks in advance.
in your grammar for RelationDeclaration it seems you miss some optional marking questionmarks (isPrivate?='private')? (varsAreDisjoint?='disj')? (the same problem is all over the grammar)
?= declares the attribute as optional, but only a ? around the rule call makes it actually optional
I have a very simple grammar to parse statements.
Here are examples of the type of statements that need be parsed:
a.b.c
a.b.c == "88"
The issue I am having is that array notation is not matching. For example, things that are not working:
a.b[0].c
a[3][4]
I hope someone can point out what I am doing wrong here. (I am testing in ANTLRWorks)
Here is the grammar (generationUnit is my entry point):
grammar RatBinding;
generationUnit: testStatement | statement;
arrayAccesor : identifier arrayNotation+;
arrayNotation: '[' Number ']';
testStatement:
(statement | string | Number | Bool )
(greaterThanAndEqual
| lessThanOrEqual
| greaterThan
| lessThan | notEquals | equals)
(statement | string | Number | Bool )
;
part: identifier | arrayAccesor;
statement: part ('.' part )*;
string: ('"' identifier '"') | ('\'' identifier '\'');
greaterThanAndEqual: '>=';
lessThanOrEqual: '<=';
greaterThan: '>';
lessThan: '<';
notEquals : '!=';
equals: '==';
identifier: Letter (Letter|Digit)*;
Bool : 'true' | 'false';
ArrayLeft: '\u005B';
ArrayRight: '\u005D';
Letter
: '\u0024' |
'\u0041'..'\u005a' |
'\u005f '|
'\u0061'..'\u007a' |
'\u00c0'..'\u00d6' |
'\u00d8'..'\u00f6' |
'\u00f8'..'\u00ff' |
'\u0100'..'\u1fff' |
'\u3040'..'\u318f' |
'\u3300'..'\u337f' |
'\u3400'..'\u3d2d' |
'\u4e00'..'\u9fff' |
'\uf900'..'\ufaff'
;
Digit
: '\u0030'..'\u0039' |
'\u0660'..'\u0669' |
'\u06f0'..'\u06f9' |
'\u0966'..'\u096f' |
'\u09e6'..'\u09ef' |
'\u0a66'..'\u0a6f' |
'\u0ae6'..'\u0aef' |
'\u0b66'..'\u0b6f' |
'\u0be7'..'\u0bef' |
'\u0c66'..'\u0c6f' |
'\u0ce6'..'\u0cef' |
'\u0d66'..'\u0d6f' |
'\u0e50'..'\u0e59' |
'\u0ed0'..'\u0ed9' |
'\u1040'..'\u1049'
;
WS : [ \r\t\u000C\n]+ -> channel(HIDDEN)
;
You referenced the non-existent rule Number in the arrayNotation parser rule.
A Digit rule does exist in the lexer, but it will only match a single-digit number. For example, 1 is a Digit, but 10 is two separate Digit tokens so a[10] won't match the arrayAccesor rule. You probably want to resolve this in two parts:
Create a Number token consisting of one or more digits.
Number
: Digit+
;
Mark Digit as a fragment rule to indicate that it doesn't form tokens on its own, but is merely intended to be referenced from other lexer rules.
fragment // prevents a Digit token from being created on its own
Digit
: ...
You will not need to change arrayNotation because it already references the Number rule you created here.
Bah, waste of space. I Used Number instead of Digit in my array declaration.
Is there a complete list of allowed characters somewhere, or a rule that determines what can be used in an identifier vs an operator?
From the Haskell report, this is the syntax for allowed symbols:
a | b means a or b and
a<b> means a except b
special -> ( | ) | , | ; | [ | ] | `| { | }
symbol -> ascSymbol | uniSymbol<special | _ | : | " | '>
ascSymbol -> ! | # | $ | % | & | * | + | . | / | < | = | > | ? | #
\ | ^ | | | - | ~
uniSymbol -> any Unicode symbol or punctuation
So, symbols are ASCII symbols or Unicode symbols except from those in special | _ | : | " | ', which are reserved.
Meaning the following characters can't be used: ( ) | , ; [ ] ` { } _ : " '
A few paragraphs below, the report gives the complete definition for Haskell operators:
varsym -> ( symbol {symbol | :})<reservedop | dashes>
consym -> (: {symbol | :})<reservedop>
reservedop -> .. | : | :: | = | \ | | | <- | -> | # | ~ | =>
Operator symbols are formed from one or more symbol characters, as
defined above, and are lexically distinguished into two namespaces
(Section 1.4):
An operator symbol starting with a colon is a constructor.
An operator symbol starting with any other character is an ordinary identifier.
Notice that a colon by itself, ":", is reserved solely for use as the
Haskell list constructor; this makes its treatment uniform with other
parts of list syntax, such as "[]" and "[a,b]".
Other than the special syntax for prefix negation, all operators are
infix, although each infix operator can be used in a section to yield
partially applied operators (see Section 3.5). All of the standard
infix operators are just predefined symbols and may be rebound.
From the Haskell 2010 Report §2.4:
Operator symbols are formed from one or more symbol characters...
§2.2 defines symbol characters as being any of !#$%&*+./<=>?#\^|-~: or "any [non-ascii] Unicode symbol or punctuation".
NOTE: User-defined operators cannot begin with a : as, quoting the language report, "An operator symbol starting with a colon is a constructor."
What I was looking for was the complete list of characters. Based on the other answers, the full list is;
Unicode Punctuation:
http://www.fileformat.info/info/unicode/category/Pc/list.htm
http://www.fileformat.info/info/unicode/category/Pd/list.htm
http://www.fileformat.info/info/unicode/category/Pe/list.htm
http://www.fileformat.info/info/unicode/category/Pf/list.htm
http://www.fileformat.info/info/unicode/category/Pi/list.htm
http://www.fileformat.info/info/unicode/category/Po/list.htm
http://www.fileformat.info/info/unicode/category/Ps/list.htm
Unicode Symbols:
http://www.fileformat.info/info/unicode/category/Sc/list.htm
http://www.fileformat.info/info/unicode/category/Sk/list.htm
http://www.fileformat.info/info/unicode/category/Sm/list.htm
http://www.fileformat.info/info/unicode/category/So/list.htm
But excluding the following characters with special meaning in Haskell:
(),;[]`{}_:"'
A : is only permitted as the first character of the operator, and denotes a constructor (see An operator symbol starting with a colon is a constructor).