Getting context name when using Labels for alternative subrules - antlr4

ANTLR Version: 4.11.1
Grammar: cpp/CPP14Parser.g4 in https://github.com/antlr/grammars-v4/
ANTLR Target Language: C++
I modified the selectionStatement rule
From
If LeftParen condition RightParen statement (Else statement)?
| Switch LeftParen condition RightParen statement;
To
If LeftParen condition RightParen statement (Else statement)? # ifStatement
| Switch LeftParen condition RightParen statement # switchStatement
;
Just added labels to the alternatives.
In my ParserListener, I use enterEveryRule and exitEveryRule to build a list of the names of the contexts entered and exited. The syntax I use to find a context name is
parser.getRuleNames()[context->getRuleIndex()]
When I inspect this list in the exitIfStatement function, I can find only "selectionStatement", not "ifStatement". What do I need to change in the lookup expression above to get "ifStatement"?
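As far as I know, ANTLR 4 generates one context class per labeled alternative (here IfStatementContext and SwitchStatementContext, both deriving from SelectionStatementContext), while getRuleIndex() keeps returning the index of the enclosing rule. The sketch below illustrates that in Java with hand-written stand-in classes; nothing in it is the generated CPP14Parser or the ANTLR runtime, and every class name is a mock-up. It shows why the rule-name lookup yields "selectionStatement" and how the concrete class still identifies the label. In the C++ target the analogous test would presumably be a dynamic_cast to CPP14Parser::IfStatementContext*, which is an assumption not checked against 4.11.1.

```java
class LabelNameSketch {
    // Stand-in for the runtime base class; hypothetical, not org.antlr.v4.runtime.ParserRuleContext.
    static class ParserRuleContextStub {
        int getRuleIndex() { return -1; }
    }

    // Stand-in for the context class of the rule itself.
    static class SelectionStatementContext extends ParserRuleContextStub {
        @Override int getRuleIndex() { return 0; } // index of "selectionStatement"
    }

    // One subclass per labeled alternative; neither overrides getRuleIndex().
    static class IfStatementContext extends SelectionStatementContext { }
    static class SwitchStatementContext extends SelectionStatementContext { }

    static final String[] RULE_NAMES = { "selectionStatement" };

    // What the lookup in the question does: it only ever sees the enclosing rule.
    static String ruleName(ParserRuleContextStub ctx) {
        return RULE_NAMES[ctx.getRuleIndex()];
    }

    // Derive the label from the concrete class name: "IfStatementContext" -> "ifStatement".
    static String labelName(ParserRuleContextStub ctx) {
        String simple = ctx.getClass().getSimpleName();
        String base = simple.substring(0, simple.length() - "Context".length());
        return Character.toLowerCase(base.charAt(0)) + base.substring(1);
    }

    public static void main(String[] args) {
        ParserRuleContextStub ctx = new IfStatementContext();
        System.out.println(ruleName(ctx));  // selectionStatement
        System.out.println(labelName(ctx)); // ifStatement
    }
}
```

Deriving the name from the class is only one option; checking the concrete type directly avoids string handling altogether.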

Related

Formatting string in Powershell but only first or specific occurrence of replacement token

I have a regular expression that I use several times in a script, where a single word gets changed but the rest of the expression remains the same. Normally I handle this by just creating a regular expression string with a format like the following example:
# Simple regex looking for exact string match
$regexTemplate = '^{0}$'
# Later on...
$someString = 'hello'
$someString -match ( $regexTemplate -f 'hello' ) # ==> True
However, I've written a more complex expression where I need to insert a variable into the expression template and... well regex syntax and string formatting syntax begin to clash:
$regexTemplate = '(?<=^\w{2}-){0}(?=-\d$)'
$awsRegion = 'us-east-1'
$subRegion = 'east'
$awsRegion -match ( $regexTemplate -f $subRegion ) # ==> Error
Which results in the following error:
InvalidOperation: Error formatting a string: Index (zero based) must be greater than or equal to zero and less than the size of the argument list.
I know what the issue is: it's seeing one of my expression quantifiers as a replacement token. Rather than opt for a string-interpolation approach or replace {0} myself, is there a way I can tell PowerShell/.NET to only replace the 0-indexed token? Or is there another way to achieve the desired output using format strings?
If a string template includes { and/or } characters, you need to double these so they do not interfere with the numbered placeholders.
Try
$regexTemplate = '(?<=^\w{{2}}-){0}(?=-\d$)'
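To see what the doubled braces expand to, here is a minimal Java sketch. Java has no direct counterpart to the -f operator, so the format method below is a hypothetical helper that mimics only the rule described above ({{ and }} become literal braces, {0} is replaced by the argument); it is not the .NET implementation. The expanded pattern is then tried against the region string from the question.

```java
import java.util.regex.Pattern;

class BraceEscapeSketch {
    // Hypothetical stand-in for a single-argument format call; handles only {{, }} and {0}.
    static String format(String template, String arg0) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < template.length()) {
            if (template.startsWith("{{", i)) {         // doubled brace -> literal {
                out.append('{');
                i += 2;
            } else if (template.startsWith("}}", i)) {  // doubled brace -> literal }
                out.append('}');
                i += 2;
            } else if (template.startsWith("{0}", i)) { // numbered placeholder
                out.append(arg0);
                i += 3;
            } else {
                out.append(template.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "(?<=^\\w{{2}}-){0}(?=-\\d$)";
        String regex = format(template, "east");
        System.out.println(regex); // (?<=^\w{2}-)east(?=-\d$)
        System.out.println(Pattern.compile(regex).matcher("us-east-1").find()); // true
    }
}
```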

Why am I being told that my 'then' and 'else' expressions are not the same type?

I used an if-then-else statement to define a variable in a query join in IBM Cognos Report Studio 10.2.2. In the 'then' clause, I use the hard-coded string 'Not reportable'. For the 'else' clause, I use the variable [Generational Distribution], which is defined elsewhere in the query join and is a string-valued variable from one of the joined queries. I would therefore expect the 'then' and 'else' clauses to both be string-valued.
However, when I run, I get the following error:
OP-ERR-0206 Unsupported 'if' expression dataItem = "Generation Reportable." The 'then'
(expression = "'Not reportable'") and 'else' (expression = "[Generational Distribution]")
clauses must have the same data type.
The details begin:
RSV-SRV-0042 Trace back:RSReportService.cpp(724): QFException: CCL_CAUGHT:
I tried fixing the problem by changing the else clause to trim(cast([Generational Distribution],char(15))). The report now runs fine, but something else strange happens. The item appears as 'Boomers' for every case where the 'if' clause is false, while there are also 'Millennials' and 'Gen X' generations.

Antlr4 no rule index for labelled rules

For the grammar snippet from Java.g4,
statement
: block # blockStmt
| 'if' parExpression statement ('else' statement)? # ifStmt
| 'for' '(' forControl ')' statement # forStmt
| 'while' parExpression statement # whileStmt
;
All the alternatives are labelled.
I can get all StatementContext objects using this method
Trees.findAllRuleNodes(root, JavaParser.RULE_statement);
But if I am only interested in getting the IfStmtContext objects, how can I use the above method without using something like this
for (ParseTree tree : statementContextList)
{
    if (tree instanceof IfStmtContext)
    {
        // add to a list
    }
}
The generated JavaParser doesn't create rule indexes for labelled rules.
Do I have to customize the grammar in some way to make them indexed?
Or is there another way to do this?
My code should be fast, so I need to remove as many iterations and conditions as possible, and get rid of the 'instanceof' checks as far as possible.

antlr4 empty alternative not working as expected

I'd presumed that in general the rule
rule: ( something ? ) ;
could generally be expressed as alternation with nothing, with identical semantics
rule: ( something | ) ; <-- empty alt here
(provided of course 'something' is a single item or bracketed to make it so). It seems obviously correct but antlr4 isn't having it. This code does as I expect
version 1, works
opt_cursor_into_spec :
( cursor_into_spec ? )
;
cursor_into_spec :
INTO
sident ( COMMA sident ) *
;
but this doesn't; failing to parse the input:
version 2, fails
opt_cursor_into_spec : // this rule's changed
cursor_into_spec
|
// empty alt
;
cursor_into_spec : // this is the same
INTO
sident ( COMMA sident ) *
;
Here's part of the diagnostics trace on version 2, note the [***]
consume [#1,8:11='crsr',<483>,2:6] rule regular_ident
exit regular_ident, LT(1)=<EOF>
exit sident, LT(1)=<EOF>
exit cic_cursor_name, LT(1)=<EOF>
exit cursor_ident_clause, LT(1)=<EOF>
enter opt_cursor_into_spec, LT(1)=<EOF>
line 4:0 no viable alternative at input '<EOF>' [***]
exit opt_cursor_into_spec, LT(1)=<EOF>
exit fetch_statement, LT(1)=<EOF>
exit sql_item, LT(1)=<EOF>
enter opt_sql_separators, LT(1)=<EOF>
exit opt_sql_separators, LT(1)=<EOF>
exit sql_items, LT(1)=<EOF>
This is odd as at *** it claims no viable alternative, but at the line before it says it's entered into opt_cursor_into_spec, but this rule has the empty alternative, which surely always matches - one can always match the empty string, I thought?
So is my assumption of this equivalence...
( x ? ) === ( x | <<<nothing>>> )
...incorrect, or what?
This Q isn't about code, but about my understanding of semantics. If anyone thinks these should do the same, I'll try to post reproducible code.
Edit: More confused now. A stripped down grammar didn't reproduce. Something about the end of file was suspicious as the input to parse is just fetch a and it seems to get parsed in full according to the diagnostics trace, then fails. Hmm. I added an explicit EOF to the starting rule, so (a bit simplified)
sql_items : sql_item * ; // ORIGINAL
became
sql_items : sql_item * EOF; // NEW
And both (x? and x|<<<nothing>>>) suddenly work for NEW. Previously only x? worked for ORIGINAL.
Adding an EOF test should surely not cause a previously unsuccessful parse to succeed, should it?
Edit 3: Edit 2 (below) is struck out, as it was misleading and unhelpful.
Edit 2: on reflection, adding EOF to the grammar can of course cause a previously successful parse to fail, as an input can be well-formed at the start but malformed as a whole (i.e. imagine parsing an expression 2 + 3 £$%&, the start is valid but overall it's crud), but that's apparently not what's happening here.
In version 1, the rule opt_cursor_into_spec matches iff the rule cursor_into_spec matches. In version 2, the rule opt_cursor_into_spec will always be matched. So the semantics of the grammar, specifically due to rules that have opt_cursor_into_spec as an element, will differ.
Likely, in version 2, you are getting a compile-time warning about a rule that can match the empty string. Do not ignore that warning unless you really understand its cause and effect.

how to handle conditionally existing components in action code?

This is another problem I am facing while migrating from antlr3 to antlr4. This problem is with the Java action code for handling conditional components of rules. One example is shown below.
The following grammar+code worked in antlr3. Here, if the unary operator is not present, a value of '0' is returned, and the Java code checks for this value and takes appropriate action.
exprUnary returns [Expr e]
: (unaryOp)? e1=exprAtom
{if($unaryOp.i==0) $e = $e1.e;
else $e = new ExprUnary($unaryOp.i, $e1.e);
}
;
unaryOp returns [int i]
: '-' {$i = 1;}
| '~' {$i = 2;}
;
In antlr4, this code results in a null pointer exception during a run, because 'unaryOp' is 'null' if it is not present. But if I change the code like below, then antlr generation itself reports an error:
if($unaryOp==null) ...
java org.antlr.v4.Tool try.g4
error(67): missing attribute access on rule reference 'unaryOp' in '$unaryOp'
How should the action be coded for antlr4?
Another example of this situation is in if-then-[else] - here $s2 is null in antlr4:
ifStmt returns [Stmt s]
: 'if' '(' e=cond ')' s1=stmt ('else' s2=stmt)?
{$s = new StmtIf($e.e, $s1.s, $s2.s);}
;
NOTE: question 16392152 provides a solution to this question with listeners, but I am not using listeners, my requirement is for this to be handled in the action code.
There are at least two potential ways to correct this:
1. The "ANTLR 4" way to do it is to create a listener or visitor instead of placing the Java code inside of actions embedded in the grammar itself. This is the only way I would even consider solving the problem in my own grammars.
2. If you still use an embedded action, the most efficient way to check if the item exists or not is to access the ctx property, e.g. $unaryOp.ctx. This property resolves to the UnaryOpContext you were assuming would be accessible by $unaryOp by itself.
ANTLR expects you to access an attribute. Try its text attribute instead: $unaryOp.text==null
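The null check behind the $unaryOp.ctx suggestion can be sketched in plain Java. The classes below are hand-written stand-ins, not generated code: UnaryOpContext mimics the generated context carrying the return value i, and Expr, ExprAtom and ExprUnary are hypothetical placeholders for the asker's AST types. In the grammar itself the action would presumably read if ($unaryOp.ctx == null) $e = $e1.e; else $e = new ExprUnary($unaryOp.i, $e1.e); which is untested here.

```java
class OptionalRuleSketch {
    // Hypothetical AST types standing in for the asker's Expr classes.
    interface Expr {
        String show();
    }

    static final class ExprAtom implements Expr {
        private final String text;
        ExprAtom(String text) { this.text = text; }
        public String show() { return text; }
    }

    static final class ExprUnary implements Expr {
        private final int op;
        private final Expr operand;
        ExprUnary(int op, Expr operand) { this.op = op; this.operand = operand; }
        public String show() { return "unary(" + op + ", " + operand.show() + ")"; }
    }

    // Stand-in for the generated context of rule unaryOp; 'i' is its return value.
    static final class UnaryOpContext {
        final int i;
        UnaryOpContext(int i) { this.i = i; }
    }

    // Mirrors the embedded action: unaryOp is null when the optional element was absent.
    static Expr buildExprUnary(UnaryOpContext unaryOp, Expr atom) {
        if (unaryOp == null) {
            return atom;
        }
        return new ExprUnary(unaryOp.i, atom);
    }

    public static void main(String[] args) {
        Expr x = new ExprAtom("x");
        System.out.println(buildExprUnary(null, x).show());                  // x
        System.out.println(buildExprUnary(new UnaryOpContext(1), x).show()); // unary(1, x)
    }
}
```

The same pattern would apply to the if-then-[else] example: test the optional s2 context for null before reading its s value.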
