How to handle conditionally existing components in action code? - antlr4

This is another problem I am facing while migrating from antlr3 to antlr4. It concerns the Java action code that handles optional components of rules. One example is shown below.
The following grammar and code worked in antlr3. There, if the unary operator is not present, a value of '0' is returned, and the Java code checks for this value and takes the appropriate action.
exprUnary returns [Expr e]
    : (unaryOp)? e1=exprAtom
      {if($unaryOp.i==0) $e = $e1.e;
       else $e = new ExprUnary($unaryOp.i, $e1.e);
      }
    ;
unaryOp returns [int i]
    : '-' {$i = 1;}
    | '~' {$i = 2;}
    ;
In antlr4, this code results in a null pointer exception during a run, because 'unaryOp' is 'null' if it is not present. But if I change the code like below, then antlr generation itself reports an error:
if($unaryOp==null) ...
java org.antlr.v4.Tool try.g4
error(67): missing attribute access on rule reference 'unaryOp' in '$unaryOp'
How should the action be coded for antlr4?
Another example of this situation is in if-then-[else] - here $s2 is null in antlr4:
ifStmt returns [Stmt s]
    : 'if' '(' e=cond ')' s1=stmt ('else' s2=stmt)?
      {$s = new StmtIf($e.e, $s1.s, $s2.s);}
    ;
NOTE: question 16392152 provides a solution to this problem using listeners, but I am not using listeners; I need this to be handled in the action code.

There are at least two potential ways to correct this:
1. The "ANTLR 4" way to do it is to create a listener or visitor instead of placing the Java code inside of actions embedded in the grammar itself. This is the only way I would even consider solving the problem in my own grammars.
2. If you still use an embedded action, the most efficient way to check if the item exists or not is to access the ctx property, e.g. $unaryOp.ctx. This property resolves to the UnaryOpContext you were assuming would be accessible by $unaryOp by itself.
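As a rough illustration of the embedded-action option, the action in exprUnary could test $unaryOp.ctx == null before reading $unaryOp.i. The Java sketch below mirrors that check outside of ANTLR. UnaryOpContext here is a hand-written stand-in for the generated class, and OptionalRuleSketch, buildUnary and the String results are hypothetical names used only to keep the example self-contained.

```java
// Hand-written stand-ins for illustration only; the real context classes are
// generated by the ANTLR tool and carry many more members.
final class OptionalRuleSketch {

    /** Stand-in for the generated UnaryOpContext; 'i' plays the rule's return value. */
    static final class UnaryOpContext {
        final int i;
        UnaryOpContext(int i) { this.i = i; }
    }

    /**
     * Mirrors the action:
     *   if ($unaryOp.ctx == null) $e = $e1.e;
     *   else $e = new ExprUnary($unaryOp.i, $e1.e);
     * A String stands in for the asker's Expr type.
     */
    static String buildUnary(UnaryOpContext unaryOp, String atom) {
        if (unaryOp == null) {
            return atom; // optional sub-rule did not match: pass the atom through
        }
        return "ExprUnary(" + unaryOp.i + ", " + atom + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildUnary(null, "x"));                  // prints: x
        System.out.println(buildUnary(new UnaryOpContext(1), "x")); // prints: ExprUnary(1, x)
    }
}
```

The same pattern should carry over to the if-then-[else] rule, for example by passing $s2.ctx == null ? null : $s2.s to the StmtIf constructor, assuming StmtIf accepts a null else branch.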

ANTLR expects you to access an attribute. Try the text attribute instead: $unaryOp.text==null

Related

Not able to use for loop in ternary operator in arangodb

How do we write conditions in ArangoDB that include FOR loops? I elaborate on the requirement below.
My requirement is: if a particular attribute (of array type) exists in the Arango collection, I would read data from the collection (which requires a loop); otherwise, I might do one of the following:
return null
return empty string ""
do nothing.
Is this possible to achieve in arango?
The helper functions could be:
-- has(document, attributename)
-- the ternary operator ?:
let attribute1 = has(doc,"attribute1") ? (
    for name in doc.attribute1.names
        filter name.language == "xyz"
        return name.name
) : ""
But this doesn't work. It seems the ArangoDB compiler first attempts to evaluate the for loop, finds null, and reports the error below. Instead, it should have evaluated the "has" function first, given that the ternary operator is used.
collection or array expected as operand to FOR loop; you provided a value of type 'null' (while executing)
If there is a better way of doing it, would appreciate the advice!!
Thanks in advance!
Nilotpal
Fakhrany here from ArangoDB.
Regarding your question, this is a known limitation.
From https://www.arangodb.com/docs/3.8/aql/fundamentals-limitations.html:
The following other limitations are known for AQL queries:
Subqueries that are used inside expressions are pulled out of these
expressions and executed beforehand. That means that subqueries do not
participate in lazy evaluation of operands, for example in the ternary
operator. Also see evaluation of subqueries.
Also noted here for the ternary operator:
https://www.arangodb.com/docs/3.8/aql/operators.html#ternary-operator.
One answer to the question of what to do is to use a FILTER before enumerating over the attributes:
FOR doc IN collection
  /* the following filter only lets through documents in which "attribute1.names" is an array */
  FILTER IS_ARRAY(doc.attribute1.names)
  FOR name IN doc.attribute1.names
    FILTER name.language == "xyz"
    RETURN name.name
Other solutions are also possible. Depends a bit on the use case.

Formatting string in Powershell but only first or specific occurrence of replacement token

I have a regular expression that I use several times in a script, where a single word gets changed but the rest of the expression remains the same. Normally I handle this by just creating a regular expression string with a format like the following example:
# Simple regex looking for exact string match
$regexTemplate = '^{0}$'
# Later on...
$someString = 'hello'
$someString -match ( $regexTemplate -f 'hello' ) # ==> True
However, I've written a more complex expression where I need to insert a variable into the expression template and... well regex syntax and string formatting syntax begin to clash:
$regexTemplate = '(?<=^\w{2}-){0}(?=-\d$)'
$awsRegion = 'us-east-1'
$subRegion = 'east'
$awsRegion -match ( $regexTemplate -f $subRegion ) # ==> Error
Which results in the following error:
InvalidOperation: Error formatting a string: Index (zero based) must be greater than or equal to zero and less than the size of the argument list.
I know what the issue is: it's seeing one of my expression quantifiers as a replacement token. Rather than opt for a string-interpolation approach or replace {0} myself, is there a way I can tell PowerShell/.NET to only replace the 0-indexed token? Or is there another way to achieve the desired output using format strings?
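If a string template includes literal { and/or } characters, you need to double them so they do not interfere with the numbered placeholders; here the quantifier {2} becomes {{2}}.
Try:
$regexTemplate = '(?<=^\w{{2}}-){0}(?=-\d$)'
$awsRegion -match ( $regexTemplate -f $subRegion ) # ==> True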

ANTLR4: no rule index for labelled rules

For the grammar snippet from Java.g4,
statement
    : block                                              # blockStmt
    | 'if' parExpression statement ('else' statement)?   # ifStmt
    | 'for' '(' forControl ')' statement                 # forStmt
    | 'while' parExpression statement                    # whileStmt
    ;
All the alternatives are labelled.
I can get all StatementContext objects using this method
Trees.findAllRuleNodes(root, JavaParser.RULE_statement);
But if I am only interested in getting the IfStmtContext objects, how can I use the above method without resorting to something like this?
for(ParseTree tree : statementContextList)
{
    if(tree instanceof IfStmtContext)
    {
        //add to a list
    }
}
The generated JavaParser doesn't create rule indexes for labelled alternatives.
Do I have to customize the grammar in some way to make them indexed?
Or is there another way to do this?
My code should be fast, and I need to remove as many iterations and conditions as possible. I also need to get rid of the 'instanceof' checks as far as possible.

ANTLR4: Tree construction

I am extending the generated base listener class and am attempting to read in some values; however, there doesn't seem to be any hierarchy in the resulting tree.
A cut down version of my grammar is as follows:
start: config_options+ ;
config_options: (KEY) EQUALS^ (PATH | ALPHANUM) (' '|'\r'|'\n')* ;
KEY: 'key' ;
EQUALS: '=' ;
ALPHANUM: [0-9a-zA-Z]+ ;
However, the parse tree of this implementation is flat at the config_options level (terminal level), i.e. the rule start has many config_options children, but EQUALS is not the root of the config_options subtrees; all of the tokens have the rule config_options as their root node. How can I make one of the terminals a root node instead?
In this particular rule I don't want any of the spaces to be captured. I know that there is the -> skip directive for the lexer; however, there are some cases where I do want the space, e.g. in String: '"'(ALPHANUM|' ')'"'
(Note: the ^ does not seem to work)
An example input is:
key=abcdefg
key=90weata
key=acbefg9
All I want to do is extract the key and value pairs. I would expect that the '=' would be the root and the two children would be the key and the value.
When you generate your grammar, you should be getting a syntax error over the use of the ^ operator, which was removed in ANTLR 4. ANTLR 4 generates parse trees, the roots of which are implicitly defined by the rules in your grammar. In other words, for the grammar you gave above the parse tree nodes will be start and config_options.
The generated config_options rule will return an instance of Config_optionsContext, which contains the following methods:
KEY() returns a TerminalNode for the KEY token.
EQUALS() (same for the EQUALS token)
PATH() (same for the PATH token)
ALPHANUM() (same for the ALPHANUM token)
You can call getSymbol() on a TerminalNode to get the Token instance.

Using from {x}.field in DSL in Drools

I have the following Drools DSL "sentence":
[when]The field {field} in the module {module} contains value {value}=$a : {module} ( {field} != null)
String( this.equalsIgnoreCase("{value}") ) from $a.{field}
where the field is a Set of Strings.
Now, if I have two of these rules, it obviously won't work as the variable $a occurs twice. So I wanted to improve the rule to make the variable, well, variable:
[when]The field {field} in the module {module} contains value {value} as {a}={a} : {module} ( {field} != null)
String( this.equalsIgnoreCase("{value}") ) from {a}.{field}
This doesn't work, I can't use the part {a}., that breaks.
So, my questions are: Is there either a way to rewrite the rules or a way to allow the {variable}. notation to work? Alternatively, is there a contains operator that works case-insensitively?
After I subscribed to the Drools-Users mailing list, I got an answer:
http://drools.46999.n3.nabble.com/rules-users-Using-from-x-field-in-DSL-tt4017872.html
Summary: Bug in DSL parser, as a workaround add an extra letter after the variable on the RHS: ... as {a}={a}x (...) ... from {a}x.{field}
