Having problems with ANTLR4 in Python and getting an ErrorListener to work


I am declaring my error listener like this:
class GeneratorErrorListener(ErrorListener):
    def __init__(self, listener):
        super().__init__()
        self.listener = listener

    # The runtime looks this method up by its exact name, syntaxError.
    def syntaxError(self, recognizer, offendingSymbol, line, col, msg, e):
        log_it("Syntax error at line {} col {}: {}".format(line, col, msg))
I am not yet making use of the listener passed in, but will once this works.
I set it up like this:
...
# Set up new error listener
parser.removeErrorListeners()
parser.addErrorListener(GeneratorErrorListener(listener))
tree = parser.protocol()
...
walker.walk(listener, tree)
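For illustration, here is a minimal, self-contained sketch of a listener with the same shape that collects messages instead of logging them. CollectingErrorListener and the stand-in base class are hypothetical names for this sketch; the fallback exists only so it runs without the antlr4 runtime installed. The runtime calls the method by its exact name, syntaxError, so a misspelled method name would simply never be invoked.

```python
try:
    from antlr4.error.ErrorListener import ErrorListener
except ImportError:
    # Stand-in base class (assumption) so the sketch runs without antlr4 installed.
    class ErrorListener:
        def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
            pass


class CollectingErrorListener(ErrorListener):
    """Records every reported syntax error in a list."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        self.errors.append(
            "Syntax error at line {} col {}: {}".format(line, column, msg)
        )


error_listener = CollectingErrorListener()
# In real use the parser makes this call; here it is simulated by hand.
error_listener.syntaxError(None, None, 3, 0, "mismatched input '}'", None)
print(error_listener.errors)
```

Collecting the messages rather than printing them makes it easy to check after parsing whether anything was reported at all.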
Then I am testing it with some input that, as far as I can see, has a syntax error.
The grammar fragment is:
enumEltDecl : INT '=' ID ( ':' STRING )?
            | 'default' '=' STRING
            ;
enumDecl    : 'enum' ID ( ':' ID )? '{' enumEltDecl ( ',' enumEltDecl )* ( ',' )? '}' ;
I can parse input matching these rules fine. However, the following input, which I think should be a syntax error and which does cause parsing to stop, does not invoke the error listener:
emum some_emum:uint8 {
};
It should have at least one enumEltDecl.
Any thoughts on what I have done wrong? I have looked at the runtime code for the ErrorListener class and it seems straightforward.
More Information
The code is here: https://gitlab.com/realrichardsharpe/wireshark-generator-python
Use the following steps to see the issue:
cd src
./GenTool.py -t C ../test-data/syntax-error.proto
You will see the following output:
#include "config.h"
#include <epan/packet.h>
#include <epan/expert.h>
//Generating code for enum cmd_enum
enum cmd_enum {
CMD1 = 0x14;
CMD2 = 0x15;
CMD3 = 0x28;
CMD4 = 0x29;
CMD5 = 0x3C;
CMD5 = 0x3D;
};
//We have a uint8
static const range_string cmd_enum_rvals[] = {
{ 0, 19, "Reserved", }
{ 0x14, 0x14, "cmd1" },
{ 0x15, 0x15, "cmd2" },
{ 22, 39, "Reserved", }
{ 0x28, 0x28, "cmd3" },
{ 0x29, 0x29, "cmd4" },
{ 42, 59, "Reserved", }
{ 0x3C, 0x3C, "cmd5" },
{ 0x3D, 0x3D, "cmd6" },
{ 62. 255, "Reserved" },
};
And it stops without my ErrorListener being called. The ErrorListener is in GenTool.py.

Strangely, after a little rearrangement of the code and a lot of debugging, it now seems to work: with a different set of input data I get the following errors:
./GenTool.py -t C xxx.proto
line 3:0 mismatched input '}' expecting {'default', INT}
line 4:0 missing ';' at '<EOF>'
Syntax error on line 1
The first two lines are generated by my error listener.
UPDATE: With a little comparison, I discovered that my test case had a symbol in it that is not recognized by the grammar and things went off the rails at that point.
The real problem was that my grammar was incorrect. The first line should have been:
protocol : protoDecl+ EOF ;
In my original grammar the EOF was missing, which caused the parser to stop when it hit something that did not match the grammar.
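The effect of the missing EOF can be illustrated with a toy hand-written matcher. This is not ANTLR and not the generated parser; parse_items and the token names are made up for the sketch. It matches ITEM+ and only complains about leftover input when asked to require end of input, which is roughly the difference between protoDecl+ and protoDecl+ EOF.

```python
def parse_items(tokens, require_eof):
    """Match ITEM+ and, optionally, insist that all input was consumed.

    Returns (number of items matched, list of error messages).
    """
    pos = 0
    errors = []
    while pos < len(tokens) and tokens[pos] == "ITEM":
        pos += 1
    if pos == 0:
        errors.append("expected ITEM at position 0")
    elif require_eof and pos < len(tokens):
        errors.append(
            "extraneous input {!r} at position {}".format(tokens[pos], pos)
        )
    return pos, errors


tokens = ["ITEM", "ITEM", "BOGUS", "ITEM"]
without_eof = parse_items(tokens, require_eof=False)  # stops quietly after two items
with_eof = parse_items(tokens, require_eof=True)      # reports the unexpected token
print(without_eof)
print(with_eof)
```

Without the end-of-input check, the rule is satisfied as soon as it has matched at least one item, so stopping early is not an error from the matcher's point of view.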

Related

How can I parse nested source files with ANTLR4 - Trying one more time

I found the code (reproduced below) in an article by Terence Parr showing how INCLUDE files could be handled in ANTLR3 for Java. I tried to add this to a grammar I use with ANTLR4 (with a C++ target), but when I tried to generate a parser, I got these errors:
error(50): : syntax error: '^' came as a complete surprise to me
error(50): : syntax error: mismatched input '->' expecting SEMI while matching a rule
error(50): : syntax error: '^' came as a complete surprise to me
error(50): : syntax error: '(' came as a complete surprise to me while matching rule preamble
and I have no idea what these errors mean. Can anyone explain and perhaps show me the way forward?
(NB: I'm not wild about polluting the grammar file with code, I'm using the visitor pattern but I'll take it if I can!)
Thanks
include_filename :
    ('a'..'z' | 'A'..'Z' | '.' | '_')+
    ;
include_statement
@init { CommonTree includetree = null; }
    :
    'include' include_filename ';' {
        try {
            CharStream inputstream = null;
            inputstream = new ANTLRFileStream($include_filename.text);
            gramLexer innerlexer = new gramLexer(inputstream);
            gramParser innerparser = new gramParser(new CommonTokenStream(innerlexer));
            includetree = (CommonTree)(innerparser.program().getTree());
        } catch (Exception fnf) {
            ;
        }
    }
    -> ^('include' include_filename ^({includetree}))
    ;

Starting with ANTLR4 it is no longer possible to manipulate the generated parse tree with grammar rules. ANTLR3 generated an AST (abstract syntax tree), which is a subset of a parse tree (as generated by ANTLR4). That in turn means you cannot keep the tree rewrite syntax (the part starting with ->), which is what the '^' and '->' errors are complaining about. Hence you should change the code to:
include_statement
@init { CommonTree includetree = null; }
    :
    'include' include_filename ';' {
        try {
            CharStream inputstream = null;
            inputstream = new ANTLRFileStream($include_filename.text);
            gramLexer innerlexer = new gramLexer(inputstream);
            gramParser innerparser = new gramParser(new CommonTokenStream(innerlexer));
            includetree = (CommonTree)(innerparser.program().getTree());
        } catch (Exception fnf) {
            ;
        }
    }
    ;
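If you would rather keep action code out of the grammar entirely, one alternative is to resolve includes before the text reaches the lexer. The following is only a sketch of that different technique, not the approach from the article: expand_includes is a hypothetical helper, the sources dictionary stands in for reading files from disk, and the statement form "include name;" with names made of letters, '.' and '_' is taken from the rule in the question.

```python
import re

# One include statement per line, e.g. "include defs.inc;"
INCLUDE_RE = re.compile(r"^\s*include\s+([A-Za-z._]+)\s*;\s*$")


def expand_includes(name, sources, _active=()):
    """Return the lines of `name` with include statements replaced inline.

    `sources` is a hypothetical mapping of file name -> text.
    `_active` tracks the chain of files being expanded to detect cycles.
    """
    if name in _active:
        raise ValueError("circular include: " + " -> ".join(_active + (name,)))
    lines = []
    for line in sources[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            lines.extend(expand_includes(match.group(1), sources, _active + (name,)))
        else:
            lines.append(line)
    return lines


sources = {
    "main.src": "a = 1;\ninclude defs.inc;\nb = 2;",
    "defs.inc": "x = 10;",
}
expanded = expand_includes("main.src", sources)
print(expanded)
```

The trade-off is that line numbers in later error messages refer to the expanded text rather than to the original files, so a real implementation would need to keep a mapping back to the source locations.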

How can I configure flow.js to use comments when my eslint adds spacing in my function arguments?

My function is:
getConferenceNumberAndPin: (description = null /*: string */ , entryPoints = null /*: Array<object> */ ) => {
As you can see, eslint adds a space before the comma (*/ ,) as well as one before the ).
I am running eslint with --fix, so the spacing is added automatically. But now Flow complains:
Unexpected token ,, expected the token )
How can I get the two to play nicely together?

I think the issue with the code is the placement of the type comments. The documentation section "Function parameters with defaults" shows an example function declaration:
function method(value: string = "default") { /* ... */ }
Notice that the type comes before the default value. Therefore, in your example, the function declaration would look like this:
function getConferenceNumberAndPin(
  description: ?string = null,
  entryPoints: ?Array<Object> = null
) { /* ... */ }
And, using the comment syntax (with the function name shortened so it fits on one line):
function f(description /*: ?string */ = null, entryPoints /*: ?Array<Object> */ = null): void {}
The spacing before and after the commas and parentheses should not matter. You can play around with your example at Try Flow to experiment with the spacing that eslint would insert.

how can i iterate from a list to output value?

In Terraform 0.12, I am getting a list of values from a data source:
data "oci_core_instances" "test_instances" {
  # Required
  compartment_id      = "${var.compartment_ocid}"
  availability_domain = "${data.oci_identity_availability_domains.ads.availability_domains[0].name}"
}
// numInstances = 3 for my case
locals {
  numInstances = length(data.oci_core_instances.test_instances.instances)
}
I want to iterate over it like this (pseudocode):
# Output the result single element
output "format_instances_name_state" {
  value = "${
    for (i=0 ; i< 3; i++)
      format("%s=>%s",data.oci_core_instances.test_instances.instances[i].display_name,data.oci_core_instances.test_instances.instances[i].state)
  } "
}
How can I do this in Terraform?
I have tried this:
# Output the result single element
output "format_instances_name_state" {
  value = "${
    for i in local.numInstances :
      format("%s=>%s",data.oci_core_instances.test_instances.instances[i].display_name,data.oci_core_instances.test_instances.instances[i].state)
  } "
}
But I am getting this error:
Error: Extra characters after interpolation expression
on main.tf line 64, in output "format_instances_state_element_single":
63:
64: for i in local.numInstances :
Expected a closing brace to end the interpolation expression, but found extra
characters.
Any ideas?

It seems like what you really want here is a map from display name to state, in which case the following expression would produce that:
output "instance_states" {
  value = {
    for inst in data.oci_core_instances.test_instances.instances : inst.display_name => inst.state
  }
}
If you really do need that list of strings with => inside for some reason, you can adapt the above to get it, like this:
output "format_instances_state_element_single" {
  value = [
    for inst in data.oci_core_instances.test_instances.instances : "${inst.display_name}=>${inst.state}"
  ]
}
In this second case the for expression is marked by [ ] brackets instead of { } braces, which means it will produce a list result rather than a map result.
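If it helps to think of it in another language, the two for expressions behave much like a dictionary comprehension and a list comprehension. The Python sketch below is only an analogy; the instance names and states are made-up sample data, not output from the OCI provider.

```python
# Made-up sample data standing in for the data source's instances list.
instances = [
    {"display_name": "web-1", "state": "RUNNING"},
    {"display_name": "web-2", "state": "STOPPED"},
]

# Analogue of { for inst in ... : inst.display_name => inst.state }
instance_states = {inst["display_name"]: inst["state"] for inst in instances}

# Analogue of [ for inst in ... : "${inst.display_name}=>${inst.state}" ]
formatted = ["{}=>{}".format(inst["display_name"], inst["state"]) for inst in instances]

print(instance_states)
print(formatted)
```

In both cases the loop runs over the elements themselves, so no index variable or element count is needed.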

Class parameter syntax errors

I am trying to learn to write Puppet modules well, so I have started looking for tutorials and how-tos.
I have seen users suggest writing the main class in the following way, but it fails for me.
I am honestly a bit confused about how the two blocks between braces are connected, so I might be overlooking an obvious error or a missing comma.
I am running Puppet 3.8, by the way.
class icinga2 {
  $version = 'present'
  $enable = true
  $start = true
} {
  class{'icinga2::install': } ->
  class{'icinga2::config': } ~>
  class{'icinga2::service': } ->
  Class["icinga2"]
}
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Syntax error at '{'; expected '}' at /etc/puppet/modules/icinga2/manifests/init.pp:5

Your problem here is that class parameters must be surrounded by (), not {}. They must also be comma-separated.
class icinga2 (
  $version = 'present',
  $enable = true,
  $start = true,
) {
  class{'icinga2::install': } ->
  class{'icinga2::config': } ~>
  class{'icinga2::service': } ->
  Class["icinga2"]
}

flex/bison fixing memory leaks with unexpected tokens

I have a flex/bison application. For a few of my tokens, I copy out the yytext from flex using strdup. This works great except when there is an error caused by an unexpected token.
A simple example:
flex.l:
...
[a-zA-Z0-9]+ { lval.string = strdup(yytext); return IDENT; }
[\{\}]       { return yytext[0]; }
...
and
parse.y:
...
%destructor { free($$); } IDENT
%destructor { free($$->name); free($$->type); free($$); } tag
...
tag: IDENT '{' IDENT '}'
    {
        struct tag *mytag = malloc(sizeof(struct tag));
        mytag->name = $1;
        mytag->type = $3;
        $<tag>$ = mytag;
    }
...
Now suppose I hand it the input:
blah blah blah
The lexer will send up the first IDENT token, which gets pushed onto the stack. After the first token it's expecting a bracket token, but instead gets another IDENT token. This is a syntax error. The destructor will be called on the first IDENT token, but not on the second one (the unexpected one). I haven't been able to find a way to destruct the unexpected token. Does anyone know how I should do it?

I found that appropriate use of the 'error' token in bison prompts it to correctly call the destructor function. Go me!
parse.y:
...
%destructor { free($$); } IDENT
%destructor { free($$->name); free($$->type); free($$); } tag
...
tags: tag tags | error tags | ;
tag: IDENT '{' IDENT '}'
    {
        struct tag *mytag = malloc(sizeof(struct tag));
        mytag->name = $1;
        mytag->type = $3;
        $<tag>$ = mytag;
    }
...
