How can I parse nested source files with ANTLR4 - Trying one more time - antlr4

I found the code (reproduced below) in an article from Terence Parr showing how INCLUDE files could be handled in ANTLR3 for Java. I tried to add this to a grammar I use with ANTLR4 (with a C++ target), but when I tried to generate a parser I got these errors:
error(50): : syntax error: '^' came as a complete surprise to me
error(50): : syntax error: mismatched input '->' expecting SEMI while matching a rule
error(50): : syntax error: '^' came as a complete surprise to me
error(50): : syntax error: '(' came as a complete surprise to me while matching rule preamble
and I have no idea what these errors mean. Can anyone explain and perhaps show me the way forward?
(NB: I'm not wild about polluting the grammar file with code, I'm using the visitor pattern but I'll take it if I can!)
Thanks
include_filename :
('a'..'z' | 'A'..'Z' | '.' | '_')+
;
include_statement
@init { CommonTree includetree = null; }
    :
    'include' include_filename ';' {
        try {
            CharStream inputstream = null;
            inputstream = new ANTLRFileStream($include_filename.text);
            gramLexer innerlexer = new gramLexer(inputstream);
            gramParser innerparser = new gramParser(new CommonTokenStream(innerlexer));
            includetree = (CommonTree)(innerparser.program().getTree());
        } catch (Exception fnf) {
            ;
        }
    }
    -> ^('include' include_filename ^({includetree}))
    ;

Starting with ANTLR4, it is no longer possible to manipulate the generated parse tree with grammar rules. In fact, ANTLR3 generated an AST (abstract syntax tree), which is a subset of a parse tree (as generated by ANTLR4). That in turn means you cannot keep the tree rewrite syntax (the part starting with ->), which is what the errors about '^' and '->' are complaining about. Hence you should change the code to:
include_statement
@init { CommonTree includetree = null; }
    :
    'include' Include_filename ';' {
        try {
            CharStream inputstream = null;
            inputstream = new ANTLRFileStream($include_filename.text);
            gramLexer innerlexer = new gramLexer(inputstream);
            gramParser innerparser = new gramParser(new CommonTokenStream(innerlexer));
            includetree = (CommonTree)(innerparser.program().getTree());
        } catch (Exception fnf) {
            ;
        }
    }
    ;
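If you would rather keep code out of the grammar entirely, one alternative (a different technique from the embedded action above, and only a sketch) is to expand the include statements before the text ever reaches the lexer. The example below is in Python for brevity. The `include name;` line syntax, the `expand_includes` function and the in-memory `SOURCES` table are all assumptions made for illustration. The file name pattern mirrors the include_filename rule from the question.

```python
import re

# Assumed syntax: a whole line of the form   include some_name;
INCLUDE_RE = re.compile(r"^\s*include\s+([A-Za-z._]+)\s*;\s*$")


def expand_includes(name, read_source, _active=()):
    """Return the text of `name` with each include line replaced by the
    recursively expanded text of the source it names."""
    if name in _active:
        raise ValueError("circular include: " + " -> ".join(_active + (name,)))
    out = []
    for line in read_source(name).splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            out.append(expand_includes(match.group(1), read_source, _active + (name,)))
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical stand-in for the file system so the sketch is self-contained.
SOURCES = {
    "main.src": "a = 1;\ninclude defs.inc;\nb = 2;",
    "defs.inc": "x = 10;\ninclude more.inc;",
    "more.inc": "y = 20;",
}

expanded = expand_includes("main.src", SOURCES.__getitem__)
print(expanded)
```

The expanded text can then be handed to the generated lexer as a single input. The trade-off is that line numbers in error messages refer to the expanded text rather than to the original files, unless you keep a mapping yourself.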


Having problems with ANTLR4 in Python and getting an ErrorListener to work

I am declaring my error listener like this:
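from antlr4.error.ErrorListener import ErrorListener
class GeneratorErrorListener(ErrorListener):
    def __init__(self, listener):
        super().__init__()
        self.listener = listener
    # The runtime calls a method named syntaxError; any other spelling is never invoked.
    def syntaxError(self, recognizer, offendingSymbol, line, col, msg, e):
        log_it("Syntax error at line {} col {}: {}".format(line, col, msg))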
I am not yet making use of the listener passed in, but will once I get it working.
I am setting it up like this:
...
# Set up new error listener
parser.removeErrorListeners()
parser.addErrorListener(GeneratorErrorListener(listener))
tree = parser.protocol()
...
walker.walk(listener, tree)
Then I am testing it with some input that has a syntax error (AFAICS):
The grammar fragment is:
enumEltDecl : INT '=' ID ( ':' STRING)?
| 'default' '=' STRING
;
enumDecl: 'enum' ID ( ':' ID )? '{' enumEltDecl (',' enumEltDecl )*
(',')? '}' ;
and I can parse those things fine. However, the following input which I think should be a syntax error, and does cause parsing to stop, does not invoke the error listener:
emum some_emum:uint8 {
};
It should have at least one enumEltDecl.
Any thoughts on what I have done wrong? I have looked at the runtime code for the ErrorListener class and it seems straightforward.
More Information
The code is here: https://gitlab.com/realrichardsharpe/wireshark-generator-python
Use the following steps to see the issue:
cd src
./GenTool.py -t C ../test-data/syntax-error.proto
You will see the following output:
#include "config.h"
#include <epan/packet.h>
#include <epan/expert.h>
//Generating code for enum cmd_enum
enum cmd_enum {
CMD1 = 0x14;
CMD2 = 0x15;
CMD3 = 0x28;
CMD4 = 0x29;
CMD5 = 0x3C;
CMD5 = 0x3D;
};
//We have a uint8
static const range_string cmd_enum_rvals[] = {
{ 0, 19, "Reserved", }
{ 0x14, 0x14, "cmd1" },
{ 0x15, 0x15, "cmd2" },
{ 22, 39, "Reserved", }
{ 0x28, 0x28, "cmd3" },
{ 0x29, 0x29, "cmd4" },
{ 42, 59, "Reserved", }
{ 0x3C, 0x3C, "cmd5" },
{ 0x3D, 0x3D, "cmd6" },
{ 62. 255, "Reserved" },
};
And it stops without my ErrorListener being called. The ErrorListener is in GenTool.py.
Strangely, with a little rearrangement of the code and after lots of debugging it now seems to be working because I get the following errors with a different set of input data:
./GenTool.py -t C xxx.proto
line 3:0 mismatched input '}' expecting {'default', INT}
line 4:0 missing ';' at '<EOF>'
Syntax error on line 1
The first two lines are generated by my error listener.
UPDATE: With a little comparison, I discovered that my test case had a symbol in it that is not recognized by the grammar and things went off the rails at that point.
The real problem was that my grammar was incorrect. The first line should have been:
protocol : protoDecl+ EOF ;
In my original grammar the EOF was missing, which caused the parser to stop silently when it hit something that did not match the grammar.
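To illustrate why the missing EOF mattered, here is a toy model in plain Python. It is not ANTLR and does not use the ANTLR runtime: a protoDecl is reduced to the single token "decl", and `parse_protocol` is a made-up function. Real ANTLR error recovery is more involved, but the effect of the trailing EOF is the same in spirit.

```python
def parse_protocol(tokens, require_eof):
    """Toy stand-in for `protocol : protoDecl+ ;` (require_eof=False)
    and `protocol : protoDecl+ EOF ;` (require_eof=True).

    A protoDecl is modelled as the single token "decl".
    Returns (number of declarations matched, list of error messages).
    """
    errors = []
    pos = 0
    # Match as many declarations as possible.
    while pos < len(tokens) and tokens[pos] == "decl":
        pos += 1
    if pos == 0:
        errors.append("expected at least one decl")
    elif require_eof and pos < len(tokens):
        # Only a rule that demands EOF notices leftover input.
        errors.append(f"token {pos}: unexpected {tokens[pos]!r}, expecting EOF")
    return pos, errors


tokens = ["decl", "decl", "emum", "decl"]
without_eof = parse_protocol(tokens, require_eof=False)
with_eof = parse_protocol(tokens, require_eof=True)
print(without_eof)  # stops quietly after two declarations
print(with_eof)     # reports the stray token
```

Without the EOF requirement the rule is satisfied by the first two declarations, so nothing is reported and the rest of the input is simply ignored.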

Class parameter syntax errors

I am trying to learn to write Puppet modules in a good way, so I've started looking around for tutorials and how-tos.
I've seen that users suggest writing the main class in the following way, but it's actually failing for me.
I am honestly a bit confused about how the two blocks between brackets are actually connected, so I might not be seeing an obvious error or a missing comma.
I am running Puppet 3.8, by the way.
class icinga2 {
$version = 'present'
$enable = true
$start = true
} {
class{'icinga2::install': } ->
class{'icinga2::config': } ~>
class{'icinga2::service': } ->
Class["icinga2"]
}
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Syntax error at '{'; expected '}' at /etc/puppet/modules/icinga2/manifests/init.pp:5
Your problem here is that your parameters must be surrounded by (), not {}. Also, they should be comma-separated.
class icinga2 (
$version = 'present',
$enable = true,
$start = true,
) {
class{'icinga2::install': } ->
class{'icinga2::config': } ~>
class{'icinga2::service': } ->
Class["icinga2"]
}

Using getAttribute to get the class name of a webelement in Native context

I went through the Javadoc of getAttribute but couldn't understand this point:
Finally, the following commonly mis-capitalized attribute/property
names are evaluated as expected: "class" "readonly"
Could someone confirm whether webElement.getAttribute("class") should return the class name of the element or not?
Edit: On trying this myself with
System.out.println("element " + webElement.getAttribute("class"));
I am getting
org.openqa.selenium.NoSuchElementException
Note: The element does exist on the screen, as I can perform actions on it successfully:
webElement.click(); //runs successfully
Code:
WebElement webElement = <findElement using some locator strategy>;
System.out.println("element " + webElement.getAttribute("class"));
The problem was answered on GitHub in the issues list of appium/java-client by @SergeyTikhomirov. The simple solution is to access the className property as follows:
webElement.getAttribute("className"); // instead of 'class' as mentioned in the doc
The core implementation of the method is in AndroidElement.
According to this answer, yes you are doing it right. Your org.openqa.selenium.NoSuchElementException is thrown because selenium can't find the element itself.
The side note you posted, about webElement.click() actually working, is unfortunately not included in the code you posted. Since it is not part of the actual question, I leave this answer without addressing it.
public String getStringAttribute(final String attr)
        throws UiObjectNotFoundException, NoAttributeFoundException {
    String res;
    if (attr.equals("name")) {
        res = getContentDesc();
        if (res.equals("")) {
            res = getText();
        }
    } else if (attr.equals("contentDescription")) {
        res = getContentDesc();
    } else if (attr.equals("text")) {
        res = getText();
    } else if (attr.equals("className")) {
        res = getClassName();
    } else if (attr.equals("resourceId")) {
        res = getResourceId();
    } else {
        throw new NoAttributeFoundException(attr);
    }
    return res;
}

__LINE__ feature in Groovy

It is possible to get the current line number with __LINE__ in Ruby or Perl.
For example:
print "filename: #{__FILE__}, line: #{__LINE__}"
Is there the same feature in Groovy?
Not directly, but you can get it through an Exception (or Throwable) stack trace. For example:
StackTraceElement getStackFrame(String debugMethodName) {
    def ignorePackages = [
        'sun.',
        'java.lang',
        'org.codehaus',
        'groovy.lang'
    ]
    StackTraceElement frame = null
    Throwable t = new Throwable()
    t.stackTrace.eachWithIndex { StackTraceElement stElement, int index ->
        if (stElement.methodName.contains(debugMethodName)) {
            int callerIndex = index + 1
            while (t.stackTrace[callerIndex].isNativeMethod() ||
                    ignorePackages.any { String packageName ->
                        t.stackTrace[callerIndex].className.startsWith(packageName)
                    }) {
                callerIndex++
            }
            frame = t.stackTrace[callerIndex]
            return
        }
    }
    frame
}
int getLineNumber() {
    getStackFrame('getLineNumber')?.lineNumber ?: -1
}
String getFileName() {
    getStackFrame('getFileName')?.fileName
}
String getMethodName() {
    getStackFrame('getMethodName')?.methodName
}
def foo() {
    println "looking at $fileName:$lineNumber ($methodName)"
}
foo()
// ==> looking at test.groovy:39 (foo)
A word of caution though: getting the line number, file name, or method like this is very slow.
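For comparison only, the same idea of asking the runtime for the caller's stack frame can be sketched in Python rather than Groovy. The helper name `caller_location` is made up for this example, and `inspect.currentframe()` is a CPython facility that may return None on other implementations.

```python
import inspect


def caller_location():
    """Return (file name, line number, function name) of the caller."""
    # currentframe() is this function's frame; f_back is the caller's frame.
    frame = inspect.currentframe().f_back
    return frame.f_code.co_filename, frame.f_lineno, frame.f_code.co_name


def foo():
    return caller_location()


location = foo()
print("looking at {}:{} ({})".format(*location))
```

As with the Groovy version, this walks live stack data, so it is better suited to debugging output than to hot code paths.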
I'm not an expert in Groovy, but I don't think so. I know that Java and C# don't have it.
The __LINE__ feature really started to help with debugging in C. C doesn't have exceptions or many of the other features modern languages have, but it did have macros that the compiler could expand anywhere in the code, which is why we needed __FILE__, __LINE__, etc. to let us know where we were when something bad happened. This is how assert works in C and C++. The JVM has very good debugging tools, and combined with assert and exceptions, you can very easily pinpoint where something went wrong (stack traces are way better than just a line number anyway).
I believe the reason Ruby and Perl have those macros is because they were created by C hackers. I've never used either of those languages enough to know the level of debugging support or how useful the macros really are.

flex/bison fixing memory leaks with unexpected tokens

I have a flex/bison application. For a few of my tokens, I copy out the yytext from flex using strdup. This works great except when there is an error caused by an unexpected token.
Simple example:
flex.l:
...
[a-zA-Z0-9]+ { lval.string = strdup(yytext); return IDENT; }
[\{\}] { return yytext[0]; }
...
and
parse.y
...
%destructor { free($$); } IDENT
%destructor { free($$->name); free($$->type); free($$); } tag
...
tag: IDENT '{' IDENT '}'
{
struct tag *mytag = malloc(sizeof(struct tag));
mytag->name = $1;
mytag->type = $3;
$<tag>$ = mytag;
}
...
Now suppose I hand it the input:
blah blah blah
The lexer will send up the first IDENT token, which gets pushed onto the stack. After the first token it's expecting a bracket token, but instead gets another IDENT token. This is a syntax error. The destructor will be called on the first IDENT token, but not on the second one (the unexpected one). I haven't been able to find a way to destruct the unexpected token. Does anyone know how I should do it?
I found that appropriate use of the 'error' token in bison prompts it to correctly call the destructor function. Go me!
parse.y
...
%destructor { free($$); } IDENT
%destructor { free($$->name); free($$->type); free($$); } tag
...
tags: tag tags | error tags | ;
tag: IDENT '{' IDENT '}'
{
struct tag *mytag = malloc(sizeof(struct tag));
mytag->name = $1;
mytag->type = $3;
$<tag>$ = mytag;
}
...
