Antlr4 c++ target looks like java - antlr4

I generated a C++ Python parser with the ANTLR4 C++ target, but when I try to use it I get the following error:
Python3Lexer.h:48:5: error: stray ‘@’ in program
This Python3Lexer.h (generated with the ANTLR4 C++ target) does not look right.
The error line is @Override, which is a Java annotation, not C++!
Do you know what I am doing wrong?
Here is what this Python3Lexer.h looks like:
#include "antlr4-runtime.h"
class Python3Lexer : public antlr4::Lexer {
public:
(...)
Python3Lexer(antlr4::CharStream *input);
~Python3Lexer();
// A queue where extra tokens are pushed on (see the NEWLINE lexer rule).
private java.util.LinkedList<Token> tokens = new java.util.LinkedList<>();
// The stack that keeps track of the indentation level.
private java.util.Stack<Integer> indents = new java.util.Stack<>();
// The amount of opened braces, brackets and parenthesis.
private int opened = 0;
// The most recently produced token.
private Token lastToken = null;
@Override
public void emit(Token t) {
super.setToken(t);
tokens.offer(t);
}
@Override
public Token nextToken() {
(...)

If you look inside the grammar file you're using (which you've linked in a comment), you'll see that it contains Java code. In order to use this grammar in C++, you'll first have to translate that Java code to C++.

Related

eliminating embedded actions from antlr4 grammar

I have an antlr grammar in which embedded actions are used to collect data bottom up and build aggregated data structures. A short version is given below, where the aggregated data structures are only printed (ie no classes are created for them in this short sample code).
grammar Sample;
top returns [ArrayList l]
@init { $l = new ArrayList<String>(); }
: (mid { $l.add($mid.s); } )* ;
mid returns [String s]
: i1=identifier 'hello' i2=identifier
{ $s = $i1.s + " bye " + $i2.s; }
;
identifier returns [String s]
: ID { $s = $ID.getText(); } ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
Its corresponding Main program is:
public class Main {
public static void main( String[] args) throws Exception
{
SampleLexer lexer = new SampleLexer( new ANTLRFileStream(args[0]));
CommonTokenStream tokens = new CommonTokenStream( lexer );
SampleParser parser = new SampleParser( tokens );
ArrayList<String> top = parser.top().l;
System.out.println(top);
}
}
And a sample test is:
aaa hello bbb
xyz hello pqr
Since one of the objectives of ANTLR is to keep the grammar file reusable and action-independent, I am trying to delete the actions from this file and move them to a tree walker. I took a first stab at it with the following code:
public class Main {
public static void main( String[] args) throws Exception
{
SampleLexer lexer = new SampleLexer( new ANTLRFileStream(args[0]));
CommonTokenStream tokens = new CommonTokenStream( lexer );
SampleParser parser = new SampleParser( tokens );
ParseTree tree = parser.top();
ParseTreeWalker walker = new ParseTreeWalker();
walker.walk( new Walker(), tree );
}
}
public class Walker extends SampleBaseListener {
public void exitTop(SampleParser.TopContext ctx ) {
System.out.println( "Exit Top : " + ctx.mid() );
}
public String exitMid(SampleParser.MidContext ctx ) {
return ctx.identifier() + " bye "; // ignoring the 2nd instance here
}
public String exitIdentifier(SampleParser.IdentifierContext ctx ) {
return ctx.ID().getText() ;
}
}
But obviously this is wrong because, at the very least, the return types of the Walker methods should be void, so they don't have a way to return aggregated values upstream. Secondly, I don't see how to access "i1" and "i2" from the walker code, so I am not able to differentiate between the two instances of "identifier" in that rule.
Any suggestions on how to separate the actions from the grammar for this purpose?
Should I use a visitor instead of a listener here, since the visitor has the capability of returning values? If I use a visitor, how do I solve the problem of differentiating between "i1" and "i2" (as mentioned above)?
Does a visitor perform its action only at the exit of a rule (unlike the listeners, which exist for both entry and exit)? For example, if I have to initialize the list at the entry of rule "top", how can I do it with a visitor, which executes only at the conclusion of a rule? Do I need an enterTop listener for that purpose?
EDIT: After the initial post, I have modified the rule "top" to create and return a list, and pass this list back to the main program for printing. This is to illustrate why I need an initialization mechanism for the code.
Based on what you are trying to do I think you may benefit from using ANTLR's BaseVisitor Class rather than the BaseListener Class.
Assuming your grammar is this (I generalized it and I'll explain the changes below):
grammar Sample;
top : mid* ;
mid : i1=identifier 'hello' i2=identifier ;
identifier : ID ;
ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
Then your Walker would look like this:
public class Walker extends SampleBaseVisitor<Object> {
public ArrayList<String> visitTop(SampleParser.TopContext ctx) {
ArrayList<String> arrayList = new ArrayList<>();
for (SampleParser.MidContext midCtx : ctx.mid()) {
arrayList.add(visitMid(midCtx));
}
return arrayList;
}
public String visitMid(SampleParser.MidContext ctx) {
return visitIdentifier(ctx.i1) + " bye " + visitIdentifier(ctx.i2);
}
public String visitIdentifier(SampleParser.IdentifierContext ctx) {
return ctx.getText();
}
}
This allows you to visit and get the result of any rule you want.
You are able to access i1 and i2, as you labeled them through the visitor methods. Note that you don't really need the identifier rule since it contains only one token and you can access a token's text directly in the visitMid, but really it's personal preference.
You should also note that SampleBaseVisitor is a generic class, where the generic parameter determines the return type of the visit methods. For your example I set the generic parameter Object, but you could even make your own class which contains the information you want to preserve and use that for your generic parameter.
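To make the point about the generic parameter concrete, here is a minimal sketch that does not depend on ANTLR at all. The node classes (TopNode, MidNode, IdentifierNode), the NodeVisitor<T> base class and the Lines result class are all hypothetical stand-ins for the generated contexts, SampleBaseVisitor<T> and "your own class"; only the shape of the code is meant to carry over. It also shows that any setup you would have done on entry to a rule can simply go at the start of the corresponding visit method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the ANTLR-generated context classes.
final class IdentifierNode {
    final String text;
    IdentifierNode(String text) { this.text = text; }
}

final class MidNode {
    final IdentifierNode i1; // mirrors the i1= label in the grammar
    final IdentifierNode i2; // mirrors the i2= label in the grammar
    MidNode(IdentifierNode i1, IdentifierNode i2) { this.i1 = i1; this.i2 = i2; }
}

final class TopNode {
    final List<MidNode> mids;
    TopNode(List<MidNode> mids) { this.mids = mids; }
}

// Hypothetical base class playing the role of SampleBaseVisitor<T>:
// the type parameter fixes the return type of every visit method.
abstract class NodeVisitor<T> {
    abstract T visitTop(TopNode ctx);
    abstract T visitMid(MidNode ctx);
    abstract T visitIdentifier(IdentifierNode ctx);
}

// A custom result type used as the generic parameter instead of Object.
final class Lines {
    final List<String> values = new ArrayList<>();
    static Lines of(String s) {
        Lines lines = new Lines();
        lines.values.add(s);
        return lines;
    }
}

class LinesVisitor extends NodeVisitor<Lines> {
    @Override
    Lines visitTop(TopNode ctx) {
        Lines all = new Lines(); // "entry" work happens before visiting children
        for (MidNode mid : ctx.mids) {
            all.values.addAll(visitMid(mid).values);
        }
        return all;
    }

    @Override
    Lines visitMid(MidNode ctx) {
        String first = visitIdentifier(ctx.i1).values.get(0);
        String second = visitIdentifier(ctx.i2).values.get(0);
        return Lines.of(first + " bye " + second);
    }

    @Override
    Lines visitIdentifier(IdentifierNode ctx) {
        return Lines.of(ctx.text);
    }
}
```

With a real grammar you would extend the generated SampleBaseVisitor<Lines> instead of the hand-written NodeVisitor<Lines>, and the labeled children would come from the generated MidContext.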
The generated BaseVisitor also inherits some useful methods from AbstractParseTreeVisitor, such as visitChildren, defaultResult and aggregateResult, which may help you out.
Lastly, your main method would end up looking something like this:
public static void main( String[] args) throws IOException {
FileInputStream fileInputStream = new FileInputStream(args[0]);
SampleLexer lexer = new SampleLexer(CharStreams.fromStream(fileInputStream));
CommonTokenStream tokens = new CommonTokenStream(lexer);
SampleParser parser = new SampleParser(tokens);
for (String string : new Walker().visitTop(parser.top())) {
System.out.println(string);
}
}
As a side note, the ANTLRFileStream class is deprecated in ANTLR4.
It is recommended to use CharStreams instead.
As Terence Parr points out in the Definitive Reference, one main difference between Visitor and Listener is that the Visitor can return values. And that can be convenient. But Listener has a place too! What I do for listener is exemplified in this answer. Granted, there are simpler ways of parsing a list of numbers, but I made that answer to show a complete and working example of how to aggregate return values from a listener into a public data structure that can be consumed later.
public class ValuesListener : ValuesBaseListener
{
public List<double> doubles = new List<double>(); // <<=== SEE HERE
public override void ExitNumber(ValuesParser.NumberContext context)
{
doubles.Add(Convert.ToDouble(context.GetChild(0).GetText()));
}
}
Looking closely at the Listener class, I include a public data collection -- a List<double> in this case -- to collect values parsed or calculated in the listener events. You can use any data structure you like: another custom class, a list, a queue, a stack (great for calculations and expression evaluation), whatever you like.
So while the Visitor is arguably more flexible, the Listener is a strong contender too, depending on how you want to aggregate your results.
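The same listener pattern can be sketched in Java for the "hello" grammar from the question. This is only an illustration under stated assumptions: PairListener is a hypothetical interface shaped like a generated listener (all callbacks return void), and ListenerSketch.walk is a hand-written stand-in for ParseTreeWalker that fires the events in the order a depth-first walk would. The listener keeps a stack of pending identifier texts and a public result list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical listener interface; like ANTLR listeners, callbacks return void.
interface PairListener {
    void enterTop();
    void exitIdentifier(String text);
    void exitMid();
}

class CollectingListener implements PairListener {
    // Public result, consumed by the caller after the walk has finished.
    final List<String> results = new ArrayList<>();
    // Values produced by child rules that the parent rule has not used yet.
    private final Deque<String> pending = new ArrayDeque<>();

    @Override
    public void enterTop() {
        // Initialisation that an @init action would otherwise perform.
        results.clear();
        pending.clear();
    }

    @Override
    public void exitIdentifier(String text) {
        pending.push(text);
    }

    @Override
    public void exitMid() {
        // Children exit in source order, so the second identifier is on top.
        String second = pending.pop();
        String first = pending.pop();
        results.add(first + " bye " + second);
    }
}

class ListenerSketch {
    // Stand-in for ParseTreeWalker: each pair represents one "mid" rule.
    static void walk(String[][] pairs, PairListener listener) {
        listener.enterTop();
        for (String[] pair : pairs) {
            listener.exitIdentifier(pair[0]);
            listener.exitIdentifier(pair[1]);
            listener.exitMid();
        }
    }
}
```

The stack is what lets a void-returning listener tell the two identifiers apart: the order in which they were pushed matches their order in the rule.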

Compilation customizer is not called during compilation of groovy DSL script

I want to write a groovy DSL with syntax:
returnValue when booleanCondition
I want to use compilation customizers to transform this expression to a typical if return statement using AST transformations.
For this script:
2 when 1 == 1
I get exception message:
MultipleCompilationErrorsException: startup failed:
Script1.groovy: 1: expecting EOF, found '1' @ line 1, column 8.
I don't understand why my compilation customizer is not called at all?
I need it to be called before compilation so I can make it into a valid groovy code.
If the script contains valid groovy code, then my compilation customizer is called.
My code:
class MyDslTest {
public static void main(String[] args) {
String script = '''2 when 1 == 1
'''
def compilerConfig = new CompilerConfiguration()
compilerConfig.addCompilationCustomizers(new MyCompilationCustomizer())
GroovyShell groovyShell = new GroovyShell(compilerConfig)
groovyShell.evaluate(script)
}
}
class MyCompilationCustomizer extends CompilationCustomizer {
MyCompilationCustomizer() {
super(CompilePhase.CONVERSION)
}
@Override
void call(SourceUnit source, GeneratorContext context, ClassNode classNode) throws CompilationFailedException {
println 'in compilation customizer'
}
}
The problem is that your code is not syntactically valid. A compilation customizer will not work around that: to get an AST, which is what the customizer operates on, the source must first parse as syntactically correct Groovy. One option is to use a different AntlrParserPlugin, but in general I don't recommend doing so, because it modifies the sources before parsing and therefore creates a mismatch between the AST and the actual source.
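For illustration only, the "modify the sources before parsing" idea can be reduced to a plain text rewrite applied to the script before it is handed to GroovyShell. The sketch below is not a parser plugin and uses no Groovy API; WhenRewriter and its methods are hypothetical names, and the regular expression is deliberately naive (it would also rewrite a "when" inside a string literal). It carries the same drawback described above: line and column information in later errors refers to the rewritten text, not to what the user wrote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhenRewriter {
    // Matches "<value> when <condition>" on a single line.
    private static final Pattern WHEN =
            Pattern.compile("^\\s*(.+?)\\s+when\\s+(.+?)\\s*$");

    // Rewrites one line; lines that do not match are returned unchanged.
    static String rewriteLine(String line) {
        Matcher m = WHEN.matcher(line);
        if (!m.matches()) {
            return line;
        }
        return "if (" + m.group(2) + ") return " + m.group(1);
    }

    // Rewrites a whole script line by line, preserving line breaks.
    static String rewrite(String script) {
        String[] lines = script.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append(rewriteLine(lines[i]));
        }
        return out.toString();
    }
}
```

The rewritten string is ordinary Groovy, so it parses and a compilation customizer registered on the shell would then be invoked as usual.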

How are enums defined in Preon?

I am trying to use the preon I compiled from github (v 1.1) to parse the messages I get from an embedded C++ application. I included antlr 3.3-complete version in my project. I defined the following class as a header for network messages:
public class Header {
@BoundNumber(byteOrder = org.codehaus.preon.buffer.ByteOrder.BigEndian)
public MessageType MsgType;
@BoundNumber(byteOrder = org.codehaus.preon.buffer.ByteOrder.BigEndian)
public int MsgNo;
@BoundNumber(byteOrder = org.codehaus.preon.buffer.ByteOrder.BigEndian)
public int RspNo;
@BoundNumber(byteOrder = org.codehaus.preon.buffer.ByteOrder.BigEndian)
public int Length;
public int Length;
}
MessageType enum is as follows:
public enum MessageType{
@BoundEnumOption(0x0000) Dummy1,
@BoundEnumOption(0x0001) Dummy2
}
I try to decode the received network buffer as follows:
Codec<Header> headerCodec = Codecs.create(Header.class);
Header h = Codecs.decode(headerCodec, headerData);
System.out.println(h);
I get the following antlr error. Is there something wrong with my definitions, or my included packages?
line 1:0 no viable alternative at input '<EOF>'
Thanks
I found the problem. It seems that for enumerations you have to explicitly provide a size value in the BoundNumber annotation, as follows:
@BoundNumber(byteOrder = ByteOrder.BigEndian, size="32")
public MessageType MsgType;
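One plausible reading of why the size is needed: an int field has a natural width of 32 bits, but an enum has no intrinsic width, so the number of bits to read has to be stated. The sketch below does not use Preon; it only shows, with java.nio.ByteBuffer, what a 32-bit big-endian read followed by a lookup amounts to. SketchMessageType, its wireValue field and HeaderSketch are hypothetical names chosen for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical enum mirroring MessageType, with the wire value stored explicitly.
enum SketchMessageType {
    DUMMY1(0x0000),
    DUMMY2(0x0001);

    final int wireValue;

    SketchMessageType(int wireValue) {
        this.wireValue = wireValue;
    }

    // Maps a decoded integer back to the matching constant.
    static SketchMessageType fromWire(int value) {
        for (SketchMessageType t : values()) {
            if (t.wireValue == value) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown message type: " + value);
    }
}

class HeaderSketch {
    // Reads the first 4 bytes as a big-endian 32-bit integer and maps it to the enum.
    static SketchMessageType readType(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        return SketchMessageType.fromWire(buf.getInt());
    }
}
```

If the embedded application sent the type as 16 bits instead, the read would have to change accordingly, which is the decision the size attribute makes explicit.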

Can I use something like DebuggerTypeProxyAttribute on a type that I don't own?

I've got an IClaimsPrincipal variable, and I'd like to see how many claims are in it. Navigating through the properties in the watch window is complicated, so I'd like to customize how this object is displayed.
I'm aware of the [DebuggerTypeProxy] attribute, which initially looked like it might do what I want. Unfortunately, it needs to be attached to the class, and I don't "own" the class. In this case it's a Microsoft.IdentityModel.Claims.ClaimsPrincipal.
I'd like to display the value of IClaimsPrincipal.Identities[0].Claims.Count.
Is there any way, using [DebuggerTypeProxy] or similar, to customize how the value of a type that I don't own is displayed in the watch window?
Example of DebuggerTypeProxyAttribute applied to KeyValuePair showing only the Value member:
using System.Collections.Generic;
using System.Diagnostics;
[assembly: DebuggerTypeProxy(typeof(ConsoleApp2.KeyValuePairDebuggerTypeProxy<,>), Target = typeof(KeyValuePair<,>))]
// alternative format [assembly: DebuggerTypeProxy(typeof(ConsoleApp2.KeyValuePairDebuggerTypeProxy<,>), TargetTypeName = "System.Collections.Generic.KeyValuePair`2")]
namespace ConsoleApp2
{
class KeyValuePairDebuggerTypeProxy<TKey, TValue>
{
private KeyValuePair<TKey, TValue> _keyValuePair; // being non-public, this member is hidden
//public TKey Key => _keyValuePair.Key;
public TValue Value => _keyValuePair.Value;
public KeyValuePairDebuggerTypeProxy(KeyValuePair<TKey, TValue> keyValuePair)
{
_keyValuePair = keyValuePair;
}
}
class Program
{
static void Main(string[] args)
{
var dictionary = new Dictionary<int, string>() { [1] = "one", [2] = "two" };
Debugger.Break();
}
}
}
Tested on Visual Studio 2017
The best I've come up with so far is to call a method:
public static class DebuggerDisplays
{
public static int ClaimsPrincipal(IClaimsPrincipal claimsPrincipal)
{
return claimsPrincipal.Identities[0].Claims.Count;
}
}
...from the watch window:
DebuggerDisplays.ClaimsPrincipal(_thePrincipal),ac = 10
The ",ac" suffix suppresses the "This expression causes side effects and will not be evaluated" message.
However, note that when this goes out of scope, Visual Studio will simply grey out the watch window entry, even with the ",ac". To avoid this, you'll need to ensure that everything is fully qualified, which means that you'll end up with extremely long expressions in the watch window.

Iterating over a type's members at compile time

Is there any statically typed, strongly typed compiled language that provides a way to iterate over a type's members at compile time and generate templated code for each one? For example, it could be something like:
// in pseudo-C#
public static void AddParameter(string parameterName, object value) { /* ... */ }
public static void AddParameters<T>(T parameters) {
// Of course, the memberof(T), membersof(T), membername(<member>)
// and membervalue(<member>, object) operators would be valid
// inside a "compile for" block only
compile for (memberof(T) member in membersof(T))
AddParameter(membername(member), membervalue(member, parameters));
/* If this were actual C#, the "compile for" block could even have a where clause */
}
So, if the following call was made:
StaticClass.AddParameters(new { UserID = "eleon", Password = "Gu3$$17" });
Then that particular instantiation of AddParameters would be unrolled to
public static void AddParameters(InternalNameOfTheAnonymousType parameters) {
AddParameter("UserID", parameters.UserID);
AddParameter("Password", parameters.Password);
}
At compile-time (if it were actual C# at IL-to-native compile time)
You can do it with Nemerle.
The syntactic briars are thick here, so it's hard for me to see what you're getting at, but I think Haskell's Scrap Your Boilerplate might be powerful enough to do the trick. It certainly is capable of some amazing compile-time generic metaprogramming.

Resources