I have written my grammar in ANTLR4 syntax. Then I set up code generation, and now I can parse source files in my own language. This works great!
The next step I took is to create an object model from the expression tree. This is also working well.
However, now I want to generate an expression from my object model.
Can I generate code using the API of the generated parser objects? Obviously, I can write methods that build strings by hand. But I want to use a generated API based on the grammar to achieve some level of type safety and to detect errors when I make a grammar change.
I'm using the latest ANTLR4 release, 4.7.1.
There's no generated solution. You have to wire this all up manually.
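To give an idea of what "wiring this up manually" can look like, here is a minimal sketch of a hand-written unparser. The object model (Name, Number, BinaryOp) and the PRECEDENCE table are hypothetical and not derived from any grammar, which is exactly the drawback: they have to be kept in sync with the grammar by hand.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Name:
    ident: str


@dataclass
class Number:
    value: int


@dataclass
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Name, Number, BinaryOp]

# Hypothetical precedence table; must mirror the grammar's operator rules.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def unparse(node: Expr, parent_prec: int = 0, is_right: bool = False) -> str:
    """Render a node back to source text, parenthesizing only where needed."""
    if isinstance(node, Name):
        return node.ident
    if isinstance(node, Number):
        return str(node.value)
    if isinstance(node, BinaryOp):
        prec = PRECEDENCE[node.op]
        left = unparse(node.left, prec)
        right = unparse(node.right, prec, is_right=True)
        text = f"{left} {node.op} {right}"
        # Parenthesize when binding looser than the parent, or when an
        # equal-precedence operator sits on the right (left-associativity).
        needs_parens = prec < parent_prec or (is_right and prec == parent_prec)
        return f"({text})" if needs_parens else text
    raise TypeError(f"unsupported node: {node!r}")
```

A round-trip test (unparse, then parse the result again with the generated parser and compare the object models) is one way to catch the case where a grammar change and the unparser drift apart.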
We are working on a tool to validate user configurations. Invalid configurations will be described in a text file or JSON file in the following form:
case1: if something > 5 and something.else != 10
case2: (if a <= 3 or a >= 5) and b == 10
If the if statement evaluates to true, the configuration is invalid. We used the SLY module to create a lexer and parser to parse this sentence and check whether it's valid or not. After thinking a bit more, we realized that instead of writing our own grammar, it would be interesting to use a subset of the Python grammar: say expressions, bool operators and a few others, but not the complete set, as we neither want nor need to support functions, classes and much more. The reason for this approach is that we are writing our tool in Python, so the two could cooperate nicely.
I've checked the ast module; however, I have a feeling that the grammar is tightly coupled with it. If I understand it correctly, the Python parser is not generated automatically from a grammar by some existing parser generator, right? The parser is "hard coded". Or am I wrong?
Is there a "simple" way of doing this?
In general, we are looking for a parser generator that generates a parser for a subset of the Python grammar, but I'm afraid that to cover part of the Python grammar we would need to write the grammar ourselves and generate a parser from it. Is my assumption right?
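One way to approximate "a subset of the Python grammar" without a parser generator is to parse with the standard ast module and then reject every node type that is not on a whitelist. The sketch below assumes Python 3.8+ and that the leading "if" of each case is stripped beforehand; the ALLOWED tuple and the function names parse_rule and is_invalid are illustrative choices. Note that an attribute called "else" (as in case1) would not parse, because else is a Python keyword.

```python
import ast

# Node types permitted in a rule; this whitelist is an illustrative assumption.
ALLOWED = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not,
    ast.Compare, ast.Gt, ast.GtE, ast.Lt, ast.LtE, ast.Eq, ast.NotEq,
    ast.Name, ast.Load, ast.Attribute, ast.Constant,
)


def parse_rule(text: str) -> ast.Expression:
    """Parse a rule as a Python expression and reject non-whitelisted syntax."""
    tree = ast.parse(text, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError(f"unsupported syntax: {type(node).__name__}")
    return tree


def is_invalid(text: str, config: dict) -> bool:
    """Return True if the rule matches, i.e. the configuration is invalid."""
    code = compile(parse_rule(text), "<rule>", "eval")
    # Builtins are emptied and calls are not whitelisted, but this is a
    # sketch and not a hardened sandbox.
    return bool(eval(code, {"__builtins__": {}}, dict(config)))
```

For example, is_invalid("(a <= 3 or a >= 5) and b == 10", {"a": 7, "b": 10}) evaluates the second case from above against a configuration, while a rule containing a function call is rejected by parse_rule before anything is evaluated.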
I am currently trying to implement a Ruby compiler. To create the parser and lexer I used ANTLR4. Now I am unable to figure out how to implement semantic analysis in the parser. Can someone explain how to do semantic analysis using the generated parser? It would be better if you could explain with a simple example, say how to check whether a variable is initialized before use.
Well, I can't describe everything you can and have to do, but I will try to show you the principle behind it...
ANTLR generates a ParseTree for you, which you can then process with a ParseTreeWalker. That walker goes through the parse tree node by node, starting at the topmost node and then proceeding through all children (though that behaviour can be customized, as far as I know). If you have registered a ParseTreeListener with the walker, it gets notified about each step. There are two methods for each parser rule in your grammar: one that gets notified whenever the walker enters this rule (before the children of that node are visited) and one when it exits the rule (after all children of the respective node have been visited).
This ParseTreeListener is where you can do your semantic analysis. You mentioned the check for undefined variables: for that, hook into your declaration rule, read the variable name and store it in a list. Then hook into each rule that can contain a variable, read its name and check whether it is in your list of declared variables. If not, the variable is undefined.
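The principle can be sketched without the ANTLR runtime. In the following, Node and walk are simplified stand-ins for ANTLR's parse tree and ParseTreeWalker, and the rule names "declaration" and "variableUse" are assumptions about what your grammar might call them; with real generated code you would override the corresponding enter/exit methods of the generated listener instead.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    rule: str                  # grammar rule name, e.g. "declaration"
    text: str = ""             # variable name, where the rule carries one
    children: List["Node"] = field(default_factory=list)


def walk(listener, node: Node) -> None:
    """Depth-first walk: enter the node, visit the children, then exit it."""
    listener.enter(node)
    for child in node.children:
        walk(listener, child)
    listener.exit(node)


class UndefinedVariableChecker:
    def __init__(self) -> None:
        self.declared = set()
        self.errors = []

    def enter(self, node: Node) -> None:
        # Every use is checked against the names declared so far.
        if node.rule == "variableUse" and node.text not in self.declared:
            self.errors.append(node.text)

    def exit(self, node: Node) -> None:
        # Register the name on exit, so that a variable used inside its
        # own initializer is still reported as undefined.
        if node.rule == "declaration":
            self.declared.add(node.text)


# Tree for:  a = 1;  b = a;  print(c)
tree = Node("program", children=[
    Node("declaration", "a", [Node("number", "1")]),
    Node("declaration", "b", [Node("variableUse", "a")]),
    Node("statement", children=[Node("variableUse", "c")]),
])
checker = UndefinedVariableChecker()
walk(checker, tree)
```

After the walk, checker.errors contains only "c". A language with nested scopes would need a stack of such sets instead of a single one.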
As an example on how something like that can be done you can have a look at a ParseTreeListener of mine here. The corresponding grammar can be found here.
I am writing a grammar that needs some custom code written in its target language. It is fairly easy to add e.g.
@parser::members {
}
The problem is that I am targeting multiple languages, and I haven't found a way to target multiple languages without copy+pasting the entire grammar.
Is there a way without resorting to copy+paste or external preprocessors?
I'm afraid there is no solution. Action code is by definition written in the target language, as it is copied directly from the grammar into the generated files. If all your target languages can handle #ifdef/#endif (say, C, C++ and Obj-C), you could use that to separate the individual code parts. Otherwise you could use a base grammar with placeholders and process it in a pre-compilation step (where you generate your parsers/lexers), replacing the placeholders with the real target code. That even makes the grammar cleaner.
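The placeholder approach could look roughly like this. The placeholder syntax /*{{NAME}}*/, the SNIPPETS table and the render function are all made up for this sketch, and the snippet bodies are only illustrative. Plain string replacement is used on purpose, because "$" already has a meaning inside ANTLR actions and would clash with template engines that use it.

```python
# Base grammar shared by all targets; the placeholder marks target code.
BASE_GRAMMAR = """\
grammar Expr;

@parser::members {
/*{{MEMBERS}}*/
}

expr : INT ;
INT  : [0-9]+ ;
"""

# Hypothetical per-target code for each placeholder.
SNIPPETS = {
    "Java": {"MEMBERS": "private int depth = 0;"},
    "Python3": {"MEMBERS": "depth = 0"},
}


def render(base: str, target: str) -> str:
    """Return the grammar text for one target with all placeholders filled."""
    out = base
    for key, code in SNIPPETS[target].items():
        out = out.replace("/*{{" + key + "}}*/", code)
    if "/*{{" in out:
        # Fail loudly if a target forgot to define a snippet.
        raise ValueError(f"unfilled placeholder for target {target}")
    return out


java_grammar = render(BASE_GRAMMAR, "Java")
python_grammar = render(BASE_GRAMMAR, "Python3")
```

Each rendered grammar would then be written to its own directory and passed to the ANTLR tool for the matching target as part of the build.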
Are there any tools available for producing a parse tree of NodeJS code? The closest I can find is the closure compiler but I don't think that will give me a parseable tree for analysis.
Node.js is just JavaScript (aka ECMAScript). I recommend Esprima (http://esprima.org/), which supports the de-facto standard AST established by Mozilla.
Esprima generates very nice ASTs, but if you need the parse tree you must look elsewhere. Esprima only returns ASTs and the sequence of tokens for some text. If you don't want to model the language yourself, you could use another tool like ANTLR (see: https://stackoverflow.com/a/5982455/206543).
For the difference between ASTs and parse trees, look here: https://stackoverflow.com/a/9864571/206543
Say I'd like to find instances of the expression while using the Java7 grammar:
FoobarClass.getInstanceOfType("Bazz");
Using a ParseTreeWalker and listening to exitExpression() calls sounded like a good first place to start. What surprised me was the level of manual traversal of the Java7Parser.ExpressionContext required to find expressions of this type.
What's the appropriate method to find matches to the above expression? At this point using a Regex in place of ANTLR4 yields simpler code, but this won't scale.
ANTLR 4 does not currently include a feature allowing you to write concrete or abstract syntax queries. We hope to add something in the future to help with this type of application.
I've needed to write a few pattern recognition features for ANTLR 4 parse trees. I implemented the predicate itself with relative success by extending BaseMyParserVisitor<Boolean> (the parser in this example is called MyParser).
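The shape of such a predicate visitor can be illustrated with Python's own ast module standing in for the generated ANTLR visitor (the example expression happens to be valid Python syntax as well). This is an analogy, not ANTLR code: generic_visit returning False plays the role of the default result, and visit_Call plays the role of the one rule method you would override. The names IsBazzLookup and find_matches are made up for the sketch.

```python
import ast


class IsBazzLookup(ast.NodeVisitor):
    """Boolean predicate: True only for FoobarClass.getInstanceOfType("Bazz")."""

    def generic_visit(self, node):
        # Default answer for every node type not handled explicitly.
        return False

    def visit_Call(self, node):
        func = node.func
        return (
            isinstance(func, ast.Attribute)
            and func.attr == "getInstanceOfType"
            and isinstance(func.value, ast.Name)
            and func.value.id == "FoobarClass"
            and len(node.args) == 1
            and isinstance(node.args[0], ast.Constant)
            and node.args[0].value == "Bazz"
        )


def find_matches(source: str) -> list:
    """Return the line numbers of all nodes that satisfy the predicate."""
    predicate = IsBazzLookup()
    tree = ast.parse(source)
    return [node.lineno for node in ast.walk(tree) if predicate.visit(node)]
```

Keeping the predicate (does this one node match?) separate from the traversal (which nodes get tested?) means the same traversal can be reused for other patterns.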