How to use ANTLR for the Python target?

I'm using ANTLRWorks 1.5.2 to create grammars and then generate the lexer and parser. I did that for the Java target, but my preferred language is Python. I'm quite puzzled by this: how can I specify my target language in ANTLRWorks 1.5.2 and get a lexer and parser in Python? I read somewhere that ANTLRWorks is just for the Java target.
How can I install ANTLR 3 and use the Python runtime?
I would really appreciate it if anyone could guide me.
Thanks.

If you use the following options {...} block in your grammar:
options {
language=Python;
}
and then press Ctrl+Shift+G from within ANTLRWorks, the *.py lexer and parser files will be generated in the grammar's output/ directory.
However, debugging from within ANTLRWorks only works with the Java target.
As for a complete Python example, check out this previous Q&A: ANTLR get and split lexer content

Related

Python 3.10 antlr parser

I have been using an ANTLR v4 grammar to parse code for my UML diagrammer. I would like to simply replace my current Python 3.6 .g4 file with a Python 3.10 one. My current grammar does not support the new Python match statement.
I found the following but it generates Java code. I need one that generates Python code.
Does anyone know if such a thing exists?

How to compile this COBOL grammar files?

I'm using COBOL grammar files from below URL:
https://github.com/antlr/grammars-v4/tree/master/cobol85
From the given source, there are 2 grammar files which are Cobol85.g4 and Cobol85Preprocessor.g4.
Both work like a charm when I process them separately, as follows:
~$ antlr4 -Dlanguage=Python2 Cobol85
and
~$ antlr4 -Dlanguage=Python2 Cobol85Preprocessor
However, I realized that only Cobol85Preprocessor is able to understand comments in COBOL; the Cobol85 grammar file is not. My best guess was that I need to import both into a single grammar file.
So, I created another grammar file named Cobol.g4 which contains below code:
grammar Cobol;
import Cobol85Preprocessor, Cobol85;
and compiled it with the following command:
~$ antlr4 -Dlanguage=Python2 Cobol
The good news is that it compiled without problems. The bad news is that it doesn't work as well as the previous method (processing the grammar files separately).
Instead, I received the error message below:
line 1:30 extraneous input '.\r\n ' expecting {<EOF>, ADATA, ADV...
Is there any way to solve this, or am I supposed to process both separately by design? Could anyone please help me with this issue?
PS: I'm not sure if this piece of information will be useful, but I'm using ANTLR 4.7.1 with a listener.
Disclaimer: I am the author of these COBOL ANTLR4 grammar files.
The parser generated from grammar Cobol85.g4 has to be provided with COBOL source code, which has been preprocessed with a COBOL preprocessor. Cobol85Preprocessor.g4 is at the core of this preprocessor and enables parsing of statements such as COPY REPLACE, EXEC SQL etc.
Cobol85Preprocessor.g4 is meant to be augmented with quite extensive additional logic, which is not included in the grammar files and enables normalization of line formats, line breaks, comment lines, comment entries, EXEC SQL, EXEC CICS and so on. This missing code is leading to the problems you are noticing.
The ProLeap COBOL parser written by me implements all of this in Java based on the files Cobol.g4 and Cobol85Preprocessor.g4. However, there is no Python implementation, yet.
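As a rough illustration of the kind of normalization described above, the following Python sketch blanks out comment lines in fixed-format COBOL source before it would be handed to a parser. It is not ProLeap's preprocessor and not part of the grammar files; the function name `strip_comment_lines` is hypothetical, and it assumes fixed-format source in which column 7 is the indicator area, with `*` or `/` marking a comment line.

```python
def strip_comment_lines(source: str) -> str:
    """Replace fixed-format COBOL comment lines with empty lines.

    Assumption: column 7 (index 6) is the indicator area, and '*' or '/'
    there marks a comment line. The line count is preserved so that
    parser error positions still match the original file.
    """
    cleaned = []
    for line in source.splitlines():
        is_comment = len(line) > 6 and line[6] in "*/"
        cleaned.append("" if is_comment else line)
    return "\n".join(cleaned)


SAMPLE = (
    "000100 IDENTIFICATION DIVISION.\n"
    "000200* A comment line that the Cobol85 grammar alone does not handle.\n"
    "000300 PROGRAM-ID. HELLO.\n"
)

if __name__ == "__main__":
    print(strip_comment_lines(SAMPLE))
```

A real preprocessor has to do considerably more (COPY/REPLACE, EXEC SQL, line continuation, and so on), so this only covers the comment-line part of the problem.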

How to parse python file in Nodejs?

I'm working on a project in which I need to parse a Python file from Node.js and get the docstrings, the properties, and the class names. I know Python has an ast module that parses a Python source file into a syntax tree. Is there a similar module in Node.js so that I can parse Python source files?
There are Python grammars available for ANTLR.
And ANTLR can generate JS parsers.
Seems like you should be able to parse Python in JS that way.
ANTLR grammars: https://github.com/antlr/grammars-v4
ANTLR JavaScript target: https://github.com/antlr/antlr4/blob/master/doc/javascript-target.md
More info: http://www.antlr.org/
ANTLR is an LL(*) parser generator (ALL(*) as of version 4) and the successor to PCCTS. Parsing languages is a complicated subject, and fully parsing Python may be overkill for what you really need, but since you asked for a specific solution to your problem, this should get you started.
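If the full ANTLR route turns out to be more than needed, one lighter alternative is to let Python do the parsing and hand the result to Node.js as JSON. The sketch below uses Python's standard `ast` module, which the question already mentions, to collect class names, docstrings, and class-level assigned names. The function name `describe_classes` and the choice to treat class-level assignments as "properties" are assumptions made for illustration.

```python
import ast
import json


def describe_classes(source: str) -> list:
    """Return name, docstring, and class-level attribute names per class."""
    tree = ast.parse(source)
    classes = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        attributes = []
        for stmt in node.body:
            # Plain assignments such as `x = 1` in the class body.
            if isinstance(stmt, ast.Assign):
                attributes.extend(
                    t.id for t in stmt.targets if isinstance(t, ast.Name)
                )
            # Annotated assignments such as `x: int = 1`.
            elif isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                attributes.append(stmt.target.id)
        classes.append(
            {
                "name": node.name,
                "docstring": ast.get_docstring(node),
                "attributes": attributes,
            }
        )
    return classes


EXAMPLE = '''
class Point:
    """A 2D point."""
    dimensions = 2
    label: str = "origin"
'''

if __name__ == "__main__":
    print(json.dumps(describe_classes(EXAMPLE), indent=2))
```

A Node.js process could run a script like this as a child process and parse the JSON it prints, at the cost of requiring a Python interpreter on the machine.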

Antlr4 Python3 target visitor not usable?

I am trying to follow the ANTLR4 reference book with the Python3 target, but I got stuck on the calculator example. The ANTLR4 docs say:
The Python implementation of AntLR is as close as possible to the Java one, so you shouldn't find it difficult to adapt the examples for Python
but I don't get it yet.
The Java visitor has a .visit method, and in Python I don't have this method. I think that's because in Java the visit method has parameter overloads for the different context types. In Python we have visitProg(), visitAssign(), visitId(), etc. But now I can't write value = self.visit(ctx.expr()), because how would we know which visit method to call?
Or am I missing an instruction somewhere?
Looks like sometime in the last 3+ years this was fixed. I generated a parser from a grammar and targeted Python 3, using:
antlr4 -Dlanguage=Python3 -no-listener -visitor mygrammar.g4
It generates a visitor class that subclasses ParseTreeVisitor, which is a class in the antlr4-python3-runtime. Looking at the ParseTreeVisitor class, there is a visit method.
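To see why `self.visit(ctx.expr())` can work without Java-style overloads, here is a small self-contained sketch of the dispatch pattern. It does not use the ANTLR runtime; `Visitor`, `NumberCtx`, and `AddCtx` are hypothetical stand-ins for the generated visitor and context classes. Each node's `accept` method calls the visitor method that matches its own type, so a single generic `visit` is enough.

```python
class Visitor:
    """Stand-in for a generated visitor: one generic visit() entry point."""

    def visit(self, node):
        # The node, not the visitor, decides which visitXxx method runs.
        return node.accept(self)

    def visitNumber(self, ctx):
        return ctx.value

    def visitAdd(self, ctx):
        # Recursive calls go through visit(), whatever the child type is.
        return self.visit(ctx.left) + self.visit(ctx.right)


class NumberCtx:
    """Hypothetical context for a numeric literal."""

    def __init__(self, value):
        self.value = value

    def accept(self, visitor):
        return visitor.visitNumber(self)


class AddCtx:
    """Hypothetical context for an addition with two operands."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def accept(self, visitor):
        return visitor.visitAdd(self)


def evaluate(tree):
    return Visitor().visit(tree)


if __name__ == "__main__":
    # Represents (1 + 2) + 3
    tree = AddCtx(AddCtx(NumberCtx(1), NumberCtx(2)), NumberCtx(3))
    print(evaluate(tree))
```

In a real project the context classes and their `accept` methods come from the generated parser, so only the `visitXxx` methods need to be written by hand.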
For those interested in working through The Definitive ANTLR 4 Reference using Python, the ANTLR4 documentation points you towards this GitHub repo:
https://github.com/jszheng/py3antlr4book
The Python2/3 targets do not yet have a visitor implemented. I tried to implement it myself and sent a pull request to the ANTLR maintainer to check whether I did it correctly. Follow the pull request here: https://github.com/antlr/antlr4-python3/pull/6

Simplest way to deal with "import" statement in ANTLR4

I’m using ANTLR4 and I have an "import" statement inside my grammar.
Does ANTLR4 have an option to automatically open and parse input file instead of doing it inside my visitor (creating another parser/lexer and visitor for each "import" declaration) ?
"Pretty" sure that I've already seen it but I can't find it anymore.
Inside my grammar :
importStatement : 'import' ID ';' // Here ? an action (Java code)
// to prepend an AST to my current AST ?
Inside an input file:
import test;
There is no built-in functionality for this, primarily because every language requiring it has its own set of rules for how it needs to be done. In addition, this can quickly make the parse operation for your whole project go from O(n) to O(n²) (i.e. parsing each file once, to parsing up to the whole project for each file).
If your language allows you to build a correct parse tree prior to resolving the imports (e.g. it doesn't have arbitrary #define statements that can appear in imports), then you should be glad you aren't C/C++ and parse each file independently before resolving the import statements.
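The "parse each file independently, then resolve imports" approach can be sketched as follows. A regular expression stands in for the real ANTLR parser, files are held in an in-memory dictionary rather than read from disk, and the names `parse_unit` and `load_project` are hypothetical. The point is the cache: each unit is parsed once, however many other units import it.

```python
import re

# Matches statements of the form `import name;` as in the grammar rule above.
IMPORT_RE = re.compile(r"\bimport\s+([A-Za-z_]\w*)\s*;")


def parse_unit(text: str) -> dict:
    """Stand-in for a real parse: record the unit's text and its imports."""
    return {"text": text, "imports": IMPORT_RE.findall(text)}


def load_project(entry: str, sources: dict) -> dict:
    """Parse `entry` and everything it imports, each unit exactly once."""
    parsed = {}
    pending = [entry]
    while pending:
        name = pending.pop()
        if name in parsed:
            continue  # already parsed; also guards against import cycles
        parsed[name] = parse_unit(sources[name])
        pending.extend(parsed[name]["imports"])
    return parsed


SOURCES = {
    "main": "import test; import util;",
    "test": "import util;",
    "util": "",
}

if __name__ == "__main__":
    project = load_project("main", SOURCES)
    for name in sorted(project):
        print(name, "->", project[name]["imports"])
```

With a real grammar, `parse_unit` would run the lexer and parser and a small visitor would collect the `importStatement` names; symbol resolution across units then happens in a later pass over the cached trees.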
