I’m using ANTLR4 and I have an "import" statement inside my grammar.
Does ANTLR4 have an option to automatically open and parse the imported file, instead of my doing it inside my visitor (creating another parser/lexer and visitor for each "import" declaration)?
I'm pretty sure I've seen this somewhere before, but I can't find it anymore.
Inside my grammar:
importStatement : 'import' ID ';' // Here ? an action (Java code)
// to prepend an AST to my current AST ?
Inside an input file:
import test;
There is no built-in functionality for this, primarily because every language requiring it has its own set of rules for how it needs to be done. In addition, this can quickly make the parse operation for your whole project go from O(n) to O(n²) (i.e. parsing each file once, to parsing up to the whole project for each file).
If your language allows you to build a correct parse tree prior to resolving the imports (e.g. it doesn't have arbitrary #define statements that can appear in imports), then be glad you aren't dealing with C/C++, and parse each file independently before resolving the import statements.
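A minimal sketch of that two-phase approach, written in Python for brevity. The regex only stands in for a real ANTLR-generated parser, and the names parse_file, load_project and the .lang extension are hypothetical. The point is just that each file is parsed once, on its own, and imports are resolved afterwards from a cache:

```python
import re
from pathlib import Path

# Stand-in for a real parser: it only extracts `import NAME;` statements.
IMPORT_RE = re.compile(r"^\s*import\s+([A-Za-z_]\w*)\s*;", re.MULTILINE)


def parse_file(path):
    """Parse one file in isolation and return its (toy) parse result."""
    text = Path(path).read_text()
    return {"path": Path(path), "imports": IMPORT_RE.findall(text)}


def load_project(entry, ext=".lang"):
    """Parse the entry file and everything it imports, each file exactly once."""
    root = Path(entry).parent
    parsed = {}                      # module name -> parse result (the cache)
    pending = [Path(entry).stem]
    while pending:
        name = pending.pop()
        if name in parsed:           # already parsed; this also breaks import cycles
            continue
        parsed[name] = parse_file(root / (name + ext))
        pending.extend(parsed[name]["imports"])
    return parsed
```

Because results are cached by module name, a file imported from many places is still parsed only once, which keeps the overall work linear in the number of files.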
I am creating an npm library that needs to read the files in the folder from which my library function was invoked on the command line, and then operate on those files.
By "operate" I mean checking whether a variable exists, whether a function exists, modifying a variable or function, etc.
The files will be TypeScript files.
Any help on how to proceed would be great.
Seems like you need some kind of AST parser like Esprima or babel-parser. These tools can parse the content of JS/TS files, build the abstract syntax tree that can be traversed, modified and converted back to the source code.
There are a lot of useful tools in the Babel toolset that simplify these operations. For example, babel-traverse simplifies searching for the target statement or expression, babel-types helps to match the type of the AST nodes, and babel-generator generates the source code from the AST.
It's going to be very difficult to get these answers without running the files.
So the best approach is probably to just import the files as usual and see what side-effects running the files had. For example, you can check if a file exported anything.
If this doesn't solve your problem, you will have to parse the files. The best way to do that might be to use the TypeScript compiler itself:
https://github.com/microsoft/TypeScript/wiki/Using-the-Compiler-API
I'm using the COBOL grammar files from the URL below:
https://github.com/antlr/grammars-v4/tree/master/cobol85
That directory contains two grammar files, Cobol85.g4 and Cobol85Preprocessor.g4.
Both work like a charm if I process them separately, like this:
~$ antlr4 -Dlanguage=Python2 Cobol85
and
~$ antlr4 -Dlanguage=Python2 Cobol85Preprocessor
However, I realized that only Cobol85Preprocessor is able to understand comments in COBOL; the Cobol85 grammar file is not. My best thought was that maybe I need to import both into a single file.
So, I created another grammar file named Cobol.g4 which contains below code:
grammar Cobol;
import Cobol85Preprocessor, Cobol85;
and compiled it with the following command:
~$ antlr4 -Dlanguage=Python2 Cobol
The good news is that it compiled without any problem. The bad news is that it doesn't work as well as the previous method (processing the grammar files separately).
Instead, I received the below error message:
line 1:30 extraneous input '.\r\n ' expecting {<EOF>, ADATA, ADV...
Is there any way to solve this, or should I, by design, process the two separately? Could anyone please help me with this issue?
PS: I'm not sure if this piece of information will be useful, but I'm using ANTLR 4.7.1 with a listener.
Disclaimer: I am the author of these COBOL ANTLR4 grammar files.
The parser generated from grammar Cobol85.g4 has to be provided with COBOL source code, which has been preprocessed with a COBOL preprocessor. Cobol85Preprocessor.g4 is at the core of this preprocessor and enables parsing of statements such as COPY REPLACE, EXEC SQL etc.
Cobol85Preprocessor.g4 is meant to be augmented with quite extensive additional logic, which is not included in the grammar files and enables normalization of line formats, line breaks, comment lines, comment entries, EXEC SQL, EXEC CICS and so on. This missing code is leading to the problems you are noticing.
The ProLeap COBOL parser written by me implements all of this in Java based on the files Cobol.g4 and Cobol85Preprocessor.g4. However, there is no Python implementation, yet.
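To give a sense of what such preprocessing involves, here is a minimal Python sketch of just one step: blanking comment lines before the text reaches the Cobol85 parser. It assumes fixed-format source, where column 7 is the indicator area and * or / marks a comment line. The function name is hypothetical, and this is not the ProLeap implementation, which handles far more (COPY, REPLACE, EXEC SQL, line continuations and so on).

```python
def strip_comment_lines(source):
    """Blank out fixed-format COBOL comment lines, keeping line numbers stable."""
    out = []
    for line in source.splitlines():
        # Column 7 (index 6) is the indicator area in fixed-format COBOL.
        indicator = line[6] if len(line) > 6 else " "
        if indicator in "*/":
            out.append("")           # keep an empty line so error positions still match
        else:
            out.append(line)
    return "\n".join(out)
```

Replacing comment lines with empty lines rather than dropping them keeps the line numbers in parser error messages aligned with the original source.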
I am using ANTLR 4 with Java. I implemented my own listener, which saves the errors produced by ANTLR 4 in a list, and then I build my AST. My question is:
Is there any way to know whether my listener has recorded an error, so that I can stop building the AST?
You can try implementing your own error strategy, based on the default error strategy, that simply returns whatever error is caught and breaks out of its current action, back to the method where you output your AST.
See these links for help:
Default error strategy
The antlr guy's guide
(my recommended reading for this issue)
Once you have implemented your own error strategy, you could simply have a method within it that keeps a list of all the errors and returns them, or use a not-null check to decide whether to continue to the next step of your parsing.
Hope this helps in any way, and good luck with your project.
I'm pretty new to Xtext, so I don't understand very well all of the associated concepts. There's one question in particular I couldn't find an answer to: how can I manage a grammar for a language with multiple files?
The DSL I'm working on typically uses four files, three of which should be referenced in the first one. All files share the same extension, though not the same grammar. Is that possible at all?
How can I manage a grammar for a language with multiple files?
Xtext first parses the file, and then links cross-references. These cross-references can be "internal" to a file or "external". In both cases the linking and scoping systems will do the hard work for you.
All files share the same extension, though not the same grammar. Is that possible at all?
This seems to be a different question, but alas...
If the grammars are really different then you will have a hard time with Xtext. If Xtext sees a .foo file, how should it decide, which parser should be applied? Try each one until no error occurs? And what if the file is written in grammar B but really contains syntax errors? ...
But often there is a little trick: there is really one grammar, but it contains two nearly separate parts. Which part is used is determined by the first few keywords in the file.
A small example:
File A.foo:
module A {
// more stuff here
}
module B {
// also more stuff
}
File B.foo:
system X {
use module A
use module B
}
The grammar might look like this:
Model: Modules | Systems;
Modules: modules+=Module+;
Module: 'module' name=ID '{' '}';
Systems: systems+=System+;
System: 'system' name=ID '{' used+=UsedModule* '}';
UsedModule: 'use' 'module' module=[Module];
In this grammar one file can only contain either module XOR system definitions but not a mix of them. The first occurrence of the keyword module or system determines what is allowed.
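The dispatch on the first keyword can also be illustrated outside Xtext. The Python function below is only a sketch of the idea, not something Xtext requires you to write: with a grammar like the one above, the generated parser makes this decision for you. The name detect_kind is hypothetical.

```python
import re

# Skip whitespace and // line comments before the first keyword.
_SKIP = re.compile(r"(?:\s+|//[^\n]*)*")
_WORD = re.compile(r"[A-Za-z_]\w*")


def detect_kind(text):
    """Return 'modules' or 'systems' depending on the first keyword in the file."""
    pos = _SKIP.match(text).end()
    word = _WORD.match(text, pos)
    keyword = word.group(0) if word else None
    if keyword == "module":
        return "modules"
    if keyword == "system":
        return "systems"
    raise ValueError("expected 'module' or 'system', got %r" % keyword)
```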
In my current work, I have written a code generator using StringTemplate without thinking about a parser (I am instantiating the template files directly from Java objects), and the code generator generates nice Java code.
Now I have started to write the parser. Because of some nice editor features of Xtext, I am thinking of writing the parser in Xtext.
My question is: is it possible to use a code generator (written using StringTemplate) and a parser (written in Xtext) in the same project?
Yes that's possible. Xtext offers a typed AST for the parsed files and you could easily pass them to your code generator (directly, iff they fulfil the same contract / interfaces, or indirectly by transforming them to the expected structure). Xtext does not impose any constraints on how you want to use the parsed information.