I'm using the COBOL grammar files from the URL below:
https://github.com/antlr/grammars-v4/tree/master/cobol85
That repository contains two grammar files: Cobol85.g4 and Cobol85Preprocessor.g4.
Both work like a charm if I deal with them separately, like the following:
~$ antlr4 -Dlanguage=Python2 Cobol85
and
~$ antlr4 -Dlanguage=Python2 Cobol85Preprocessor
However, I realized that only Cobol85Preprocessor is able to understand comments in COBOL; the Cobol85 grammar file doesn't. My best thought was that maybe I need to import both together into a single file.
So I created another grammar file named Cobol.g4, which contains the code below:
grammar Cobol;
import Cobol85Preprocessor, Cobol85;
and compiled it with the following command:
~$ antlr4 -Dlanguage=Python2 Cobol
The good news: I found no problem compiling it. The bad news: it doesn't work as well as the previous method (dealing with the grammar files separately).
Instead, I received the error message below:
line 1:30 extraneous input '.\r\n ' expecting {<EOF>, ADATA, ADV...
Is there any way to solve this, or should I deal with both separately by design? Could anyone please help me with this issue?
PS: I'm not sure if this piece of information will be useful, but I'm using ANTLR 4.7.1 with a listener.
Disclaimer: I am the author of these COBOL ANTLR4 grammar files.
The parser generated from the grammar Cobol85.g4 has to be provided with COBOL source code that has been preprocessed with a COBOL preprocessor. Cobol85Preprocessor.g4 is at the core of this preprocessor and enables parsing of statements such as COPY, REPLACE, EXEC SQL etc.
Cobol85Preprocessor.g4 is meant to be augmented with quite extensive additional logic, which is not included in the grammar files and which enables normalization of line formats, line breaks, comment lines, comment entries, EXEC SQL, EXEC CICS and so on. This missing code is what leads to the problems you are noticing.
The ProLeap COBOL parser, written by me, implements all of this in Java based on the files Cobol85.g4 and Cobol85Preprocessor.g4. However, there is no Python implementation yet.
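To give a feel for what that missing preprocessing involves, here is a minimal Python sketch that only blanks out fixed-format comment lines before handing the source to the generated Cobol85 parser. It ignores COPY/REPLACE, continuation lines, comment entries and everything else a real preprocessor handles; the file name MYPROG.CBL is made up, and I'm assuming the entry rule of Cobol85.g4 is named startRule:
from antlr4 import InputStream, CommonTokenStream
from Cobol85Lexer import Cobol85Lexer
from Cobol85Parser import Cobol85Parser

def strip_comment_lines(text):
    # In fixed-format COBOL, column 7 is the indicator area; '*' or '/'
    # there marks a comment line. Blank such lines out so the line
    # numbers reported by the parser stay stable.
    result = []
    for line in text.splitlines():
        if len(line) > 6 and line[6] in ('*', '/'):
            result.append('')
        else:
            result.append(line)
    return '\n'.join(result)

source = strip_comment_lines(open('MYPROG.CBL').read())
lexer = Cobol85Lexer(InputStream(source))
parser = Cobol85Parser(CommonTokenStream(lexer))
tree = parser.startRule()  # assumed entry rule name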
I am creating an npm library in which I need to read the files in the folder from which my library function was invoked on the command line, and then operate on those files.
By operate I mean checking whether a variable exists, whether a function exists, modifying variables and functions, etc.
The files will be TypeScript files.
Any help on how to proceed would be great.
It seems like you need some kind of AST parser such as Esprima or babel-parser. These tools can parse the content of JS/TS files and build an abstract syntax tree that can be traversed, modified, and converted back to source code.
There are a lot of useful tools in the Babel toolset that simplify these operations. For example, babel-traverse simplifies searching for a target statement or expression, babel-types helps match the types of AST nodes, and babel-generator generates source code from the AST.
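As a rough sketch (the file and variable names here are made up), reading a TypeScript file, listing its top-level variables and functions, and printing the code back could look like this:
const fs = require('fs');
const { parse } = require('@babel/parser');
const traverse = require('@babel/traverse').default;
const generate = require('@babel/generator').default;

// Parse a TypeScript file into an AST.
const code = fs.readFileSync('example.ts', 'utf8');
const ast = parse(code, { sourceType: 'module', plugins: ['typescript'] });

// Walk the tree and report variable and function declarations.
traverse(ast, {
  VariableDeclarator(path) {
    if (path.node.id.type === 'Identifier') {
      console.log('variable:', path.node.id.name);
    }
  },
  FunctionDeclaration(path) {
    if (path.node.id) {
      console.log('function:', path.node.id.name);
    }
  },
});

// Turn the (possibly modified) AST back into source code.
console.log(generate(ast).code);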
It's going to be very difficult to get these answers without running the files.
So the best approach is probably to just import the files as usual and see what side effects running them had. For example, you can check whether a file exported anything.
If this doesn't solve your problem, you will have to parse the files. The best way to do that might be to use the TypeScript compiler itself:
https://github.com/microsoft/TypeScript/wiki/Using-the-Compiler-API
All I want is to generate API docs from the function docstrings in my source code, presumably through Sphinx's autodoc extension, to form my lean API documentation. My code follows the functional programming paradigm, not OOP, as demonstrated below.
As a second step, I'd probably add one or more documentation pages for the project, hosting things like introductory comments, code examples (leveraging doctest, I guess) and, of course, links to the API documentation itself.
What might be a straightforward flow for generating documentation from docstrings here? Sphinx is a great, popular tool, yet I find its getting-started pages a bit dense.
What I've tried, from within my source directory:
$ mkdir documentation
$ sphinx-apidoc -f --ext-autodoc -o documentation .
No error messages, yet this doesn't find (or handle) the docstrings in my source files; it just creates an .rst file per source file, with contents like the following:
tokenizer module
================
.. automodule:: tokenizer
   :members:
   :undoc-members:
   :show-inheritance:
Basically, my source files look like the following, without much module ceremony or object-oriented content in them (I like functional programming, even though it's Python this time around). I've truncated the sample source file below, of course; it contains more functions than are shown.
tokenizer.py
from hltk.util import clean, safe_get, safe_same_char
"""
Basic tokenization for text
not supported:
+ forms of pseudo-ellipsis (...)
support for the above should be added only as part of an automata rewrite
"""
always_swallow_separators = u" \t\n\v\f\r\u200e"
always_separators = ",!?()[]{}:;"
def is_one_of(char, chars):
    '''
    Returns whether the input `char` is any of the characters of the string `chars`
    '''
    return chars.count(char)
Or would you recommend a different tool and flow for this use case?
Many thanks!
If you find Sphinx too cumbersome and particular to use for simple projects, try pdoc:
$ pdoc --html tokenizer.py
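If you do want to stick with Sphinx, keep in mind that the stub .rst files created by sphinx-apidoc only pick up docstrings once autodoc can actually import your modules, which usually means pointing sys.path at your source directory from conf.py. A minimal sketch (the paths and project name are assumptions about your layout):
# documentation/conf.py
import os
import sys
sys.path.insert(0, os.path.abspath('..'))  # make tokenizer.py importable

project = 'hltk'
extensions = ['sphinx.ext.autodoc']
Then build with something like:
$ sphinx-build -b html documentation documentation/_build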
I am trying to write source code in one language and have it converted to both native C++ and JS source. Ideally, the converted source should be human-readable and resemble the original source as closely as possible. I was hoping Haxe could solve this problem for me: I would code in Haxe and have it converted to the corresponding C++ and JS source. However, the examples of Haxe I'm finding seem to create the final application for you. So with C++ it will use msbuild (or whatever compiler it finds) and create the final exe for you from the generated C++ code. Does Haxe also create the C++ and JS source code for you to view, or is it all done internally to Haxe and not accessible? If it is accessible, is it possible to remove the building side of Haxe so it simply creates the source code and stops?
Thanks
When you generate C++, all the intermediate files are generated and kept wherever you decide to put your output (the path given using -cpp pathToOutput). The fact that you get an executable is probably because you are using the -main switch. That implies an entry point to your application, but it is not really required; you can just pass on the command line a bunch of types that you want built into your output.
For JS it is very similar: a single JS file is generated, and it only has an entry point if you used -main.
Regarding the other topic, whether the generated code resembles your Haxe code: the answer is yes, but... some of the types (like Enum and Abstract) only exist in Haxe, so they will generate code that works functionally but might look quite different. Also, Haxe has an always-on optimizer/analyzer that might mangle your code in unexpected ways (the analyzer can be disabled). I still find that it is not that difficult to figure out the Haxe source from the generated code. JS has support for source maps, which is really useful for debugging. So in the end, Haxe doesn't do anything to obfuscate your generated code, but it also doesn't do much to preserve it too strictly.
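For example (the paths and type names here are made up), something like:
$ haxe -cp src -js out/app.js -main Main
$ haxe -cp src -cpp out/cpp my.pack.SomeType my.pack.OtherType
The first command emits a single readable JS file with an entry point; the second just writes the generated C++ sources for the listed types into out/cpp. If I remember correctly, the C++ target also honors a no-compilation define (-D no-compilation) if you want to stop before the native build step.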
I'm using ANTLRWorks 1.5.2 for grammar creation and subsequent generation of the lexer and parser. I did that for the Java target, but my preferred language is Python. I'm quite puzzled by this: how can I specify my target language in ANTLRWorks 1.5.2 and get the lexer and parser in Python? I read somewhere that ANTLRWorks is just for the Java target.
How can I install ANTLR 3 and use the Python runtime?
I would really appreciate it if anyone could guide me.
Thanks.
If you use the following options {...} block in your grammar:
options {
    language=Python;
}
and then press CTRL + SHIFT + G from within ANTLRWorks, the *.py lexer and parser files will be generated in the grammar's output/ directory.
However, debugging from within ANTLRWorks only works with the Java target.
As for a complete Python example, check out this previous Q&A: ANTLR get and split lexer content
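Once the *.py files are generated, they are used together with the ANTLR 3 Python runtime roughly like this (the grammar name T and the rule name start_rule are placeholders for whatever your grammar actually defines):
import antlr3
from TLexer import TLexer
from TParser import TParser

# Wire the generated lexer and parser to the input.
char_stream = antlr3.ANTLRStringStream('your input here')
lexer = TLexer(char_stream)
tokens = antlr3.CommonTokenStream(lexer)
parser = TParser(tokens)
result = parser.start_rule()  # call whichever entry rule your grammar defines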
I am attempting to install my own GroovyResourceLoader and was wondering if there is an authoritative guide somewhere describing all the moving bits.
I've noticed that when Groovy attempts to compile a script, it tries to find types by sending paths to the GRL. However, it does some strange things: sometimes it uses '$' as a separator and other times it uses plain old '.'.
Here's a snapshot of some logging from an attempt to load something. Ignoring the auto-import stuff, notice how it uses '$' as the package separator and then replaces each '$', one at a time, with a '.'.
-->a$b$groovy$X$Something
-->a.b$groovy$X$Something
-->a.b.groovy$X$Something
I'm using Groovy 1.8.0.
The "$" you see come from Groovy trying to match inner classes. I strongly assume you have somewhere an "a.b.groovy.X.Something" which will lead groovy to try to discover all kinds of inner class combinations for this one. You could for example have a "a$b$groovy$X$Something.groovy" file.