Convert ANTLR grammar to another format (e.g. XSD, DTD)

I would like to marshal/unmarshal a grammar of a language using JAXB, so I need a DTD/XSD file. Is it possible to convert ANTLR grammars to these formats? The BNF Converter (https://bnfc.digitalgrammars.com/) can generate a DTD for LBNF grammars.

Related

XSLT 3 : convert xml to json

I am trying to convert XML to JSON using XSLT 3:
<xsl:copy-of select="xml-to-json($finalOutPut, map { 'indent' : false() })"/>
I get the following error:
net.sf.saxon.s9api.SaxonApiException: xml-to-json: element found in wrong namespace: Q{}wrapper
Basically, I am converting one XML document to another, renaming certain fields, and passing the resulting XML as input to xml-to-json().
Any suggestions?

The XML format that xml-to-json consumes is specified both in the XSLT 3.0 specification (https://www.w3.org/TR/xslt-30/#json-to-xml-mapping) as well as in the XPath and XQuery 3.1 function specification: https://www.w3.org/TR/xpath-functions/#json.
Basically all elements need to be in the namespace http://www.w3.org/2005/xpath-functions and are map, array, string, boolean, number etc., to reflect the JSON datatypes.
The error message suggests your input contains an element named wrapper in no namespace, so that is certainly not the right format for that function. You will need to use additional transformation steps to transform your XML to the one the function expects.
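As a rough illustration of the target format, here is a minimal Python sketch (not XSLT, and not part of the original answer) that rewrites a no-namespace document into that vocabulary. The sample input, the to_json_xml helper, and the convention that leaf elements become string and all other elements become map are assumptions made for illustration; repeated sibling names would need an array instead. In a real stylesheet the same mapping would be written as an extra XSLT step before calling xml-to-json.

```python
import xml.etree.ElementTree as ET

# Namespace required by xml-to-json for every element of its input.
FN_NS = "http://www.w3.org/2005/xpath-functions"


def to_json_xml(elem, key=None):
    """Hypothetical helper: leaf element -> fn:string, element with children -> fn:map."""
    children = list(elem)
    tag = "map" if children else "string"
    out = ET.Element(f"{{{FN_NS}}}{tag}")
    if key is not None:
        # Entries inside a map carry their JSON property name in @key.
        out.set("key", key)
    if children:
        for child in children:
            out.append(to_json_xml(child, key=child.tag))
    else:
        out.text = elem.text or ""
    return out


# Assumed sample input: a no-namespace document like the one in the error message.
source = ET.fromstring(
    "<wrapper><name>Ada</name><address><city>London</city></address></wrapper>"
)
converted = to_json_xml(source)

# Serialize with the functions namespace as the default namespace.
ET.register_namespace("", FN_NS)
print(ET.tostring(converted, encoding="unicode"))
```

The root wrapper element has no key of its own, because the top-level map in this format is unnamed.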

Converting ANTLR parse trees into string and then reverting it

I am new to ANTLR, and I am digging into it for a project. My work would require me to generate a parse tree from a source code file, convert the parse tree into a string that holds all the information about the parse tree in a somewhat "human-readable" form. Parts of this string (representing the parse tree) will then be modified, and the modified string will have to be converted to a changed source code.
I have found out that the .toStringTree(tree) method can be used in ANTLR to print out the tree in LISP format. Is there a better way to represent the parse tree as a string that holds all information?
Can the string-parse-tree be reverted back to the original source code (in the same language) using ANTLR? If not, are there any tools for this?

Can the string-parse-tree be reverted back to the original source code (in the same language) using ANTLR?
That string does not contain the token types, just the matched text. In other words: you cannot create a parse tree from the output of toStringTree. Besides, many ANTLR grammars have lexer rules that skip certain input (white space and line breaks, for example), so converting a parse tree back to the original input source is not always possible.
If not, are there any tools for this?
Without a doubt, I suggest you do a search on GitHub. But when you have the parse tree, it is trivial to create a custom tree structure and convert that to JSON.
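A minimal sketch of that idea in Python, assuming a hypothetical Node class that you would fill from the ANTLR parse tree yourself (it is not part of the ANTLR runtime). Each node records its rule or token type name, and leaves keep the matched text. The sample tree keeps whitespace as WS leaves, which only works if the grammar does not skip that input; under that assumption the source can be rebuilt by concatenating the leaves.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical custom tree node, filled from a parse tree by your own code."""
    kind: str                    # rule name or token type name
    text: Optional[str] = None   # matched text, set only for leaves (tokens)
    children: List["Node"] = field(default_factory=list)

    def to_dict(self):
        d = {"kind": self.kind}
        if self.text is not None:
            d["text"] = self.text
        if self.children:
            d["children"] = [c.to_dict() for c in self.children]
        return d


def from_dict(d):
    """Rebuild a Node tree from the dictionary form produced by to_dict."""
    return Node(d["kind"], d.get("text"), [from_dict(c) for c in d.get("children", [])])


def source_text(node):
    """Concatenate leaf texts in order; exact only if no input was skipped."""
    if node.text is not None:
        return node.text
    return "".join(source_text(c) for c in node.children)


# Assumed sample tree for the input "1 + 2", with whitespace kept as WS leaves.
tree = Node("expr", children=[
    Node("INT", "1"), Node("WS", " "), Node("PLUS", "+"),
    Node("WS", " "), Node("INT", "2"),
])

as_json = json.dumps(tree.to_dict(), indent=2)   # human-readable string form

# Edit the string form (here via the parsed JSON) and turn it back into source.
data = json.loads(as_json)
data["children"][4]["text"] = "3"
modified_source = source_text(from_dict(data))
```

Unlike the LISP-style output, this form keeps the token type next to each piece of text, so the tree can be rebuilt from the string.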

Localize token for different languages

I am developing a new grammar with ANTLR. My grammar supports basic math and boolean expressions like "4 equals (2 minuses 2)" or "true", "false". All operators are in natural language. I want to support other natural languages as well. For example, "4 equals 4" is "4 ist 4" in German.
What is the best practice to localize tokens and/or expressions?

In our project we follow this structure. There are files FooLexerBase.g and FooLexerLang1.g, FooLexerLang2.g and so on. The base grammar defines common token rules. Tokens that depend on language are not defined in the base, but can be referred to. These tokens are defined in the language-specific grammars, that all also include the base.
So, basically it looks something like this:
FooLexerBase.g:
lexer grammar FooLexerBase;
...
FLOATING_POINT
: DIGIT+ EXPONENT
| DIGIT+ DECIMAL_SEP DIGIT* EXPONENT?
| DECIMAL_SEP DIGIT+ EXPONENT?;
...
DIGIT and EXPONENT are defined in the base, since they are common, while DECIMAL_SEP is language-specific.
For example, FooLexerGerman.g looks like this:
lexer grammar FooLexerGerman;
import base = FooLexerBase;
...
fragment
DECIMAL_SEP: ',';
...
Finally, parser grammar is common for all languages. It is defined this way:
parser grammar FooParser;
options {
tokenVocab = FooLexerBase;
}
...
It is important to not process FooLexerBase with ANTLR, but pass all other grammars through it.
At runtime you build a parser and pass an appropriate lexer as argument to the constructor. I guess it looks more or less the same in any programming language (we use Java).
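The same layering can be sketched without ANTLR. The following Python example is only an analogy, not generated ANTLR code: the LANGUAGES table, build_lexer, and token_types are hypothetical names, the EQUALS keywords come from the question, and the floating-point rule is simplified (no EXPONENT). Shared rules refer to DECIMAL_SEP without defining it, each language supplies its own definition, and the consumer sees the same token types whichever lexer was chosen.

```python
import re

# Language-specific definitions (the role of FooLexerGerman.g and its siblings).
LANGUAGES = {
    "en": {"DECIMAL_SEP": r"\.", "EQUALS": r"equals"},
    "de": {"DECIMAL_SEP": r",", "EQUALS": r"ist"},
}


def build_lexer(lang):
    """Combine the shared rules with one language's definitions."""
    loc = LANGUAGES[lang]
    sep = loc["DECIMAL_SEP"]
    # Shared rules (the role of FooLexerBase.g); order matters, first match wins.
    rules = [
        ("FLOATING_POINT", rf"\d+{sep}\d*|{sep}\d+"),
        ("INT", r"\d+"),
        ("EQUALS", loc["EQUALS"] + r"\b"),
        ("WS", r"\s+"),
    ]
    pattern = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in rules))

    def tokenize(text):
        pos, tokens = 0, []
        while pos < len(text):
            m = pattern.match(text, pos)
            if m is None:
                raise ValueError(f"unexpected input at offset {pos}")
            if m.lastgroup != "WS":      # whitespace is dropped, like a skip rule
                tokens.append((m.lastgroup, m.group()))
            pos = m.end()
        return tokens

    return tokenize


def token_types(tokens):
    """What a language-independent parser would consume: token types only."""
    return [kind for kind, _ in tokens]


english = build_lexer("en")
german = build_lexer("de")
```

Because both lexers emit the same token types, code that consumes the tokens does not need to know which language the input was written in.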

XML encoding of Attribute in KMIP

I'm analyzing KMIP in order to implement a prototype in Scala, so I am trying to understand all of its concepts and design an architecture that supports the different encoding profiles (bytes, JSON, XML).
Section 5.4.1.6 (XML Element Encoding) of the specification stipulates:
[...] structure values are encoded as nested xml elements, and non-structure
values are encoded using the ‘value’ attribute
With this example :
<ActivationDate type="DateTime" value="2001-01-01T10:00:00+10:00"/>
I don't understand this syntax, since Activation Date is an attribute. Section 2.1.1 (Attribute) describes an attribute as a structure containing Attribute Name, Attribute Index, and Attribute Value.
So I would expect the XML representation of ActivationDate, or of any other attribute, to be:
<Attribute>
<AttributeName type="TextString" value="Activation Date"/>
<AttributeValue type="DateTime" value="2001-01-01T10:00:00+10:00"/>
</Attribute>
Moreover, the KMIP test case uses this second representation.
If the first representation is given as an example in the specification, it must be used somewhere. In which cases does it apply?

The KMIP specification is very vague on this point. BOTH forms of Attribute you described are considered valid KMIP and should be handled.
I strongly recommend the KMIP Additional Message Encodings document when implementing the HTTP/JSON/XML encodings: https://docs.oasis-open.org/kmip/kmip-addtl-msg-enc/v1.0/os/kmip-addtl-msg-enc-v1.0-os.html
Section 6.1.6 describes yet another format that isn't covered in the main spec: <TTLV tag="0x420001" name="ActivationDate" type="DateTime" value="2001-01-01T10:00:00+10:00"/>
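A decoder that accepts both forms from the question could look like the following Python sketch. The read_attribute helper is hypothetical, and comparing names by stripping spaces ("Activation Date" versus ActivationDate) is an assumption for illustration rather than a rule taken from the specification. The TTLV form is not handled here.

```python
import xml.etree.ElementTree as ET


def read_attribute(xml_text):
    """Hypothetical helper: return (name, type, value) for either attribute form."""
    elem = ET.fromstring(xml_text)
    if elem.tag == "Attribute":
        # Structure form: nested AttributeName / AttributeValue elements.
        name = elem.find("AttributeName").get("value")
        value_elem = elem.find("AttributeValue")
    else:
        # Short form: the element name is the attribute name.
        name, value_elem = elem.tag, elem
    # Assumption: names are compared with spaces removed.
    return (name.replace(" ", ""), value_elem.get("type"), value_elem.get("value"))


short_form = '<ActivationDate type="DateTime" value="2001-01-01T10:00:00+10:00"/>'
structure_form = (
    "<Attribute>"
    '<AttributeName type="TextString" value="Activation Date"/>'
    '<AttributeValue type="DateTime" value="2001-01-01T10:00:00+10:00"/>'
    "</Attribute>"
)
```

Normalizing both inputs to one tuple keeps the rest of the code independent of which form a client sent.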

Does ANTLR4 NOT support ASTLabelType?

I'm using ANTLR4 to build an AST. I downloaded the g4 file from: https://github.com/antlr/grammars-v4/tree/master/sqlite
I added these options at the top of the g4 file:
options{
output=AST;
ASTLabelType=CommonTree;
language=Java;
}
but when compiling the g4 file, the tool outputs:
ANTLR Tool v4.6 (D:\antlr-4.6-complete.jar)
SQLite.g4 -o C:\Users\macro\workspace\tdsql\target\generated-sources\antlr4 -listener -no-visitor -encoding UTF-8
warning(83): SQLite.g4:34:4: unsupported option output
warning(83): SQLite.g4:35:4: unsupported option ASTLabelType
Does ANTLR4 not support using ASTLabelType to build an AST? And how can I build an AST with ANTLR4?

I'm an ANTLR newbie myself, so there are better-qualified people who can answer this. That said, the AST output option was removed between ANTLR 3 and ANTLR 4. ANTLR 3 can generate an AST, but ANTLR 4 generates a parse tree instead.
Your alternatives in Antlr4 are to use the Listener pattern (to walk the parse tree) or the Visitor pattern (to visit & evaluate nodes). Either - or both - of those can be used after running the Lexer and Parser.
There are a number of examples that can be found with some searching. Here's one for the Visitor pattern. This page compares Listeners and Visitors.
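To show the difference between the two styles, here is a small Python sketch that does not use the ANTLR runtime. Tree, walk, RuleCounter, and Evaluator are hypothetical stand-ins for the generated parse tree, the tree walker, a listener, and a visitor. With a listener, the walker drives the traversal and calls back on entering and leaving each node; with a visitor, your own code decides which children to visit and can return a value.

```python
class Tree:
    """Hypothetical stand-in for a parse tree node."""
    def __init__(self, rule, children=(), text=None):
        self.rule = rule
        self.children = list(children)
        self.text = text


def walk(listener, node):
    """Listener style: the walker traverses and fires enter/exit callbacks."""
    enter = getattr(listener, "enter_" + node.rule, None)
    if enter is not None:
        enter(node)
    for child in node.children:
        walk(listener, child)
    exit_ = getattr(listener, "exit_" + node.rule, None)
    if exit_ is not None:
        exit_(node)


class RuleCounter:
    """A listener: reacts to events and returns nothing."""
    def __init__(self):
        self.seen = []

    def enter_add(self, node):
        self.seen.append("add")

    def enter_num(self, node):
        self.seen.append("num")


class Evaluator:
    """A visitor: controls the traversal itself and returns a result per node."""
    def visit(self, node):
        return getattr(self, "visit_" + node.rule)(node)

    def visit_add(self, node):
        return sum(self.visit(child) for child in node.children)

    def visit_num(self, node):
        return int(node.text)


# Assumed sample tree for the input "2 + 3".
tree = Tree("add", [Tree("num", text="2"), Tree("num", text="3")])

counter = RuleCounter()
walk(counter, tree)
result = Evaluator().visit(tree)
```

Either style can be used to build your own AST classes from the parse tree if you need a more compact structure.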
