Ways of keeping ANTLR4 grammar target independent

I'm writing a grammar for the C++ target, but I'd like to keep it working with Java as well, since ANTLR comes with great tools that work for grammars with the Java target. The book ("The Definitive ANTLR 4 Reference") says that the way to achieve target independence is to use listeners and/or visitors. There is one problem, though: any predicate, local variable, custom constructor, custom token class, etc. that I might need introduces a target-language dependence that cannot be removed, at least according to the book. Since the book might be outdated, here are my questions:
Is there a way of declaring primitive variables in a language-independent way, something like:
item[$bool hasAttr]
:
type ( { $hasAttr }? attr | ) ID
;
where $bool would be translated to bool in C++ but to boolean in Java (a workaround would be to use int in that case, but that most likely wouldn't work for all potential targets)?
Is there a way of declaring certain code fragments to be for a specific target only, something like:
parser grammar testParser;

options
{
    tokenVocab=testLexer;
}

@header
<lang=Cpp>{
    #include "utils/helper.h"
}
<lang=Java>{
    import test.utils.THelper;
}

@members
<lang=Cpp>{
public:
    testParser(antlr4::TokenStream *input, utils::THelper *helper);

private:
    utils::THelper *Helper;

public:
}
<lang=Java>{
    public testParser(TokenStream input, THelper helper) {
        this(input);
        Helper = helper;
    }

    private THelper Helper;
}

start
    :
    (
        <lang=Cpp>{ Helper->OnUnitStart(this); }
        <lang=Java>{ Helper.OnUnitStart(this); }
        unit
        <lang=Cpp>{ _localctx = Helper->OnUnitEnd(this); }
        <lang=Java>{ _localctx = Helper.OnUnitEnd(this); }
    )*
    EOF
    ;
...
For the time being I'm keeping two separate grammars, changing the Java one and merging the changes into the C++ one once I'm happy with the results, but if possible I'd rather keep everything in one file.

This target dependency is a real nuisance, and I've been thinking for a while about how to get rid of it in a good way. I haven't found a fully usable solution yet.
What you can do is stay with syntax that both Java and C++ understand (e.g. write a predicate like a function call: a: { isValid() }? b c;) and implement such functions in a base class from which your parser derives (ANTLR allows you to specify such a base class via the grammar option superClass).
The C++ target also has a number of additional named actions which you can use for C++-specific code only.
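For illustration, here is a minimal sketch of that approach (testParserBase and isValid() are names I made up, not ANTLR built-ins). The grammar itself stays target neutral:

parser grammar testParser;

options
{
    tokenVocab = testLexer;
    superClass = testParserBase;
}

start
    : ( { isValid() }? unit )* EOF
    ;

Each target then supplies its own base class; a Java version might look like:

import org.antlr.v4.runtime.*;

public abstract class testParserBase extends Parser {
    public testParserBase(TokenStream input) { super(input); }

    // Target-specific logic lives here, outside the grammar.
    protected boolean isValid() { return true; }
}

The C++ target would provide an equivalent testParserBase class in a .h/.cpp pair, so the same .g4 file can be generated for both targets.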

Related

How to define a nested map in a header file for use in .cpp

I am trying to define a nested map variable in a header file to use for key-value lookup (or key, key, value lookup, since it is nested).
Apologies for being very new to C++ in general, let alone C++98.
I have intermediate JavaScript experience, which might explain difficulties/habits.
I'm trying to insert spoken language translations into a UI using a nested map, something with a structure similar to this:
phrases["english"]["hello"] = "hi";
phrases["spanish"]["hello"] = "hola";
which will allow me to use phrases[selectedLanguage]["hello"], which will return "hi" or "hola" depending on what selectedLanguage is set to.
This is so that a user can switch between languages while also allowing me to just change one translations.h file if/when needed.
I have a working version of the code which puts the map definitions in the .cpp code, but I'd like to create something like a header file which defines my 'phrases' map variable, so that I can separate the language translations from the rest of the .cpp code.
My current working code looks like this:
UI.cpp:
void CScnMgr::InitScreens(){
    // selectedLanguage is defined
    string selectedLanguage = "spanish";
    // phrases map is defined
    map <string, map <string, string> > phrases;
    phrases["english"]["hello"] = "hi";
    phrases["spanish"]["hello"] = "hola";
    // then later when I need to use either translation...
    phrases[selectedLanguage]["hello"];
}
This works, but I assume it is bad practice, because it creates this object every time the screens are initialized (and probably for other reasons I'm unfamiliar with). I want to put my phrases map into a header file instead.
This is giving me errors:
translations.h:
#include <string>
#include <map>
int main(){
    map <string, map <string, string> > newPhrases;
    map <string, string> spanish;
    map <string, string> english;
    spanish["hello"] = "hola";
    english["hello"] = "hi";
    newPhrases["spanish"] = spanish;
    newPhrases["english"] = english;
    return 0;
}
UI.cpp:
#include "translations.h"
void CScnMgr::InitScreens(){
    int extern newPhrases;
    // further down where I need to display to the UI...
    newPhrases[selectedLanguage]["hi"]
}
Errors:
UI.cpp: error: no match for 'operator[]' in 'newPhrases[selectedLanguage]'
I certainly don't understand why putting "int" in 'int extern newPhrases' compiles, but that's why it is there; I gave it the type of the main() return. I don't feel very comfortable doing that.
I've defined selectedLanguage as "english", so I would expect C++ to handle that as newPhrases["english"], but it seems like newPhrases isn't defined the way I expect after importing it from translations.h.
I'd appreciate a way to make this code work but I'd also appreciate reasons why this is the wrong way to go about this. Thanks in advance!
Let's try this step by step:
JavaScript to C++
That's quite a brave task :)
I guess you chose the hard path going that way; it would have been easier the other way round. Well... it is what it is. Just let me say: C++ feels very different from JavaScript. I strongly recommend doing one of the myriad tutorials and/or reading a good book about it. There are plenty!
File Structure
Generally speaking, there should never be definitions in header files, only declarations. If you want to know more about this, Google is your friend.
What you can do is have a declaration in the header file (using the keyword extern, or by putting it into a class) and a definition in a separate .cpp file. The linker will then find that definition and link everything together.
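A minimal sketch of that split for your map could look like this (C++98-friendly; the file and variable names follow your example):

translations.h:
#ifndef TRANSLATIONS_H
#define TRANSLATIONS_H
#include <map>
#include <string>
// Declaration only: tells every including .cpp that this object exists somewhere.
extern std::map<std::string, std::map<std::string, std::string> > phrases;
#endif

translations.cpp:
#include "translations.h"
namespace {
    // C++98 has no initializer lists for maps, so fill the map in a helper
    // that runs during static initialization.
    std::map<std::string, std::map<std::string, std::string> > makePhrases() {
        std::map<std::string, std::map<std::string, std::string> > m;
        m["english"]["hello"] = "hi";
        m["spanish"]["hello"] = "hola";
        return m;
    }
}
// The one and only definition; the linker resolves all extern references to it.
std::map<std::string, std::map<std::string, std::string> > phrases = makePhrases();

UI.cpp:
#include "translations.h"
void CScnMgr::InitScreens() {
    std::string selectedLanguage = "spanish";
    // The map now outlives any single function call.
    std::string greeting = phrases[selectedLanguage]["hello"]; // "hola"
}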
OOP
I strongly recommend familiarizing yourself with the OO concept. It will probably help you in the long run, and there might be more elegant solutions for your problem, but I won't go into detail here; see the other headings.
Analysis of your current code
This works, but I assume it is bad practice, because it creates this object every time the screens are initialized (and probably for other reasons I'm unfamiliar with). I want to put my phrases map into a header file instead.
The problem is that this object, as you have it now, lives on the stack and will soon be destroyed (overwritten) when you leave the function. So it won't work if you want to access phrases from a different function. You can read up on object lifetime, and on how scope is connected to lifetime, to understand this in depth.
This is giving me errors:
translations.h:
#include <string>
#include <map>
int main(){
    map <string, map <string, string> > newPhrases;
    map <string, string> spanish;
    map <string, string> english;
    spanish["hello"] = "hola";
    english["hello"] = "hi";
    newPhrases["spanish"] = spanish;
    newPhrases["english"] = english;
    return 0;
}
Best practice is to not implement your functions in header files, but only declare them there and implement them in .cpp files. For main(), you don't need a declaration at all; just put it in a .cpp file.
The other thing is that you are creating newPhrases on the stack of main(), so newPhrases also only lives while main() is running. That's probably not what you want.
UI.cpp:
#include "translations.h"
void CScnMgr::InitScreens(){
    int extern newPhrases;
    // further down where I need to display to the UI...
    newPhrases[selectedLanguage]["hi"]
}
Errors:
UI.cpp: error: no match for 'operator[]' in 'newPhrases[selectedLanguage]'
int extern newPhrases is just a declaration. It tells the compiler that there is something named newPhrases somewhere (but not here) and that it is of type int. What you actually want is to tell the compiler that this thing is of type map<string, map<string, string> >. Besides, extern declarations should not be inside functions. The error itself comes from your extern declaration: the compiler thinks that newPhrases is of type int, and something of type int doesn't have a square-bracket operator (operator[]). But even if you fixed that, it still would not work as written, so I won't go into the details of getting it running here; see the sketch under "File Structure" above, and the suggestions and links in the next section.
The general approach about localization / internationalization / multi language support
The mindset of wanting to abstract the language data and split it from your code is good. The question now is how to solve it. A central idiom in programming in general is "don't reinvent the wheel".
Basically, I consider your question a duplicate to this one:
C++, Multilanguage/Localisation support
Another very similar topic that I found:
Bests practices for localized texts in C++ cross-platform applications?
Yet another one:
How to support multiple language in a Linux C/C++ program?
If you want to stick to your approach, have a look at this one (performance):
C++ map<std::string> vs map<char *> performance (I know, "again?")
In the posts mentioned above there are many suggestions, including frameworks that can handle your problem pretty well. I also recommend doing your own research, because some of those questions are rather old and there might be new stuff already. Just ask your loyal friend Google.

Transpile an algorithm with user-suppliable callbacks into target language

I currently transpile the control flow modeled in an SCXML state-chart into an ANSI-C algorithm which calls a series of user-supplied callback functions in the correct order, effectively realizing the control flow from the state-chart in ANSI-C. Seeing that more target languages may eventually follow, I was thinking about transpiling to Haxe as a quasi-canonical form and using its transpilation capabilities to target other languages.
Seeing that Haxe is inherently object-oriented, I guess the best way would be to generate an abstract base class with the transpiled algorithm, which would then be extended with implementations of the callbacks.
However, looking at Haxe, it seems that this is a rather unorthodox usage, and I am at a loss how best to approach it. I cannot find native callbacks in the target-language-agnostic part of Haxe, so I guess it boils down to target-language-specific approaches anyway?
Update: I want to invoke user-supplied callbacks in the target language. The state-chart here merely formalizes a certain control flow. There is no XML parsing involved in Haxe at all; I already parse the XML, process it, and generate ANSI-C which accepts user-supplied callbacks. Now I want to take a detour via Haxe to generate any target language; still, the user-supplied callbacks and all the "scaffolding" are in the target language.
If you only need one listener per callback, then just use a function per callback; I prefer not to anticipate what data the listener needs.
Run js example here: https://try.haxe.org/#2a6d2
code below for completeness.
class Test {
    static function main() { new Test(); }

    var testing: WithCallback;

    public function new() {
        testing = new WithCallback( output );
        testing.start();
    }

    public function output() {
        trace( 'test ' + testing.val );
    }
}

class WithCallback {
    public var cb: Void->Void;
    public var val: String;

    public function new( cb_: Void->Void ) {
        cb = cb_;
    }

    public function start() {
        for ( i in 0...100 ) {
            val = 'callback counting ' + Std.string( i );
            if ( cb != null ) cb();
        }
    }
}
If you want multiple objects to listen, then you could look at some Signal-type library; I believe Tink (a macro library) provides one, but I have not tried it.
https://github.com/haxetink/tink_core/blob/master/src/tink/core/Signal.hx
There must be a few signal libraries around.
https://code.google.com/archive/p/hxs/
https://github.com/massiveinteractive/msignal
You may also want to look at how JSON can be auto-parsed in Haxe with abstracts and typedefs, using features like @:to and @:from.
https://haxe.org/manual/std-Json-parsing.html
So, for instance, a nice way to parse some JSON with a time field in it is to parse to a typedef whose time field is an abstract around a string, and to add a method within the abstract so you can get the typed value out of it.
https://haxe.org/manual/types-abstract-implicit-casts.html
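For illustration, a tiny made-up example of that pattern (the "HH:MM" format and the conversion are just for demonstration):

// A string-backed abstract that knows how to turn itself into a typed value.
abstract TimeString(String) from String {
    // Convert "HH:MM" to minutes since midnight via an implicit @:to cast.
    @:to public function toMinutes(): Int {
        var parts = this.split(":");
        return Std.parseInt(parts[0]) * 60 + Std.parseInt(parts[1]);
    }
}

typedef Entry = { time: TimeString }

class Main {
    static function main() {
        var e: Entry = haxe.Json.parse('{"time":"10:30"}');
        var mins: Int = e.time; // the @:to cast fires here
        trace(mins); // 630
    }
}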
I think others have worked on a similar approach for XML parsing, but if you look into the internals of haxe.Json.parse, I am sure you could create a similar approach for XML or binary feeds (not sure if franco's streams stuff is relevant). There is also an approach that gets Haxe to generate the typedef code for JSON parsing based on a sample, using really smart macros, but I guess it would be very hard to get Haxe macros to construct a parser based on a data example.
Also, you should have a look at the format library; it has many examples of parsing data.
https://github.com/HaxeFoundation/format

checker framework: Supress Warnings in default constructor

I have two constructors: a normal ctor which initialises the object properly, and a second default ctor for Hibernate which generates "fields not initialized" warnings. What's the preferred way to get rid of the warnings?
package test;

public class Example {
    String x;

    public Example(String x) {
        this.x = x;
    }

    Example() {
        // Ctor for Hibernate, warnings generated here.
    }
}
You didn't mention looking in the documentation, so I'm not sure whether you have done so. The Checker Framework manual contains a chapter titled "Suppressing warnings", which might contain all the information you need.
The most common approach is to write a @SuppressWarnings annotation, which is the standard way to suppress warnings from the Java compiler.
You should write it on the smallest program element possible (such as a local variable declaration rather than the whole constructor or class), and you should supply the most specific key possible. The reason is to avoid accidentally suppressing more warnings than you intended.
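Applied to the example above, that could look like the following; the key shown is my guess at the Initialization Checker's message key, so use whichever key the actual warning reports:

package test;

public class Example {
    String x;

    public Example(String x) {
        this.x = x;
    }

    // Suppresses only the field-initialization warning, and only on this ctor.
    @SuppressWarnings("initialization.fields.uninitialized")
    Example() {
        // Ctor for Hibernate; fields are populated reflectively afterwards.
    }
}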

Do we need this keyword in .net 4.0 or 4.5

I am currently reviewing code written in C#, Visual Studio 2012.
In a lot of places, the code is written using the this keyword, for example:
this.pnlPhoneBasicDtls.Visible = true;
this.SetPhAggStats(oStats);
There are many other places where the controls of the page are referred to using the this keyword.
Can somebody advise whether we really need to use this here?
Are there any consequences of removing the this keyword?
Thanks in advance.
No, "this" is optional. It's usually included in code generated by a tool and by people who feel the need to be explicit or who want to differentiate it from an argument to the method.
It's optional; you can use the property directly, like pnlPhoneBasicDtls.Visible = true;
The this keyword is usually optional.
It's sometimes used to disambiguate fields from arguments if the same name is being used for both, for example:
void Main()
{
    var sc = new SomeClass();
    sc.SomeMethod(123);
    Console.WriteLine(sc.thing);
}

public class SomeClass
{
    public int thing;

    public void SomeMethod(int thing)
    {
        this.thing = thing + 1;
    }
}
In the example above it does make a difference. Inside SomeMethod, this.thing refers to the field and thing refers to the argument.
(Note that the simpler assignment thing = thing is flagged by the compiler with a warning, since assigning a variable to itself is a no-op.)
Of course, if you use ReSharper then any unnecessary this. (together with unused using statements, unreachable code, etc.) will be greyed out and you can remove them very quickly. The same is probably true of similar tools like CodeRush.

ServiceStack: RESTful Resource Versioning

I've read the Advantages of message based web services article and am wondering if there is a recommended style/practice for versioning RESTful resources in ServiceStack. The different versions could render different responses or have different input parameters in the request DTO.
I'm leaning toward URL-type versioning (e.g. /v1/movies/{Id}), but I have seen other practices that set the version in the HTTP headers (e.g. Content-Type: application/vnd.company.myapp-v2).
I'm hoping for a way that works with the metadata page, but it's not so much a requirement, as I've noticed that simply using folder structure/namespacing works fine when rendering routes.
For example (this doesn't render right on the metadata page, but performs properly if you know the direct route/URL):
/v1/movies/{id}
/v1.1/movies/{id}
Code
namespace Samples.Movies.Operations.v1_1
{
    [Route("/v1.1/Movies", "GET")]
    public class Movies
    {
        ...
    }
}

namespace Samples.Movies.Operations.v1
{
    [Route("/v1/Movies", "GET")]
    public class Movies
    {
        ...
    }
}
and corresponding services...
public class MovieService: ServiceBase<Samples.Movies.Operations.v1.Movies>
{
    protected override object Run(Samples.Movies.Operations.v1.Movies request)
    {
        ...
    }
}

public class MovieService: ServiceBase<Samples.Movies.Operations.v1_1.Movies>
{
    protected override object Run(Samples.Movies.Operations.v1_1.Movies request)
    {
        ...
    }
}
Try to evolve (not re-implement) existing services
For versioning, you are going to be in for a world of hurt if you try to maintain different static types for different version endpoints. We initially started down this route, but as soon as you start to support your first version, the development effort to maintain multiple versions of the same service explodes: you will need to maintain manual mappings between the different types, which easily leaks into having to maintain multiple parallel implementations, each coupled to a different version's type - a massive violation of DRY. This is less of an issue for dynamic languages, where the same models can easily be re-used by different versions.
Take advantage of built-in versioning in serializers
My recommendation is not to version explicitly, but to take advantage of the versioning capabilities inside the serialization formats.
E.g. you generally don't need to worry about versioning with JSON clients, as the versioning capabilities of the JSON and JSV serializers are much more resilient.
Enhance your existing services defensively
With XML and DataContracts you can freely add and remove fields without making a breaking change. If you add IExtensibleDataObject to your response DTOs, you also have the potential to access data that's not defined on the DTO. My approach to versioning is to program defensively so as not to introduce a breaking change; you can verify this is the case with integration tests using old DTOs. Here are some tips I follow:
Never change the type of an existing property - if you need it to be a different type, add another property and use the old/existing one to determine the version.
Program defensively: realize which properties don't exist for older clients, and don't make them mandatory.
Keep a single global namespace (only relevant for XML/SOAP endpoints)
I do this by using an assembly attribute in the AssemblyInfo.cs of each of your DTO projects:
[assembly: ContractNamespace("http://schemas.servicestack.net/types",
ClrNamespace = "MyServiceModel.DtoTypes")]
The assembly attribute saves you from manually specifying explicit namespaces on each DTO, i.e:
namespace MyServiceModel.DtoTypes
{
    [DataContract(Namespace="http://schemas.servicestack.net/types")]
    public class Foo { .. }
}
If you want to use a different XML namespace than the default above you need to register it with:
SetConfig(new EndpointHostConfig {
WsdlServiceNamespace = "http://schemas.my.org/types"
});
Embedding Versioning in DTOs
Most of the time, if you program defensively and evolve your services gracefully, you won't need to know exactly what version a specific client is using, as you can infer it from the data that is populated. But in the rare cases where your service needs to tweak its behavior based on the specific version of the client, you can embed version information in your DTOs.
With the first release of your DTOs, you can happily create them without any thought of versioning.
class Foo {
    string Name;
}
But maybe for some reason the form/UI was changed, and you no longer want the client to use the ambiguous Name variable; you also want to track the specific version the client is using:
class Foo {
    Foo() {
        Version = 1;
    }
    int Version;
    string Name;
    string DisplayName;
    int Age;
}
Later it was discussed in a team meeting that DisplayName wasn't good enough, and you should split it out into different fields:
class Foo {
    Foo() {
        Version = 2;
    }
    int Version;
    string Name;
    string DisplayName;
    string FirstName;
    string LastName;
    DateTime? DateOfBirth;
}
So the current state is that you have 3 different client versions out there, with existing calls that look like:
v1 Release:
client.Post(new Foo { Name = "Foo Bar" });
v2 Release:
client.Post(new Foo { Name="Bar", DisplayName="Foo Bar", Age=18 });
v3 Release:
client.Post(new Foo { FirstName = "Foo", LastName = "Bar",
DateOfBirth = new DateTime(1994, 01, 01) });
You can continue to handle these different versions in the same implementation (which will be using the latest v3 version of the DTOs), e.g.:
class FooService : Service {
    public object Post(Foo request) {
        //v1:
        request.Version == 0
        request.Name == "Foo"
        request.DisplayName == null
        request.Age == 0
        request.DateOfBirth == null

        //v2:
        request.Version == 2
        request.Name == null
        request.DisplayName == "Foo Bar"
        request.Age == 18
        request.DateOfBirth == null

        //v3:
        request.Version == 3
        request.Name == null
        request.DisplayName == null
        request.FirstName == "Foo"
        request.LastName == "Bar"
        request.Age == 0
        request.DateOfBirth == new DateTime(1994, 01, 01)
    }
}
Framing the Problem
The API is the part of your system that exposes its expression. It defines the concepts and the semantics of communicating in your domain. The problem comes when you want to change what can be expressed or how it can be expressed.
There can be differences in both the method of expression and what is being expressed. The first problem tends to be differences in tokens (first and last name instead of name). The second problem is expressing different things (the ability to rename oneself).
A long-term versioning solution will need to solve both of these challenges.
Evolving an API
Evolving a service by changing the resource types is a form of implicit versioning. It uses the construction of the object to determine behavior. It works best when there are only minor changes to the method of expression (like the names). It does not work well for more complex changes to the method of expression, or for changes to the expressiveness itself. Code tends to be scattered throughout.
Specific Versioning
When changes become more complex, it is important to keep the logic for each version separate. Even in mythz's example, he segregated the code for each version; however, the code is still mixed together in the same methods. It is very easy for the code for the different versions to start collapsing into each other, and it is likely to spread out. Getting rid of support for a previous version can be difficult.
Additionally, you will need to keep your old code in sync with any changes in its dependencies. If a database changes, the code supporting the old model will also need to change.
A Better Way
The best way I've found is to tackle the expression problem directly. Each time a new version of the API is released, the previous version is implemented on top of the new one. This is generally easy because the changes are small.
It really shines in two ways: first, all the code to handle the mapping is in one spot, so it is easy to understand or remove later; and second, it doesn't require maintenance as new APIs are developed (the Russian-doll model).
The problem arises when the new API is less expressive than the old API. This is a problem that needs to be solved no matter how you keep the old version around; this style just makes it clear that there is a problem, and what the solution to that problem is.
mythz's example rewritten in this style is:
namespace APIv3 {
    class FooService : RestServiceBase<Foo> {
        public object OnPost(Foo request) {
            var data = repository.getData();
            request.FirstName = data.firstName;
            request.LastName = data.lastName;
            request.DateOfBirth = data.dateOfBirth;
        }
    }
}
namespace APIv2 {
    class FooService : RestServiceBase<Foo> {
        public object OnPost(Foo request) {
            var v3Request = APIv3.FooService.OnPost(request);
            request.DisplayName = v3Request.FirstName + " " + v3Request.LastName;
            request.Age = (new DateTime() - v3Request.DateOfBirth).years;
        }
    }
}
namespace APIv1 {
    class FooService : RestServiceBase<Foo> {
        public object OnPost(Foo request) {
            var v2Request = APIv2.FooService.OnPost(request);
            request.Name = v2Request.DisplayName;
        }
    }
}
Each exposed object is clear. The same mapping code still needs to be written in both styles, but in the separated style only the mapping relevant to a type needs to be written. There is no need to explicitly write mapping code that doesn't apply (which is just another potential source of error). The dependency on previous APIs stays static as you add future APIs or change what the API layer depends on. For example, if the data source changes, then only the most recent API (version 3) needs to change in this style; in the combined style, you would need to code the changes for each of the supported APIs.
One concern in the comments was the addition of types to the code base. This is not a problem, because these types are exposed externally. Providing the types explicitly in the code base makes them easy to discover and to isolate in testing. It is much better for maintainability to be clear. Another benefit is that this method does not produce additional logic; it only adds additional types.
I am also trying to come up with a solution for this and was thinking of doing something like the below. (This is based on a lot of Googling and Stack Overflow querying, so it is built on the shoulders of many others.)
First up, I don't want to debate whether the version should be in the URI or in a request header. There are pros/cons to both approaches, so I think each of us needs to use whatever meets our requirements best.
This is about how to design/architect the Java message objects and the resource implementation classes.
So let's get to it. I would approach this in two steps: minor changes (e.g. 1.0 to 1.1) and major changes (e.g. 1.1 to 2.0).
Approach for minor changes
So let's say we go with the same example classes used by @mythz.
Initially we have
class Foo { string Name; }
We provide access to this resource as /V1.0/fooresource/{id}
In my use case I use JAX-RS:
@Path("/{versionid}/fooresource")
public class FooResource {
    @GET
    @Path("/{id}")
    public Foo getFoo(@PathParam("versionid") String versionid,
                      @PathParam("id") String fooId) {
        Foo foo = new Foo();
        // setters, load data from persistence, handle business logic, etc.
        return foo;
    }
}
Now let's say we add two additional properties to Foo:
class Foo {
    string Name;
    string DisplayName;
    int Age;
}
What I do at this point is annotate the properties with a @Version annotation:
class Foo {
    @Version("V1.0") string Name;
    @Version("V1.1") string DisplayName;
    @Version("V1.1") int Age;
}
Then I have a response filter that, based on the requested version, returns to the user only the properties that match that version. Note that, for convenience, if a property should be returned for all versions, you just don't annotate it and the filter will return it irrespective of the requested version.
This is sort of like a mediation layer. What I have explained is a simplistic version, and it can get very complicated, but I hope you get the idea.
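For what it's worth, here is a rough sketch of the kind of filter I mean, assuming JAX-RS 2.x (the @Version annotation is the custom one from above, redefined here so the sketch is self-contained; comparing versions with compareTo only works because of the VX.Y naming scheme):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerResponseContext;
import javax.ws.rs.container.ContainerResponseFilter;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Version { String value(); }

// Register via @Provider or programmatically with your JAX-RS runtime.
public class VersionFilter implements ContainerResponseFilter {
    public void filter(ContainerRequestContext req, ContainerResponseContext res) {
        // The first path segment carries the version, e.g. "V1.1".
        String requested = req.getUriInfo().getPathSegments().get(0).getPath();
        Object entity = res.getEntity();
        if (entity == null) return;
        for (Field f : entity.getClass().getDeclaredFields()) {
            Version v = f.getAnnotation(Version.class);
            // Blank out reference-typed fields introduced after the requested
            // version; unannotated fields are returned for every version.
            if (v != null && v.value().compareTo(requested) > 0
                    && !f.getType().isPrimitive()) {
                f.setAccessible(true);
                try { f.set(entity, null); } catch (IllegalAccessException e) { /* log */ }
            }
        }
    }
}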
Approach for Major Version
Now this can get quite complicated when a lot of changes are made from one version to another. That is when we need to move to the second option.
Option 2 is essentially to branch off the codebase, make the changes on that branch, and host both versions in different contexts. At this point we might have to refactor the codebase a bit to remove the version-mediation complexity introduced in the first approach (i.e. make the code cleaner); this would mainly be in the filters.
Note that this is just what I am thinking and I haven't implemented it yet; I wonder if it is a good idea.
Also, I was wondering if there are good mediation engines/ESBs that could do this type of transformation without having to use filters, but I haven't seen anything as simple as using a filter. Maybe I haven't searched enough.
Interested in knowing thoughts of others and if this solution will address the original question.
