xtext inferrer: multiple entities - dsl

I am very new to Xtext/Xtend, therefore apologies in advance if the answer is obvious.
I would like to let the end users of my DSL define a 'filter': when a filter is applied and returns true, the given data entity is filtered out of consideration.
I want to give them two ways of defining the filter:
A) by introspecting the attributes of a given data object and applying basic rules like
if (obj.field1 < CURRENT_DATE && obj.field2 == "EXPIRED")
{ return true; } else { return false; }
B) by executing a controlled snippet using 'eval' of my host language.
In other words, the user would be expected to type a valid code snippet of the host language
into a string/code block.
I decided that the easiest way for me to support case A) would be to leverage the Xbase rules (including expressions, etc.).
Therefore I defined filters (mostly copying the ideas from Lorenzo's book):
Filter:
(FilterDSL | FilterCode);
FilterDSL:
'filterDSL' (type=JvmTypeReference)? name=ID
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
body=XBlockExpression ;
FilterCode:
'filterCode' (type=JvmTypeReference)? name=ID
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
'{'
body=STRING
'}';
Now, when trying to implement the Java mapping for my DSL via the inferrer stub in Xtend, I am running into multiple problems.
All of them likely indicate that I am missing some fundamental understanding.
Problem 1) fl.body is not defined. fl is of type Filter, not FilterDSL or FilterCode,
and I do not understand how to check which type a given instance has, so that I can access the content of its 'body' feature.
Problem 2) I do not understand where the 'body' attribute in the inferrer method is defined, and why. Is this part of Ecore? (I could not find it.)
Problem 3) What is the proper way to allow a user to specify a code block? STRING does not seem to be the right thing, as it does not allow multiline content.
Problem 4) How do I correctly convert a code block into something that is accepted by 'body', such that it ends up in the generated code?
Problem 5) How do I set up multiple inferrers? I have more than one thing for which I need the code generated (mostly) by the Xbase code generator.
I appreciate in advance any suggestions or pointers to code examples solving similar problems.
As a side observation, the inferrer and its interplay with Xbase has so far been the most confusing and difficult thing to understand.

In general: have a look at the Xtend docs at xtend-lang.org.
1) You can use if (x instanceof Type) or a switch statement with type guards (see the domain model example).
2) I don't get that question. Both your FilterDSL and FilterCode EClasses should have a field plus getter/setter named body: FilterCode of type String, FilterDSL of type XBlockExpression. The JvmTypesBuilder adds extension methods to JvmOperation called setBody(String) and setBody(XExpression); syntactic sugar lets you write body = ... instead of setBody(...).
(By the way, you can Ctrl+click to find out where a thing is defined.)
3) Strings are actually multiline.
4) This is answered by (2).
5) You don't need multiple inferrers. You can infer multiple things, e.g. by calling toClass or toField multiple times for the same input.

Related

Pass Dynamic values to the rules in ANTLR4 grammar

I am a newbie to ANTLR4.
I want to write a grammar that parses its input using values that are read dynamically.
Say my grammar is as shown in the image.
I need help so that HANDLERID accepts not only the values mentioned, but also a list of dynamic values returned by a function call. For example, a function returns a list containing {'ACD', 'GHY', 'XYZ' ..}. These should not be confused with identifiers: the values are names of a defined set of objects, so writing a grammar rule for IDENTIFIER is not a solution.
Any help is appreciated.
Maybe actions are a viable solution? They are written in the target language and allow all kinds of processing. Formulated as a predicate (by appending a ? to the action block), they can even be used to guide the parser on which path to take.
Here's a typical form:
decl: type ID ';' { System.out.println("found a decl"); };
or as a predicate:
HANDLERID: ID { isSpecialWord($ID.text) }?;
This is matched only for IDs for which your own function isSpecialWord returns true. So essentially, you are not passing values to the lexer rule; you do the evaluation in your own code.

How to read visual studio code intellisense syntax hint, any document for operators?

VS Code, like Visual Studio, shows syntax/signature hints. I understand that : introduces a data type:
myText: string // : means the data type of myText is string
myStuff: any // any means it can be any data type
Sometimes it is hard to guess what the operators mean, for example in the hint for Node's request().
My understanding is:
const request means I can assign the result to any variable, like const x=request(...) or var x=request(...).
request.RequestAPI means it's an API call.
options: defines that this parameter is a typical options object of the form {...}.
(request.UriOptions & request.CoreOptions) I understand the beginning and end parts, they must be enums of Uri and Core, but what is &? Does it mean I need to supply both Uri AND Core?
| Does this pipe mean OR? If it does, then it duplicates the part before the pipe.
callback?: request.RequestCallback, so here I must provide a callback which will be typed (or functioning) as RequestCallback, but what is ?:?
Is there any document for these conventions?
I wanted to comment, because I don't know the complete answer, but here is some helpful information:
You are probably seeing this definition of DefinitelyTyped: https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/request/index.d.ts#L29
Have a look at this to understand the definition file syntax: http://www.typescriptlang.org/docs/handbook/declaration-files/by-example.html
And you can interpret the definition like this:
const request: there is a constant named request that implements the interface request.RequestAPI. It is also directly callable, and when called it takes an argument options of type (request.UriOptions & request.CoreOptions) | (request.UrlOptions & request.CoreOptions) and an optional parameter callback (hence the ?) of type request.RequestCallback. The function returns a request.Request.
& means "and" (an intersection type: the value must satisfy both types).
A pipe | means "or" (a union type). There is no duplication: one alternative uses URI, the other URL.
=> means "returns".
You see request in front of everything because it's the namespace (my wording may be off here).
The definitions of UriOptions, UrlOptions, CoreOptions are buried a bit. I'm not a node user, so I don't know what you can pass to request.
For example UrlOptions can either be a string argument named "url" or a url (from require('url')). See https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/request/index.d.ts#L162
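To make the notation concrete, here is a minimal TypeScript sketch of a function declared with the same kinds of operators. The names fetchThing, UriPart, UrlPart, CorePart and Callback are hypothetical stand-ins chosen for illustration; they are not the real definitions from the request typings.

```typescript
// Hypothetical stand-ins for request.UriOptions, request.UrlOptions and request.CoreOptions.
interface UriPart { uri: string }
interface UrlPart { url: string }
interface CorePart { method?: string }   // "?" marks an optional property

type Callback = (error: Error | null, body: string) => void;

// "&" is an intersection: the value must satisfy both types at once.
// "|" is a union: either alternative is accepted.
// "callback?:" declares an optional parameter of type Callback.
function fetchThing(
  options: (UriPart & CorePart) | (UrlPart & CorePart),
  callback?: Callback
): string {
  // Narrow the union by checking which property is present.
  const target = "uri" in options ? options.uri : options.url;
  const summary = `${options.method || "GET"} ${target}`;
  if (callback) {
    callback(null, summary);
  }
  return summary;
}

// Callback omitted: allowed because the parameter is optional.
fetchThing({ uri: "http://example.com" });

// Callback supplied, using the url alternative of the union.
fetchThing({ url: "http://example.com", method: "POST" }, (_error, body) => console.log(body));
```

Passing an object with neither uri nor url is rejected by the compiler, because it matches neither side of the union.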

How can I pass a structure down a tree (i.e. an inherited attribute) when using Visitor pattern?

I'm using the C++ version of ANTLR4 to develop a DSL for a music product. I used to (30 years ago!) do this kind of thing by hand so it's mostly a pleasure to have something like ANTLR, particularly now that I don't have to insert code in the actual grammar definition itself.
I want to do type checking of actual vs formal args in a function call. In the grammar segment below, the 'actualParameter' can return the type of the expression. However, the 'actualParameterList' needs to return an array (say) of these types so that the code for functionCall can compare to the formal parameter list.
If I was handwriting this, the calls to visit or visitChildren would take an extra parameter after context such that I could create a new array at the appropriate place and then have child nodes fill in the details.
I suppose that instead of just calling visitChildren inside the 'visitActualParameterList' I could create the array there and manually call each child rather than just a simple visitChildren but that feels like a hack, and it becomes very sensitive to minor changes in the grammar.
Is there a better approach?
functionCall: Identifier LeftParen actualParameterList? RightParen
;
actualParameterList:
actualParameter anotherActualParameter
;
actualParameter:
expression
;
anotherActualParameter:
Comma actualParameter anotherActualParameter
|
;
You're on the right path. I would suggest something like:
functionCall: Identifier LPAREN actualParameterList RPAREN
;
actualParameterList:
actualParameter (',' actualParameter)*
;
actualParameter:
expression
;
LPAREN : '(';
RPAREN : ')';
Using this, in the visitor for actualParameterList you can check each child to see whether it is of type ActualParameterContext and, if so, explicitly call Visit on that child, which gets you into your expression evaluation code (presumably handled in the visitor for actualParameter). This removes the need, as you say, to just generically visit the children. It is very precise when you can check the type like this.
Here's an example of this pattern from my own code (in C# but surely you'll see the pattern in action):
for (int c = 0; c < context.ChildCount; c++)
{
    if (context.GetChild(c) is SystemParser.ServerContext) // make sure correct type
    {
        string serverinfo = Visit(context.GetChild(c)); // visit the specific child and save return value, string in this case
        sb.Append(serverinfo); // use result to fill array or do whatever
    }
}
Now that you can see the pattern, back to your code. The syntax:
actualParameter (',' actualParameter)*
means that a parameter list has one actualParameter followed by zero or more additional ones, expressed with the * operator. I just threw the comma in there for visual clarity.
As you suggest, Visitor is the perfect pattern for this because you can explicitly visit any node you need to. It won't give you an array by itself, but you can fill an array or any other structure with the results of visiting the children, as you saw in the snippet from my code. My Visitor returns strings, and I just appended them to a StringBuilder. You can use the same pattern to build whatever you need.
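The same idea, collecting one result per actualParameter child into an array, can be sketched independently of ANTLR. The sketch below is in TypeScript with hand-written node classes. ParamNode, CommaNode, ParamListNode, TypeVisitor and argsMatch are hypothetical names, not ANTLR-generated classes, and the type of each expression is reduced to a string label that is assumed to be known already.

```typescript
// Hypothetical, hand-written tree nodes standing in for ANTLR's generated contexts.
interface TreeNode { children: TreeNode[] }

class CommaNode implements TreeNode {
  children: TreeNode[] = [];
}

class ParamNode implements TreeNode {
  children: TreeNode[] = [];
  constructor(public exprType: string) {}   // assumed: expression type already computed
}

class ParamListNode implements TreeNode {
  constructor(public children: TreeNode[]) {}
}

class TypeVisitor {
  // Visiting a single parameter yields the type of its expression.
  visitParam(node: ParamNode): string {
    return node.exprType;
  }

  // Visiting the list creates the array here and fills it from the matching children only.
  visitParamList(node: ParamListNode): string[] {
    const types: string[] = [];
    for (const child of node.children) {
      if (child instanceof ParamNode) {      // skip commas and any other children
        types.push(this.visitParam(child));
      }
    }
    return types;
  }
}

// A function call can then compare the actual types against the formal ones.
function argsMatch(formal: string[], actual: string[]): boolean {
  return formal.length === actual.length && formal.every((t, i) => t === actual[i]);
}

const list = new ParamListNode([new ParamNode("int"), new CommaNode(), new ParamNode("float")]);
const actualTypes = new TypeVisitor().visitParamList(list);
console.log(argsMatch(["int", "float"], actualTypes));
```

Because the loop filters on the node type rather than on child positions, adding or removing separator tokens in the grammar does not change the visitor.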

what does getType do in antlr4?

This question is with reference to the Cymbol code from the book (~ page 143) :
int t = ctx.type().start.getType(); // in DefPhase.enterFunctionDecl()
Symbol.Type type = CheckSymbols.getType(t);
What does each component return: "ctx.type()", "start", "getType()" ? The book does not contain any explanation about these names.
I can "kind of" understand that "ctx.type()" refers to the "type" rule, and "getType()" returns the number associated with it. But what exactly does the "start" do?
Also, to generalize this question: what is the mechanism to get the value/structure returned by a rule - especially in the context of usage in a listener?
I can see that for an ID, it is:
String name = ctx.ID().getText();
And as in above, for an enumeration of keywords it is via "start.getType()". Any other special kinds of access that I should be aware of?
Let's take the problem apart step by step. Obviously, ctx is an instance of CymbolParser.FunctionDeclContext. On pages 98-99 you can see how the grammar and the parse tree are implemented (at least to get a feeling for it; for the real implementation please see the .g4 file).
Take a look at the figure of the AST on page 99: you can see that the node FunctionDeclContext has several children, one of them labeled type. Intuitively, it corresponds to the function's return type. This is the node you retrieve when calling CymbolParser.FunctionDeclContext::type. The return type is probably something like TypeContext.
Note that methods without 'get' at the beginning are usually child getters, e.g. you can access the block by calling CymbolParser.FunctionDeclContext::block.
So now you have the type context of the function declaration you were passed. On any context you can use start or stop to get the first or last Token that defines the context. Simply put, start gets you "the first word". In this case, the first Token is of course the function's return type itself, e.g. int.
And the last call, Token::getType, returns the integer token type of that Token.
You can find more information on the API reference pages for Context and Token. But the best way to understand the behavior is to read through the generated ANTLR classes such as <GrammarName>Parser. And to be complete, I attach a link to the book.

should it be allowed to change the method signature in a non statically typed language

Hypothetical and academic question.
pseudo-code:
class Book {
    read(theReader)
}
class BookWithMemory extends Book {
    read(theReader, aTimestamp = null)
}
Assuming:
an interface (if supported) would prohibit it
default value for parameters are supported
Notes:
PHP triggers a strict standards error for this.
I'm not surprised that PHP strict mode complains about such an override. It's very easy for a similar situation to arise unintentionally, in which part of a class hierarchy was edited to use a new signature and one or a few classes have fallen out of sync.
To avoid the ambiguity, name the new method something different (for this example, maybe readAt?), and override read to call readAt in the new class. This makes the intent plain to the interpreter as well as anyone reading the code.
The actual behavior in such a case is language-dependent -- more specifically, it depends on how much of the signature makes up the method selector, and how parameters are passed.
If the name alone is the selector (as in PHP or Perl), then it's down to how the language handles mismatched method parameter lists. If default arguments are processed at the call site based on the static type of the receiver instead of at the callee's entry point, when called through a base class reference you'd end up with an undefined argument value instead of your specified default, similarly to what would happen if there was no default specified.
If the number of parameters (with or without their types) is part of the method selector (as in Erlang or E), as is common in dynamic languages that run on the JVM or CLR, you have two different methods. Create a new overload taking the additional arguments, and override the base method with one that calls the new overload with default argument values.
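The renaming approach suggested above (a separately named method plus an overriding read that delegates to it) can be sketched as follows. TypeScript is used purely for illustration, even though the question concerns dynamically typed languages. The readAt name comes from the suggestion above, while the string reader argument and the returned messages are hypothetical.

```typescript
class Book {
  // Base signature: a single argument.
  read(reader: string): string {
    return `${reader} reads the book`;
  }
}

class BookWithMemory extends Book {
  // The override keeps the base signature and delegates with a default value.
  read(reader: string): string {
    return this.readAt(reader, null);
  }

  // The extended behavior lives under a new name, so the intent is explicit.
  readAt(reader: string, timestamp: number | null): string {
    const base = super.read(reader);   // calls Book.read, so there is no recursion
    return timestamp === null ? base : `${base} at ${timestamp}`;
  }
}

// Callers that only know Book keep working through the unchanged signature.
const book: Book = new BookWithMemory();
console.log(book.read("Ann"));

// Callers that know the subclass can pass the extra argument explicitly.
console.log(new BookWithMemory().readAt("Ann", 42));
```

With this layout the signature of read never changes across the hierarchy, so there is nothing for a strict mode to complain about.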
If I am reading the question correctly, it seems very language specific (it does not apply to all dynamic languages). I know you can do this in Ruby:
class Book
  def read(book)
    puts book
  end
end
class BookWithMemory < Book
  def read(book, aTimeStamp = nil)
    super book
    puts aTimeStamp
  end
end
I am not sure about dynamic languages besides Ruby. This also seems like a pretty subjective question, as at least two languages were designed on either side of the issue (method overloading vs. not: Ruby vs. PHP).

Resources