ANTLR 4 : How to know the existence of subpart in rule - antlr4

I have this code :
varDeclaration
: type ID ('=' expression)? ';'
;
So, not always ('=' expression) exist. But, sometimes, I want to process this part, but don't know it exist or not in this context. I'm using ANTLR 4 (and often using Listener), how can I know this.
Thanks :)

In your listener (exitVarDeclaration) or visitor (visitVarDeclaration) check whether ctx.expression() == null. If null, then ('=' expression) didn't exist. If non-null, then it did exist.

Related

How to handle Optional Grammar Blocks with a ANTLR-Visitor?

It is possible that this question has been asked before but i cannot find it. So if you guys find something similar, please let me know.
According to the following Rule:
fix_body : ident binders (annotation)? (':' term)? ':=' fix_body_term;
I have an optional annotation and an optional Term. The corresponding visitorRule looks like this :
public FixBody visitFix_body(coqParser.Fix_bodyContext ctx)
My Question is how do i find out, if there was a term or not?
There is a method for reaching the term by using ctx.term(), but when there is no term given, does this method return null? Or is there a completly different way to approach this? As i am working with a large grammer it will take me a while to test this, otherwise I would have done that.
There is no trap there ...
If the term is optional, you just have to test it before calling the accept(visitor) method
In your case
if(ctx.term() != null) ctx.term().accept(new TermVisitor())
Example:
Optional in a grammar
Testing before accepting the visitor

How can I pass a structure down a tree (i.e. an inherited attribute) when using Visitor pattern?

I'm using the C++ version of ANTLR4 to develop a DSL for a music product. I used to (30 years ago!) do this kind of thing by hand so it's mostly a pleasure to have something like ANTLR, particularly now that I don't have to insert code in the actual grammar definition itself.
I want to do type checking of actual vs formal args in a function call. In the grammar segment below, the 'actualParameter' can return the type of the expression. However, the 'actualParameterList' needs to return an array (say) of these types so that the code for functionCall can compare to the formal parameter list.
If I was handwriting this, the calls to visit or visitChildren would take an extra parameter after context such that I could create a new array at the appropriate place and then have child nodes fill in the details.
I suppose that instead of just calling visitChildren inside the 'visitActualParameterList' I could create the array there and manually call each child rather than just a simple visitChildren but that feels like a hack, and it becomes very sensitive to minor changes in the grammar.
Is there a better approach?
functionCall: Identifier LeftParen actualParameterList? RightParen
;
actualParameterList:
actualParameter anotherActualParameter
;
actualParameter:
expression
;
anotherActualParameter:
Comma actualParameter anotherActualParameter
|
;
You're on the right path. I would suggest something like:
functionCall: Identifier LPAREN actualParameterList RPAREN
;
actualParameterList:
actualParameter (',' actualParameter)*
;
actualParameter:
expression
;
LPAREN : '(';
RPAREN : ')';
Using this, in the Visitor for actualParameterList you can check each child to see if it's of type actualParameterContext and if so, explicitly call Visit on that child, which will get you into your expression evaluation code (presumably handled in the visitor for actualParameter). This alleviates the need, as you say, to just generically visit children. It's very precise when you can check the type like this.
Here's an example of this pattern from my own code (in C# but surely you'll see the pattern in action):
for (int c = 0; c < context.ChildCount; c++)
{
if (context.GetChild(c) is SystemParser.ServerContext) // make sure correct type
{
string serverinfo = Visit(context.GetChild(c)); // visit the specific child and save return value, string in this case
sb.Append(serverinfo); // use result to fill array or do whatever
}
}
Now that you can see the pattern, back to your code. The syntax:
actualParameter (',' actualParameter)*
means that a parameter list has one actualParameter followed by zero or more additional ones with the * operator. I just threw the comma in there for visual clarity.
As you suggest, Visitor is the perfect pattern for this because you can explicitly visit any node you need to. It won't give you an array, but you can fill an array or any other necessary structure with the results of the visiting the children as you saw in the snip from my code. My Visitor returns strings, and I just appended to a StringBuilder. You can use the same pattern to build whatever you need.

Swift Switch case: Default will never be executed warning

On Xcode 7b2 with Swift 2 code, I have following:
In a switch case the compiler returns the following warning :
Default will never be executed
The code :
switch(type) {
case .foo:
return "foo"
case .bar:
return "bar"
case .baz:
return "baz"
default:
return "?"
}
Why would there be a warning ?
I just understood why :
The object I "switched" on is an enum and my enum only has 3 entries : .foo, .bar, baz.
The compiler gets that there is no need of a default because every possibility of the enum gets tested.
I think this warning violates the open-closed principle. When you add an enum value later, the default will be missing, and you can't predict what your code will do. So you have to change also this place. Anyway, using switch() at all violates this principle.
This could be because type is a enum with 3 cases and the compiler knows that the switch statement is exhaustive so you don't need a default statement in order to handle all possible cases.

xtext inferrer: multiple entities

I am very new to Xtext/Xtend, therefore apologies in advance if the answer is obvious.
I would like to allow the end-users of my DSL to define a 'filter', that when applied and 'returns' true it means that they want to 'filter out' the given entity of data from consideration.
I want to allow them 2 ways of defining the filter
A) by introspecting the attributes of a given data object and apply basic rules like
if (obj.field1<CURRENT_DATE && obj.field2=="EXPIRED)
{ return true;} else {return false;}
B) by executing a controlled snippet using 'eval' of my host language
In other words, the user would be expected to type into a string/code block a valid
code snippet of the hosting language
I had decided that the easiest way for me support case A) would be to leverage the XBase rules (including expressions/etc)
Therefore I defined filters (mostly copying the ideas from Lorenzo's book)
Filter:
(FilterDSL | FilterCode);
FilterDSL:
'filterDSL' (type=JvmTypeReference)? name=ID
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
body=XBlockExpression ;
FilterCode:
'filterCode' (type=JvmTypeReference)? name=ID
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
'{'
body=STRING
'}';
Now when trying to implement the Java mapping for my DSL, via the inferrer stub in Xtend -- I am running into multiple problems.
All of them likely indicate that I am missing some fundamental understanding
Problem 1) fl.body is not defined. fl Is of type Filter, not FilterDSL or FilterCode
And I do not understand how to check what type a given instance is of, so that I can access the content of a 'body' feature.
Problem 2) I do not understand where 'body' attribute in the inferrer method is defined and why. Is this part of ECore? (I could not find it)
Problem 3) what's the proper way to allow a user to specify a code block? String seems to be not the right thing as it does not allow multiline
Problem 4) How do I correctly convert a code block into something that is accepted by the 'body' such that it ends up in the generated code.
Problem 5) How do I setup multiple inferrers (as I have more than one thing for which I need the code generated (mostly) by xBase code generator)
Appreciate in advance any suggestions, or pointer to code examples solving similar problems.
As a side observation, Inferrer and its interplay with XBase has sofar been the most confusing and difficult thing to understand.
in general: have a look at the xtend docs at xtend-lang.org
You can do a if (x instanceof Type) or a switch statement with Type guards (see domain model example)
i dont get that question. both your FilterDSL and FilterCode EClasses should have a field+getter/setter named body, FilterCode of type String, FilterDSL of type XBlockExpression. The JvmTypesBuilder add extension methods to JvmOperation called setBody(String) and setBody(XExpression), syntax sugar lets you call body = .... instead of setBody(...)
(btw you can do crtl+click to find out where a thing is defined)
strings are actually multiline
is answered by (2)
you dont need multiple inferrers, you can infer multiple stuff e.g. by calling toClass or toField multiple times for the same input

Why don't we have two nulls?

I've often wondered why languages with a null representing "no value" don't differentiate between the passive "I don't know what the value is" and the more assertive "There is no value.".
There have been several cases where I'd have liked to differentiate between the two (especially when working with user-input and databases).
I imagine the following, where we name the two states unknown and null:
var apple;
while (apple is unknown)
{
askForApple();
}
if (apple is null)
{
sulk();
}
else
{
eatApple(apple);
}
Obviously, we can get away without it by manually storing the state somwhere else, but we can do that for nulls too.
So, if we can have one null, why can't we have two?
Isn't is bad enough that we have one null?
In my programming, I recently adopted the practice of differentiating "language null" and "domain null".
The "language null" is the special value that is provided by the programming language to express that a variable has "no value". It is needed as dummy value in data structures, parameter lists, and return values.
The "domain null" is any number of objects that implement the NullObject design pattern. Actually, you have one distinct domain null for each domain context.
It is fairly common for programmers to use the language null as a catch-all domain null, but I have found that it tends to make code more procedural (less object oriented) and the intent harder to discern.
Every time to want a null, ask yourself: is that a language null, or a domain null?
In most programming languages null means "empty" or "undefined". "Unknown" on the other hand is something different. In essence "unknown" describes the state of the object. This state must have come from somewhere in your program.
Have a look at the Null Object pattern. It may help you with what you are trying to achieve.
javascript actually has both null and undefined (http://www.w3schools.com/jsref/jsref_undefined.asp), but many other languages don't.
It would be easy enough to create a static constant indicating unknown, for the rare cases when you'd need such a thing.
var apple = Apple.Unknown;
while (apple == Apple.Unknown) {} // etc
Existence of value:
Python: vars().has_key('variableName')
PHP: isset(variable)
JavaScript: typeof(variable) != 'undefined'
Perl: (variable != undef) or if you wish: (defined variable)
Of course, when variable is undefined, it's not NULL
Why stop at two?
When I took databases in college, we were told that somebody (sorry, don't remember the name of the researcher or paper) had looked at a bunch of db schemas and found that null had something like 17 different meanings: "don't know yet", "can't be known", "doesn't apply", "none", "empty", "action not taken", "field not used", and so on.
In haskell you can define something like this:
data MaybeEither a b = Object a
| Unknown b
| Null
deriving Eq
main = let x = Object 5 in
if x == (Unknown [2]) then putStrLn ":-("
else putStrLn ":-)"
The idea being that Unknown values hold some data of type b that can transform them into known values (how you'd do that depends on the concrete types a and b).
The observant reader will note that I'm just combining Maybe and Either into one data type :)
The Null type is a subtype of all reference types - you can use null instead of a reference to any type of object - which severely weakens the type system. It is considered one of the a historically bad idea by its creator, and only exists as checking whether an address is zero is easy to implement.
As to why we don't have two nulls, is it down to the fact that, historically in C, NULL was a simple #define and not a distinct part of the language at all?
Within PHP Strict you need to do an isset() check for set variables (or else it throws a warning)
if(!isset($apple))
{
askForApple();
}
if(isset($apple) && empty($apple))
{
sulk();
}
else
{
eatApple();
}
In .net langages, you can use nullable types, which address this problem for value types.
The problem remains, however, for reference types. Since there's no such thing as pointers in .net (at least in 'safe' blocks), "object? o" won't compile.
Note null is an acceptable, yet known condition. An unknown state is a different thing IMO. My conversation with Dan in the comments' section of the top post will clarify my position. Thanks Dan!.
What you probably want to query is whether the object was initialized or not.
Actionscript has such a thing (null and undefined). With some restrictions however.
See documentation:
void data type
The void data type contains only one value, undefined. In previous versions of ActionScript, undefined was the default value for instances of the Object class. In ActionScript 3.0, the default value for Object instances is null. If you attempt to assign the value undefined to an instance of the Object class, Flash Player or Adobe AIR will convert the value to null. You can only assign a value of undefined to variables that are untyped. Untyped variables are variables that either lack any type annotation, or use the asterisk (*) symbol for type annotation. You can use void only as a return type annotation.
Some people will argue that we should be rid of null altogether, which seems fairly valid. After all, why stop at two nulls? Why not three or four and so on, each representing a "no value" state?
Imagine this, with refused, null, invalid:
var apple;
while (apple is refused)
{
askForApple();
}
if (apple is null)
{
sulk();
}
else if(apple is invalid)
{
discard();
}
else
{
eatApple(apple);
}
It's been tried: Visual Basic 6 had Nothing, Null, and Empty. And it led to such poor code, it featured at #12 in the legendary Thirteen Ways to Loathe VB article in Dr Dobbs.
Use the Null Object pattern instead, as others have suggested.
The problem is that in a strongly typed language these extra nulls are expected to hold specific information about the type.
Basically your extra null is meta-information of a sort, meta-information that can depend on type.
Some value types have this extra information, for instance many numeric types have a NaN constant.
In a dynamically typed language you have to account for the difference between a reference without a value (null) and a variable where the type could be anything (unknown or undefined)
So, for instance, in statically typed C# a variable of type String can be null, because it is a reference type. A variable of type Int32 cannot, because it is a value type it cannot be null. We always know the type.
In dynamically typed Javascript a variable's type can be left undefined, in which case the distinction between a null reference and an undefined value is needed.
Some people are one step ahead of you already. ;)
boolean getAnswer() throws Mu
Given how long it took Western philosophy to figure out how it was possible to talk about the concept of "nothing"... Yeah, I'm not too surprised the distinction got overlooked for a while.
I think having one NULL is a lower-common denominator to deal with the basic pattern
if thing is not NULL
work with it
else
do something else
In the "do something else" part, there are a wide range of possibilities from "okay, forget it" to trying to get "thing" somewhere else. If you don't simply ignore something that's NULL, you probably need to know why "thing" was NULL. Having multiple types of NULL, would help you answering that question, but the possible answers are numerous as hinted in the other answers here. The missing thing could be simply a bug, it could be an error when trying to get it, it may not be available right now, and so on. To decide which cases apply to your code -- which means you have to handle them -- it domain specific. So it's better to use an application defined mechanism to encode these reasons instead of finding a language feature that tries to deal with all of them.
It's because Null is an artifact of the language you're using, not a programmer convenience. It describes a naturally occurring state of the object in the context in which it is being used.
If you are using .NET 3.0+ and need something else, you might try the Maybe Monad. You could create whatever "Maybe" types you need and, using LINQ syntax, process accordingly.
AppleInformation appleInfo;
while (appleInfo is null)
{
askForAppleInfo();
}
Apple apple = appleInfo.apple;
if (apple is null)
{
sulk();
}
else
{
eatApple(apple);
}
First you check if you have the apple info, later you check if there is an apple or not. You don't need different native language support for that, just use the right classes.
For me null represents lack of value, and I try to use it only to represent that. Of course you can give null as many meanings as you like, just like you can use 0 or -1 to represent errors instead of their numerical values. Giving various meanings to one representation could be ambiguous though, so I wouldn't recommend it.
Your examples can be coded like apple.isRefused() or !apple.isValid() with little work; you should define beforehand what is an invalid apple anyway, so I don't see the gain of having more keywords.
You can always create an object and assign it to same static field to get a 2nd null.
For example, this is used in collections that allow elements to be null. Internally they use a private static final Object UNSET = new Object which is used as unset value and thus allows you to store nulls in the collection. (As I recall, Java's collection framework calls this object TOMBSTONE instead of UNSET. Or was this Smalltalk's collection framework?)
VB6
Nothing => "There is no value."
Null = > "I don't know what the value is" - Same as DBNull.Value in .NET
Two nulls would be the wrongest answer around. If one null is not enough, you need infinity nulls.
Null Could mean:
'Uninitialized'
'User didn't specify'
'Not Applicable here, The color of a car before it has been painted'
'Unity: This domain has zero bits of information.'
'Empty: this correctly holds no data in this case, for example the last time the tires were rotated on a new car'
'Multiple, Cascading nulls: for instance, the extension of a quantity price when no quantity may be specified times a quantity which wasn't specified by the user anyway'
And your particular domain may need many other kinds of 'out of band' values. Really, these values are in the domain, and need to have a well defined meaning in each case. (ergo, infinity really is zero)

Resources