ANTLR4 c++ target getLine()/getCharPositioninLine() in visitor - antlr4

What is the recommended way to get line/position data in a visitor? There's no way to get from a context to a token (at least there are no functions in the class definitions to allow this) so what is the recommended way to do so from contexts?
Using antlr4 and c++17
-- UPDATE --
Thanks Mike for pointing me in the right direction and prompt response. So here's my solution:
ctx->ID()->getSymbol()->getLine() or ->getCharPositionInLine()
where ID is the name of one of the TerminalNodes (one of your lexer rule names that can be in your context)

Both a terminal node as well as a parser context have token members which carry position information. For a parser context this is the start and end node of the range for which it applies. See the token implementation of the C++ target.

Related

SCIM PATCH library

I am implementing SCIM provisioning for my current project, and I am trying to implement the PATCH method and it seems not that easy.
What I read in the RFC is that SCIM PATCH is almost like JSON PATCH, but when I look deeper it seems a bit different on how the path is described which doesn't allow me to use json-patch libraries.
example:
"path":"addresses[type eq \"work\"]"
"path":"members[value eq
\"2819c223-7f76-453a-919d-413861904646\"]"
Do you know any library that is doing SCIM PATCH out of the box?
My project is currently a node project, but I don't care about the language I can rewrite it in javascript if needed.
Edit
I have finally create my own library for that, it is called scim-patch and it is available on npm https://www.npmjs.com/package/scim-patch
I implement SCIM PATCH operation in my own library. Please take a look here and here. It is currently a work in progress for v2, but the CRUD capability required by patch operations has matured.
First of all, you need a way to parse the SCIM path, which can optionally include a filter. I implement a finite state machine to parse the path and filter. A scanner would go through each byte of the text and point out interesting events, and a parser would use the scanner to break the text into meaningful tokens. For instance, emails[value eq "foo#bar.com"].type can be broken down to emails, [, eq, "foo#bar.com", ] and type. Finally, a compiler will take these token inputs and assemble it into an abstract syntax tree. On paper, it will look something like the following:
emails -> eq -> type
/ \
value "foo#bar.com"
Next, you need a way to traverse the resource data structure according to the abstract syntax tree. I designed my property model to carry a reference to the SCIM attribute. Consider the following resource:
{
"schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
"userName": "imulab",
"emails": [
{
"value": "foo#bar.com",
"type": "work"
},
{
"value": "bar#foo.com",
"type": "home"
}
]
}

I start traversing from the root of the resource and find the child called emails, which will return a multiValued property of complex type. I see my next token (eq) is the root of a filter, so I perform the filter operations on the two elements of emails. For each element, I go down the value child and evaluate its value. Since only the first element matches the filter, I finally go down the type child of that complex property and arrive at the target property. From there, you are free to perform Add, Replace and Remove operations.
There are two things I recommend to watch out.
One thing is that you traversing path will split when you hit a multiValued property. In the above example, we only have one elements that matched the filter. In reality, we may have many matches, or there could be no filter at all, forcing you to traverse all elements.
The other is the syntax of the SCIM path. The specification mandates that it is possible to prefix the schema URN in front the actual paths and delimit them with a :. So in that representation, emails.type and urn:ietf:params:scim:schemas:core:2.0:User:emails.type are actual equivalents. Note that the schema URN contains dots (.) in the 2.0 part. This creates further complication that now you cannot simply delimit the text by . and hope to get all correct tokens. I use a Trie data structure to record all schema URNs as reserved words. Whenever I start a new segment in the path, I will try to match it in the Trie and not solely rely on the . to terminate the segment.
Hope it will help your work.
Have a look at scim2-filter-parser: https://github.com/15five/scim2-filter-parser
It is a library mainly used by the authors' django-scim2 library: https://github.com/15five/django-scim2
It relies on python AST objects, but I think you should get some takeaways from there.
Since I did not found any typescript library to implement scim patch operations, I have implemented my own library.
You can find it here: https://www.npmjs.com/package/scim-patch

xtext inferrer: multiple entities

I am very new to Xtext/Xtend, therefore apologies in advance if the answer is obvious.
I would like to allow the end-users of my DSL to define a 'filter', that when applied and 'returns' true it means that they want to 'filter out' the given entity of data from consideration.
I want to allow them 2 ways of defining the filter
A) by introspecting the attributes of a given data object and apply basic rules like
if (obj.field1<CURRENT_DATE && obj.field2=="EXPIRED)
{ return true;} else {return false;}
B) by executing a controlled snippet using 'eval' of my host language
In other words, the user would be expected to type into a string/code block a valid
code snippet of the hosting language
I had decided that the easiest way for me support case A) would be to leverage the XBase rules (including expressions/etc)
Therefore I defined filters (mostly copying the ideas from Lorenzo's book)
Filter:
(FilterDSL | FilterCode);
FilterDSL:
'filterDSL' (type=JvmTypeReference)? name=ID
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
body=XBlockExpression ;
FilterCode:
'filterCode' (type=JvmTypeReference)? name=ID
'(' (params+=FullJvmFormalParameter (',' params+=FullJvmFormalParameter)*)? ')'
'{'
body=STRING
'}';
Now when trying to implement the Java mapping for my DSL, via the inferrer stub in Xtend -- I am running into multiple problems.
All of them likely indicate that I am missing some fundamental understanding
Problem 1) fl.body is not defined. fl Is of type Filter, not FilterDSL or FilterCode
And I do not understand how to check what type a given instance is of, so that I can access the content of a 'body' feature.
Problem 2) I do not understand where 'body' attribute in the inferrer method is defined and why. Is this part of ECore? (I could not find it)
Problem 3) what's the proper way to allow a user to specify a code block? String seems to be not the right thing as it does not allow multiline
Problem 4) How do I correctly convert a code block into something that is accepted by the 'body' such that it ends up in the generated code.
Problem 5) How do I setup multiple inferrers (as I have more than one thing for which I need the code generated (mostly) by xBase code generator)
Appreciate in advance any suggestions, or pointer to code examples solving similar problems.
As a side observation, Inferrer and its interplay with XBase has sofar been the most confusing and difficult thing to understand.
in general: have a look at the xtend docs at xtend-lang.org
You can do a if (x instanceof Type) or a switch statement with Type guards (see domain model example)
i dont get that question. both your FilterDSL and FilterCode EClasses should have a field+getter/setter named body, FilterCode of type String, FilterDSL of type XBlockExpression. The JvmTypesBuilder add extension methods to JvmOperation called setBody(String) and setBody(XExpression), syntax sugar lets you call body = .... instead of setBody(...)
(btw you can do crtl+click to find out where a thing is defined)
strings are actually multiline
is answered by (2)
you dont need multiple inferrers, you can infer multiple stuff e.g. by calling toClass or toField multiple times for the same input

what does getType do in antlr4?

This question is with reference to the Cymbol code from the book (~ page 143) :
int t = ctx.type().start.getType(); // in DefPhase.enterFunctionDecl()
Symbol.Type type = CheckSymbols.getType(t);
What does each component return: "ctx.type()", "start", "getType()" ? The book does not contain any explanation about these names.
I can "kind of" understand that "ctx.type()" refers to the "type" rule, and "getType()" returns the number associated with it. But what exactly does the "start" do?
Also, to generalize this question: what is the mechanism to get the value/structure returned by a rule - especially in the context of usage in a listener?
I can see that for an ID, it is:
String name = ctx.ID().getText();
And as in above, for an enumeration of keywords it is via "start.getType()". Any other special kinds of access that I should be aware of?
Lets disassemble problem step by step. Obviously, ctx is instance of CymbolParser.FunctionDeclContext. On page 98-99 you can see how grammar and ParseTree are implemented (at least the feeling - for real implementation please see th .g4 file).
Take a look at the figure of AST on page 99 - you can see that node FunctionDeclContext has a several children, one labeled type. Intuitively you see that it somehow correspond with function return-type. This is the node you retrieve when calling CymbolParser.FunctionDeclContext::type. The return type is probably sth like TypeContext.
Note that methods without 'get' at the beginning are usually children-getters - e.g. you can access the block by calling CymbolParser.FunctionDeclContext::block.
So you got the type context of the method you got passed. You can call either begin or end on any context to get first of last Token defining the context. Simply start gets you "the first word". In this case, the first Token is of course the function return-type itsef, e.g. int.
And the last call - Token::getType returns integral representation of Token.
You can find more information at API reference webpages - Context, Token. But the best way of understanding the behavior is reading through the generated ANTLR classes such as <GrammarName>Parser etc. And to be complete, I attach a link to the book.

U2 Toolkit for .NET - UniSession vs U2Connection

I'm struggling a bit with some of the base concepts of U2 Toolkit (and I've been quite successful with the previous version!).
First, I had to add using U2.Data.Client.UO; in order to reference UniSession or UniFile. This may just be general ignorance, but doesn't 'using U2.Data.Client' imply that I also want the .UO stuff under it?!?
Second - what (conceptually) are the differences between connecting via U2Connection's Open(), or UniSession's OpenSession()? Do each of them provide a different context in which to work?
Finally - while the examples provided in the doc and in Rajan's various articles are helpful, I'd like something a little more practical: how about a simple "here's how you read and write specific records in a Unidata file"?
Thanks!
Please see answer for the first and second questions
Regarding Namespace
If you want to develop application using ADO.NET ( SQL Access, UCI SERVER), you need one namespace (U2.Data.Client )
If you want to develop application using UO.NET ( Native Access, UO SERVER), you need two namespaces (U2.Data.Client and U2.Data.Client.UO)
U2.Data.Client namespace generally have Microsoft ADO.NET Specification Classes.
U2.Data.Client.UO namespace generally have UniObjects Native Specification Classes. As you have used in the past UODOTNET.DLL, you can feel all the Classes are there.
Regarding U2Connection/UniSession
This is by Design.
U2Connection.Open() calls UniSession.Open() when you use Accessmode=’Native’ in Connection String. You can verify from the LOG/TRACE File. In this case, basically, U2Connection and U2Session are same. U2Connection Class just passes connection string to UniSession Class and then UniSession Class uses this connection string and calls Open(). This is an improvement from the old way where you have used Static Class UniObjects(…) and there was no concept of standard connection string. Basically we replace Static Class UniObjects(…) to U2Connection Class and provided connection string capabilities.
U2Connection.Open() calls UCINET.Open() when you use Accessmode=’SQL’ in Connection String. You can verify from the LOG/TRACE File.
Is this clear()?

should it be allowed to change the method signature in a non statically typed language

Hypothetic and academic question.
pseudo-code:
<pre><code>
class Book{
read(theReader)
}
class BookWithMemory extends Book {
read(theReader, aTimestamp = null)
}
</pre></code>
Assuming:
an interface (if supported) would prohibit it
default value for parameters are supported
Notes:
PHP triggers an strict standards error for this.
I'm not surprised that PHP strict mode complains about such an override. It's very easy for a similar situation to arise unintentionally in which part of a class hierarchy was edited to use a new signature and a one or a few classes have fallen out of sync.
To avoid the ambiguity, name the new method something different (for this example, maybe readAt?), and override read to call readAt in the new class. This makes the intent plain to the interpreter as well as anyone reading the code.
The actual behavior in such a case is language-dependent -- more specifically, it depends on how much of the signature makes up the method selector, and how parameters are passed.
If the name alone is the selector (as in PHP or Perl), then it's down to how the language handles mismatched method parameter lists. If default arguments are processed at the call site based on the static type of the receiver instead of at the callee's entry point, when called through a base class reference you'd end up with an undefined argument value instead of your specified default, similarly to what would happen if there was no default specified.
If the number of parameters (with or without their types) are part of the method selector (as in Erlang or E), as is common in dynamic languages that run on JVM or CLR, you have two different methods. Create a new overload taking additional arguments, and override the base method with one that calls the new overload with default argument values.
If I am reading the question correctly, this question seems very language specific (as in it is not applicable to all dynamic languages), as I know you can do this in ruby.
class Book
def read(book)
puts book
end
end
class BookWithMemory < Book
def read(book,aTimeStamp = nil)
super book
puts aTimeStamp
end
end
I am not sure about dynamic languages besides ruby. This seems like a pretty subjective question as well, as at least two languages were designed on either side of the issue (method overloading vs not: ruby vs php).

Resources