How to duplicate a RuleContext - antlr4

Is there any way to duplicate a ParserRule? I need a real deep copy, so copyFrom() doesn't do the trick. Or must I re-parse the code?
An alternative idea for how to solve the following problem would also be much appreciated:
I am working on a compiler, translating old legacy code to modern programming languages, in this case EGL -> Java.
EGL has a concept called standalone functions, which are similar to C macros. This means that code inside the functions can reference symbols in the calling scope. So both defining and resolving of symbols and type promotion are context-dependent.
In ANTLR3, we solved this by dupTree(), and simply made a copy to work on in each calling scope.
Dynamic typing is not an option.
Example (pseudo code) to illustrate:
Program A
int var = 4;
saf(); # Prints 5
end A;
Program B
String var = "abc";
saf(); # Prints abc1
end B;
function saf()
int j = 1;
print(var + j);
end saf;

As of version 4.2, ANTLR 4 does not include any API for manipulating the structure of a parse tree after the parse is complete. This is an area we are currently exploring, especially considering the possibilities created by the new pattern matching syntax.
For duplicating trees, I recommend you implement the visitor interface created when you generated your parser. This will allow you to call visit on any node in your parse tree to create a deep copy of that node.
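The copying-visitor idea can be sketched independently of ANTLR. The following is a minimal illustration in Python, not ANTLR API: `Node` and `CopyVisitor` are hypothetical stand-ins for the generated context classes and visitor, and a real implementation would need one visit method per rule context.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node (not an ANTLR class)."""
    rule: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)


class CopyVisitor:
    """Builds a fresh node for every node visited, so nothing is shared."""

    def visit(self, node: Node) -> Node:
        clone = Node(node.rule, node.text)
        for child in node.children:
            child_clone = self.visit(child)   # recurse first, then re-parent
            child_clone.parent = clone
            clone.children.append(child_clone)
        return clone


# The body of saf() from the question, as a toy tree.
original = Node("function", "saf", [
    Node("decl", "int j = 1"),
    Node("call", "print(var + j)"),
])

# One independent copy per calling scope; annotating it leaves the original alone.
copy_a = CopyVisitor().visit(original)
copy_a.children[1].text = "print(int var + int j)"
```

Each calling scope gets its own copy to resolve symbols against, which is roughly what dupTree() provided in ANTLR 3.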

Related

Is it safe to use Proc with the same name as Iterator in Nim?

I would like to define a proc with the same name as an iterator, to be able to write short code such as table.keys.sorted.
Nim seems to support that and resolves the naming conflict correctly.
Is this an official feature of Nim that's going to be supported in future versions? Is it safe to use such an approach?
Example
import tables, algorithm
var table = init_table[string, int]()
table["b"] = 2
table["a"] = 1
# Proc with same name as Iterator
proc keys*[K, V](table: Table[K, V]): seq[K] =
  for k in table.keys: result.add k
# Nim properly resolves `keys` as `proc` and not as `iterator`
echo table.keys.sorted
The fact that you can define an iterator and a proc with the same signature is currently regarded as a design mistake (see issue #8901), but it will probably stick around for a while.
Other options for your request of having short code are:
echo toSeq(table.keys).sorted
This uses toSeq from sequtils; unfortunately, you cannot use UFCS with that (see GitHub issue).
Another option (actually building on that) would be to define a template sortedKeys that does the above.
Or you could argue that this is not a design mistake and we could think of it as a feature that allows you to use keys of a table as a sequence. :)

Writing a raw binary structure to file in D?

I'm trying to write a binary file which contains several binary records defined by some struct. However, I cannot seem to find out how to do it. Looking at other examples, I've managed to write strings without problems, but not structs. I just want to write it like I would in C with fwrite(3), but in D version 2.
Here is what I've tried so far:
using stream.write(tr) - writes human readable/debug representation
using stream.rawWrite(tr) - this sounded like what I need, but fails to compile with:
Error: template std.stdio.File.rawWrite cannot deduce function from
argument types !()(TitleRecord), candidates are:
/usr/lib/ldc/x86_64-linux-gnu/include/d/std/stdio.d(1132): std.stdio.File.rawWrite(T)(in T[] buffer)
trying rawWrite as above, but casting data to various things, also never compiles.
even trying to get back to C with fwrite, but I can't get deep enough to get the file descriptor I need.
Reading the docs has not been very helpful (writing strings works for me too, but not writing structs). I'm sure there must be a simple way to do it, but I'm not able to find it. Other SO questions did not help me. In D 1.0, it might have been accomplished with stream.writeExact(&tr, tr.sizeof), but that is no longer an option.
import std.stdio;

struct TitleRecord {
    short id;
    char[49] text;
};

TitleRecord tr;

void main()
{
    auto stream = File("filename.dat","wb+");
    tr.id = 1234;
    tr.text = "hello world";
    writeln(tr);
    //stream.write(tr);
    //stream.rawWrite(tr);
    //stream.rawWrite(cast(ubyte[52]) tr);
    //stream.rawWrite(cast(ubyte[]) tr);
    //fwrite(&tr, 4, 1, stream);
}
The error is saying that rawWrite expects an array, not a struct. So one easy way to do it is to simply slice a pointer and give that to rawWrite:
stream.rawWrite((&tr)[0 .. 1]);
The (&tr) gets the address, thus converting your struct to a pointer. Then the [0 .. 1] means get a slice of it from the beginning, grabbing just one element.
Thus you now have a T[] that rawWrite can handle containing your one element.
Be warned: if you use the @safe annotation this will not pass; you'd have to mark it @trusted. Also, of course, any references inside your struct (including string) will be written as binary pointers instead of data, as you surely know from C experience. But in the case you showed there you're fine.
edit: BTW you could also just use fwrite if you like, copy/pasting the same code over from C (except it is foo.sizeof instead of sizeof foo). The D File is just a small wrapper around C's FILE*, and you can get the original FILE* back out to pass to the other functions with stream.getFP() (http://dpldocs.info/experimental-docs/std.stdio.File.getFP.html).
rawWrite expects an array, but there are many workarounds.
One is to create a single element array.
file.rawWrite([myStruct]);
Another one is casting the struct into an array. My library called bitleveld has a function for that called reinterpretAsArray. This also makes it easy to create checksums of said structs.
Once in a while I've encountered issues with alignment using this method, so be careful. This can be fixed by changing the align attribute of the struct.
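To make the alignment caveat concrete, here is a small sketch using Python's struct module rather than D. It mirrors the field layout of TitleRecord (a short followed by 49 bytes) and compares a packed layout with a natively aligned one. The names PACKED, ALIGNED and pack_record are made up for this illustration, the sizes assume a platform with a 2-byte short, and the padding bytes will not necessarily match what D writes.

```python
import io
import struct

# Hypothetical mirrors of TitleRecord { short id; char[49] text; }.
PACKED = struct.Struct("=h49s")     # standard sizes, no padding: 2 + 49 bytes
ALIGNED = struct.Struct("@h49s0h")  # native alignment; "0h" pads the end to a short boundary


def pack_record(layout: struct.Struct, rec_id: int, text: str) -> bytes:
    """Serialize one record; the 's' field is zero-padded to 49 bytes."""
    return layout.pack(rec_id, text.encode("ascii"))


# Write one record to an in-memory "file" and read it back.
stream = io.BytesIO()
stream.write(pack_record(PACKED, 1234, "hello world"))
data = stream.getvalue()

rec_id, raw_text = PACKED.unpack(data)
text = raw_text.rstrip(b"\x00").decode("ascii")

print(PACKED.size, ALIGNED.size, rec_id, text)
```

The one-byte difference between the two layouts is the kind of mismatch that shows up when the writer and the reader disagree about struct alignment.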

Squeak (Smalltalk): how to 'inject' a string into a string

I'm writing a class named "MyObject".
One of the class's methods is:
addTo: aCodeString assertType: aTypeCollection
When the method is called with aCodeString, I want to add (at runtime) a new method to the "MyObject" class, with aCodeString as its source code, and inject type checking code into that source code.
For example, if I call addTo:assertType: like this:
a := MyObject new.
a addTo: 'foo: a boo:b baz: c
^(a*b+c)'
assertType: #(SmallInteger SmallInteger SmallInteger).
I expect that I could write later:
answer := (a foo: 2 boo: 5 baz: 10).
and get 20 in answer.
and if I write:
a foo: 'someString' boo: 5 baz: 10.
I should get a proper error message, because 'someString' is not a SmallInteger.
I know how to write the type checking code, and I know that to add the method to the class at runtime I can use the compile: method from the Behavior class.
The problem is that I want to add the type checking code inside the source code.
I'm not really familiar with all of the Squeak classes, so I'm not sure whether I should edit aCodeString as a string inside addTo:assertType: and then use compile: (and I don't know how to do so), or whether there is a way to inject code into an existing method in the Behavior class or another Squeak class.
So basically, what I'm asking is how I can inject a string into an existing string, or inject code into an existing method.
There are many ways you could achieve such type checking...
The one you propose is to modify the source code (a String) so as to insert additional pre-condition type checks.
The key point with this approach is that you will have to insert the type checking at the right place. That means somehow parsing the original source (or at least the selector and arguments) so as to find its exact span (and the argument names).
See the method initPattern:return: in Parser and its senders. You will find quite low-level (not the most beautiful) code that feeds the block (passed through the return: keyword) with sap, an Array of 3 objects: the method selector, the method arguments and the method precedence (a code telling whether the method corresponds to a unary, binary or keyword message). From there, you'll have enough material for source code manipulation (inserting a string into another with copyReplaceFrom:to:with:).
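The string-level approach can be illustrated outside Squeak. The sketch below is Python, not Smalltalk, and only shows the idea: find where the keyword-message header ends, collect the argument names, and splice the checks in at that position. The function name inject_type_checks and the generated isKindOf: guards are assumptions made for this example; it handles only keyword selectors and ignores comments and other corner cases that the real Parser deals with.

```python
import re

# One "keyword: argName" pair of a Smalltalk keyword-message header.
KEYWORD_PART = re.compile(r"\s*([A-Za-z_]\w*:)\s*([A-Za-z_]\w*)")


def inject_type_checks(source: str, type_names: list) -> str:
    """Insert one type guard per argument right after the method header."""
    pos = 0
    args = []
    while True:
        match = KEYWORD_PART.match(source, pos)
        if match is None:
            break  # header is over; pos marks its exact end
        args.append(match.group(2))
        pos = match.end()
    if len(args) != len(type_names):
        raise ValueError("expected one type per argument")
    checks = "".join(
        f"\n\t({arg} isKindOf: {type_name}) "
        f"ifFalse: [^self error: '{arg} must be a {type_name}']."
        for arg, type_name in zip(args, type_names)
    )
    # Equivalent of inserting a string into another at index pos.
    return source[:pos] + checks + source[pos:]


patched = inject_type_checks(
    "foo: a boo:b baz: c\n^(a*b+c)",
    ["SmallInteger", "SmallInteger", "SmallInteger"],
)
print(patched)
```

In Squeak, the resulting string would then be handed to compile: as usual.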
Do not hesitate to write small snippets of code and execute in the Debugger (select code to debug, then use debug it menu or ALT+Shift+D). Also use the inspectors extensively to gain more insight on how things work!
Another solution is to parse the source code into its whole Abstract Syntax Tree (AST), and manipulate that AST to insert the type checks. Normally, the Parser builds the AST, so observe how it works. From the modified AST, you can then generate a new CompiledMethod (the bytecode instructions) and install it in the methodDictionary - see the source code of compile: and follow the messages sent until you discover generateMethodFromNode:trailer:. This is a bit more involved, and has the bad side effect that the source code is no longer in sync with the generated code, which might become a problem once you want to debug the method (fortunately, Squeak can use decompiled code in place of source code!).
Last, you can also arrange to have an alternate compiler and parser for some of your classes (see compilerClass and/or parserClass). The alternate TypeHintParser would accept modified syntax with the type hints in source code (once upon a time, it was implemented with type hints following the args inside angle brackets foo: x <Integer> bar: y <Number>). And the alternate TypeHintCompiler would arrange to compile preconditions automatically given those type hints. Since you will then be very advanced in Squeak, you will also create special mapping between source code index and bytecodes so as to have sane debugger and even special Decompiler class that could recognize the precondition type checks and transform them back to type hints just in case.
My advice would be to start with the first approach that you are proposing.
EDIT
I forgot to say, there is yet another way, but it is currently available in Pharo rather than Squeak: the Pharo compiler (named OpalCompiler) reifies the bytecode instructions as objects (class names beginning with IR) in the generation phase. So it is also possible to directly manipulate the bytecode instructions by proper hacking at this stage... I'm pretty sure that we can find examples of usage. This is probably the most advanced technique.

ANTLR get first production

I'm using ANTLR4 and, in particular, the C grammar available in their repo. The grammar doesn't seem to have an initial rule, so I was wondering how to find it. Once the parser is initialized, I attach my listener, but I get syntax errors, since I'm trying to parse two files with different code instructions:
int a;
int foo() { return 0; }
In my example I call the parser with "parser.primaryExpression();", which is the first production of the "g4" file. Is it possible to avoid calling the first production explicitly and have ANTLR find it automatically instead?
In addition to @GRosenberg's answer:
Also the rule enum (in the generated parser) contains entries for each rule in the order they appear in the grammar and the first rule has the value 0. However, just because it's the first rule in the grammar doesn't mean that it is the main entry point. Only the grammar author knows what the real entry is and sometimes you might even want to parse only with a subrule, which makes this decision even harder.
ANTLR provides no API to obtain the first rule. However, in the parser as generated, the field
public static final String[] ruleNames = ....;
lists the rule names in the order of occurrence in the grammar. With reflection, you can then look up and invoke the parser method with that name.
Beware: nothing in the ANTLR 'spec' defines this ordering. It has simply been true to date.
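The lookup-by-name idea can be sketched as follows. This is Python with a hand-written stand-in class, not generated ANTLR code: StandInParser and call_rule are hypothetical, and the only assumptions carried over from the answer are a ruleNames list in grammar order and one parser method per rule name.

```python
class StandInParser:
    """Hypothetical stand-in for a generated parser; not ANTLR runtime code."""

    # Rule names in order of occurrence in the (imaginary) grammar.
    ruleNames = ["primaryExpression", "compilationUnit"]

    def primaryExpression(self):
        return "parsed with primaryExpression"

    def compilationUnit(self):
        return "parsed with compilationUnit"


def call_rule(parser, index=0):
    """Look up the rule method by its position in ruleNames and invoke it."""
    rule_name = type(parser).ruleNames[index]
    return getattr(parser, rule_name)()


first_result = call_rule(StandInParser())
print(first_result)
```

As both answers point out, the first rule is not necessarily the intended entry point; in this toy example, compilationUnit at index 1 would be the more plausible start rule.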

Is it possible / easy to include some mruby in a nim application?

I'm currently trying to learn Nim (it's going slowly - can't devote much time to it). On the other hand, in the interests of getting some working code, I'd like to prototype out sections of a Nim app I'm working on in ruby.
Since mruby allows embedding a ruby subset in a C app, and since nim allows compiling arbitrary C code into functions, it feels like this should be relatively straightforward. Has anybody done this?
I'm particularly looking for ways of using Nim's funky macro features to break out into inline ruby code. I'm going to try myself, but I figure someone is bound to have tried it and/or come up with more elegant solutions than I can in my current state of learning :)
https://github.com/micklat/NimBorg
This is a project with a somewhat similar goal. It targets python and lua at the moment, but using the same techniques to interface with Ruby shouldn't be too hard.
There are several features in Nim that help in interfacing with a foreign language in a fluent way:
1) Calling Ruby from Nim using Nim's dot operators
These are a bit like method_missing in Ruby.
You can define a type like RubyValue in Nim, which will have dot operators that will translate any expression like foo.bar or foo.bar(baz) to the appropriate Ruby method call. The arguments can be passed to a generic function like toRubyValue that can be overloaded for various Nim and C types to automatically convert them to the right Ruby type.
2) Calling Nim from Ruby
In most scripting languages, there is a way to register a foreign type, often described in a particular data structure that has to be populated once per exported type. You can use a bit of generic programming and Nim's .global. vars to automatically create and cache the required data structure for each type that was passed to Ruby through the dot operators. There will be a generic proc like getRubyTypeDesc(T: typedesc) that may rely on typeinfo, typetraits or some overloaded procs supplied by user, defining what has to be exported for the type.
Now, if you really want to rely on mruby (because you have experience with it for example), you can look into using the .emit. pragma to directly output pieces of mruby code. You can then ask the Nim compiler to generate only source code, which you will compile in a second step or you can just change the compiler executable, which Nim will call when compiling the project (this is explained in the same section linked above).
Here's what I've discovered so far.
Fetching the return value from an mruby execution is not as easy as I thought. That said, after much trial and error, this is the simplest way I've found to get some mruby code to execute:
const mrb_cc_flags = "-v -I/mruby_1.2.0_path/include/ -L/mruby_1.2.0_path/build/host/lib/"
const mrb_linker_flags = "-v"
const mrb_obj = "/mruby_1.2.0_path/build/host/lib/libmruby.a"
{. passC: mrb_cc_flags, passL: mrb_linker_flags, link: mrb_obj .}
{.emit: """
#include <mruby.h>
#include <mruby/string.h>
""".}
proc ruby_raw(str:cstring):cstring =
  {.emit: """
  mrb_state *mrb = mrb_open();
  if (!mrb) { printf("ERROR: couldn't init mruby\n"); exit(0); }
  mrb_load_string(mrb, `str`);
  `result` = mrb_str_to_cstr(mrb, mrb_funcall(mrb, mrb_top_self(mrb), "test_func", 0));
  mrb_close(mrb);
  """.}
proc ruby*(str:string):string =
  echo ruby_raw("def test_func\n" & str & "\nend")
  "done"
let resp = ruby """
puts 'this was a puts from within ruby'
"this is the response"
"""
echo(resp)
I'm pretty sure that you should be able to omit some of the compiler flags at the start of the file in a well configured environment, e.g. by setting LD_LIBRARY_PATH correctly (not least because that would make the code more portable)
Some of the issues I've encountered so far:
I'm forced to use mrb_funcall because, for some reason, clang seems to think that the mrb_load_string function returns an int, despite all the C code I can find, the documentation, and several people online saying otherwise:
error: initializing 'mrb_value' (aka 'struct mrb_value') with an expression of incompatible type 'int'
mrb_value mrb_out = mrb_load_string(mrb, str);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~
The mruby/string.h header is needed for mrb_str_to_cstr, otherwise you get a segfault. RSTRING_PTR seems to work fine also (which at least gives a sensible error without string.h), but if you write it as a one-liner as above, it will execute the function twice.
I'm going to keep going and write some slightly more idiomatic Nim, but this has done what I needed for now.

Resources