Semantically disambiguating an ambiguous syntax

Semantically disambiguating an ambiguous syntax - antlr4

Using Antlr 4 I have a situation I am not sure how to resolve. I originally asked the question at https://groups.google.com/forum/#!topic/antlr-discussion/1yxxxAvU678 on the Antlr discussion forum. But that forum does not seem to get a lot of traffic, so I am asking again here.
I have the following grammar:
expression
: ...
| path
;
path
: ...
| dotIdentifierSequence
;
dotIdentifierSequence
: identifier (DOT identifier)*
;
The concern here is that dotIdentifierSequence can mean a number of things semantically, and not all of them are "paths". But at the moment they are all recognized as paths in the parse tree and then I need to handle them specially in my visitor.
But what I'd really like is a way to express the dotIdentifierSequence usages that are not paths into the expression rule rather than in the path rule, and still have dotIdentifierSequence in path to handle path usages.
To be clear, a dotIdentifierSequence might be any of the following:
A path - this is a SQL-like grammar and a path expression would be like a table or column reference in SQL, e.g. a.b.c
A Java class name - e.g. com.acme.SomeJavaType
A static Java field reference - e.g. com.acme.SomeJavaType.SOME_FIELD
A Java enum value reference - e.g. com.acme.Gender.MALE
The idea is that during visitation "dotIdentifierSequence as a path" resolves as a very different type from the other usages.
Any idea how I can do this?

The issue here is that you're trying to make a distinction between "paths" while being created in the parser. Constructing paths inside the lexer would be easier (pseudo code follows):
grammar T;
tokens {
JAVA_TYPE_PATH,
JAVA_FIELD_PATH
}
// parser rules
PATH
: IDENTIFIER ('.' IDENTIFIER)*
{
String s = getText();
if (s is a Java class) {
setType(JAVA_TYPE_PATH);
} else if (s is a Java field) {
setType(JAVA_FIELD_PATH);
}
}
;
fragment IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]*;
and then in the parser you would do:
expression
: JAVA_TYPE_PATH #javaTypeExpression
| JAVA_FIELD_PATH #javaFieldExpression
| PATH #pathExpression
;
But then, of course, input like this java./*comment*/lang.String would be tokenized wrongly.
Handling it all in the parser would mean manually looking ahead in the token stream and checking if either a Java type, or field exists.
A quick demo:
grammar T;
#parser::members {
String getPathAhead() {
Token token = _input.LT(1);
if (token.getType() != IDENTIFIER) {
return null;
}
StringBuilder builder = new StringBuilder(token.getText());
// Try to collect ('.' IDENTIFIER)*
for (int stepsAhead = 2; ; stepsAhead += 2) {
Token expectedDot = _input.LT(stepsAhead);
Token expectedIdentifier = _input.LT(stepsAhead + 1);
if (expectedDot.getType() != DOT || expectedIdentifier.getType() != IDENTIFIER) {
break;
}
builder.append('.').append(expectedIdentifier.getText());
}
return builder.toString();
}
boolean javaTypeAhead() {
String path = getPathAhead();
if (path == null) {
return false;
}
try {
return Class.forName(path) != null;
} catch (Exception e) {
return false;
}
}
boolean javaFieldAhead() {
String path = getPathAhead();
if (path == null || !path.contains(".")) {
return false;
}
int lastDot = path.lastIndexOf('.');
String typeName = path.substring(0, lastDot);
String fieldName = path.substring(lastDot + 1);
try {
Class<?> clazz = Class.forName(typeName);
return clazz.getField(fieldName) != null;
} catch (Exception e) {
return false;
}
}
}
expression
: {javaTypeAhead()}? path #javaTypeExpression
| {javaFieldAhead()}? path #javaFieldExpression
| path #pathExpression
;
path
: dotIdentifierSequence
;
dotIdentifierSequence
: IDENTIFIER (DOT IDENTIFIER)*
;
IDENTIFIER
: [a-zA-Z_] [a-zA-Z_0-9]*
;
DOT
: '.'
;
which can be tested with the following class:
package tl.antlr4;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.tree.ParseTreeWalker;
public class Main {
public static void main(String[] args) {
String[] tests = {
"mu",
"tl.antlr4.The",
"java.lang.String",
"foo.bar.Baz",
"tl.antlr4.The.answer",
"tl.antlr4.The.ANSWER"
};
for (String test : tests) {
TLexer lexer = new TLexer(new ANTLRInputStream(test));
TParser parser = new TParser(new CommonTokenStream(lexer));
ParseTreeWalker.DEFAULT.walk(new TestListener(), parser.expression());
}
}
}
class TestListener extends TBaseListener {
#Override
public void enterJavaTypeExpression(#NotNull TParser.JavaTypeExpressionContext ctx) {
System.out.println("JavaTypeExpression -> " + ctx.getText());
}
#Override
public void enterJavaFieldExpression(#NotNull TParser.JavaFieldExpressionContext ctx) {
System.out.println("JavaFieldExpression -> " + ctx.getText());
}
#Override
public void enterPathExpression(#NotNull TParser.PathExpressionContext ctx) {
System.out.println("PathExpression -> " + ctx.getText());
}
}
class The {
public static final int ANSWER = 42;
}
which would print the following to the console:
PathExpression -> mu
JavaTypeExpression -> tl.antlr4.The
JavaTypeExpression -> java.lang.String
PathExpression -> foo.bar.Baz
PathExpression -> tl.antlr4.The.answer
JavaFieldExpression -> tl.antlr4.The.ANSWER

Related

ANTLR Visitor of a rule with alternatives

I've a grammar rule like this:
fctDecl
: id AS FUNCTION O_PAR (argDecl (COMMA argDecl)*)? C_PAR COLON (scalar|VOID)
(DECLARE LOCAL (varDecl SEMICOLON)+)?
DO (instruction)+ (RETURN id)? DONE;
When i'm in the visitFctDecl, I have to visit all children of FctDecl. As children can be different type of rule, how can I know which is the type of the current child?
#Override
public FunctionNode visitFctDecl(B314Parser.FctDeclContext ctx) {
for (int i = 0; i < ctx.children.size() ;i++)
{
//what kind of rule is ?
ctx.children.get(i);
}
return null;
}
I'm not sure at all how to use visitor as it should be.

Start by chopping up that big parser rule. Something like this perhaps:
fctDecl
: id AS FUNCTION fctParams COLON ( scalar | VOID ) localDecl doStat
;
fctParams
: O_PAR ( argDecl ( COMMA argDecl )* )? C_PAR
;
localDecl
: ( DECLARE LOCAL ( varDecl SEMICOLON )+ )?
;
doStat
: DO instruction+ ( RETURN id )? DONE
;
and then in your visitor, you simply return custom classes like this (I'm returning Maps and Lists, but that could be your own domain objects):
public class DemoVisitor extends B314BaseVisitor<Object> {
// fctDecl
// : id AS FUNCTION fctParams COLON ( scalar | VOID ) localDecl doStat
// ;
#Override
public Object visitFctDecl(B314Parser.FctDeclContext ctx) {
Map<String, Object> result = new LinkedHashMap<String, Object>();
result.put("id", ctx.id().getText());
result.put("fctParams", visit(ctx.fctParams()));
result.put("type", ctx.scalar() != null ? ctx.scalar().getText() : ctx.VOID().getText());
result.put("localDecl", visit(ctx.localDecl()));
result.put("doStat", visit(ctx.doStat()));
return result;
}
// fctParams
// : O_PAR ( argDecl ( COMMA argDecl )* )? C_PAR
// ;
//
// argDecl
// : varDecl
// ;
//
// varDecl
// : ID AS type
// ;
#Override
public Object visitFctParams(B314Parser.FctParamsContext ctx) {
List<Map<String, String>> declarations = new ArrayList<Map<String, String>>();
if (ctx.argDecl() != null) {
for (B314Parser.ArgDeclContext argDecl : ctx.argDecl()) {
Map<String, String> declaration = new LinkedHashMap<String, String>();
declaration.put(argDecl.varDecl().ID().getText(), argDecl.varDecl().type().getText());
declarations.add(declaration);
}
}
return declarations;
}
// localDecl
// : ( DECLARE LOCAL ( varDecl SEMICOLON )+ )?
// ;
#Override
public Object visitLocalDecl(B314Parser.LocalDeclContext ctx) {
// See `visitFctParams(...)` how to traverse `( varDecl SEMICOLON )+`
return "TODO";
}
// doStat
// : DO instruction+ ( RETURN id )? DONE
// ;
#Override
public Object visitDoStat(B314Parser.DoStatContext ctx) {
// See `visitFctParams(...)` how to traverse `instruction+`
return "TODO";
}
}
If you now run the following class that uses the visitor above:
public class Main {
public static void main(String[] args) {
String source =
"mu as function(i as integer, b as boolean): integer\n" +
"declare local x as square; y as square;\n" +
"do\n" +
" skip\n" +
"done";
B314Lexer lexer = new B314Lexer(CharStreams.fromString(source));
B314Parser parser = new B314Parser(new CommonTokenStream(lexer));
ParseTree fctDeclTree = parser.fctDecl();
Object result = new DemoVisitor().visit(fctDeclTree);
System.out.printf("source:\n\n%s\n\nresult:\n\n%s\n", source, result);
}
}
the following will be printed on your console:
source:
mu as function(i as integer, b as boolean): integer
declare local x as square; y as square;
do
skip
done
result:
{id=mu, fctParams=[{i=integer}, {b=boolean}], type=integer, localDecl=TODO, doStat=TODO}

Creation of custom comparator for map in groovy

I have class in groovy
class WhsDBFile {
String name
String path
String svnUrl
String lastRevision
String lastMessage
String lastAuthor
}
and map object
def installFiles = [:]
that filled in loop by
WhsDBFile dbFile = new WhsDBFile()
installFiles[svnDiffStatus.getPath()] = dbFile
now i try to sort this with custom Comparator
Comparator<WhsDBFile> whsDBFileComparator = new Comparator<WhsDBFile>() {
#Override
int compare(WhsDBFile o1, WhsDBFile o2) {
if (FilenameUtils.getBaseName(o1.name) > FilenameUtils.getBaseName(o2.name)) {
return 1
} else if (FilenameUtils.getBaseName(o1.name) > FilenameUtils.getBaseName(o2.name)) {
return -1
}
return 0
}
}
installFiles.sort(whsDBFileComparator);
but get this error java.lang.String cannot be cast to WhsDBFile
Any idea how to fix this? I need to use custom comparator, cause it will be much more complex in the future.
p.s. full source of sample gradle task (description of WhsDBFile class is above):
project.task('sample') << {
def installFiles = [:]
WhsDBFile dbFile = new WhsDBFile()
installFiles['sample_path'] = dbFile
Comparator<WhsDBFile> whsDBFileComparator = new Comparator<WhsDBFile>() {
#Override
int compare(WhsDBFile o1, WhsDBFile o2) {
if (o1.name > o2.name) {
return 1
} else if (o1.name > o2.name) {
return -1
}
return 0
}
}
installFiles.sort(whsDBFileComparator);
}

You can try to sort the entrySet() :
def sortedEntries = installFiles.entrySet().sort { entry1, entry2 ->
entry1.value <=> entry2.value
}
you will have a collection of Map.Entry with this invocation. In order to have a map, you can then collectEntries() the result :
def sortedMap = installFiles.entrySet().sort { entry1, entry2 ->
...
}.collectEntries()

sort can also take a closure as parameter which coerces to a Comparator's compare() method as below. Usage of toUpper() method just mimics the implementation of FilenameUtils.getBaseName().
installFiles.sort { a, b ->
toUpper(a.value.name) <=> toUpper(b.value.name)
}
// Replicating implementation of FilenameUtils.getBaseName()
// This can be customized according to requirement
String toUpper(String a) {
a.toUpperCase()
}

Auto Mapper : how to map Expressions

public IEnumerable<CustomBo> FindBy(Expression<Func<CustomBo, bool>> predicate)
{
Mapper.CreateMap<Expression<Func<CustomBo, bool>>, Expression<Func<Entity, bool>>>();
var newPredicate = Mapper.Map<Expression<Func<Entity, bool>>>(predicate);
IQueryable<Entity> query = dbSet.Where(newPredicate);
Mapper.CreateMap<Entity,CustomBo>();
var searchResult = Mapper.Map<List<CustomBo>>(query);
return searchResult;
}
I want to map customBo type to Entity Type..
Here customBo is my model and Entity is Database entity from edmx.
I'm using AutoMapper.
I'm Getting following Error
Could not find type map from destination type Data.Customer to source type Model.CustomerBO. Use CreateMap to create a map from the source to destination types.
Could not find type map from destination type Data.Customer to source type Model.CustomerBO. Use CreateMap to create a map from the source to destination types.
Any Suggession what I'm missiong here..
Thanks

I find a work around. I create my custom methods to map Expression.
public static class MappingHelper
{
public static Expression<Func<TTo, bool>> ConvertExpression<TFrom, TTo>(this Expression<Func<TFrom, bool>> expr)
{
Dictionary<Expression, Expression> substitutues = new Dictionary<Expression, Expression>();
var oldParam = expr.Parameters[0];
var newParam = Expression.Parameter(typeof(TTo), oldParam.Name);
substitutues.Add(oldParam, newParam);
Expression body = ConvertNode(expr.Body, substitutues);
return Expression.Lambda<Func<TTo, bool>>(body, newParam);
}
static Expression ConvertNode(Expression node, IDictionary<Expression, Expression> subst)
{
if (node == null) return null;
if (subst.ContainsKey(node)) return subst[node];
switch (node.NodeType)
{
case ExpressionType.Constant:
return node;
case ExpressionType.MemberAccess:
{
var me = (MemberExpression)node;
var newNode = ConvertNode(me.Expression, subst);
MemberInfo info = null;
foreach (MemberInfo mi in newNode.Type.GetMembers())
{
if (mi.MemberType == MemberTypes.Property)
{
if (mi.Name.ToLower().Contains(me.Member.Name.ToLower()))
{
info = mi;
break;
}
}
}
return Expression.MakeMemberAccess(newNode, info);
}
case ExpressionType.AndAlso:
case ExpressionType.OrElse:
case ExpressionType.LessThan:
case ExpressionType.LessThanOrEqual:
case ExpressionType.GreaterThan:
case ExpressionType.GreaterThanOrEqual:
case ExpressionType.Equal: /* will probably work for a range of common binary-expressions */
{
var be = (BinaryExpression)node;
return Expression.MakeBinary(be.NodeType, ConvertNode(be.Left, subst), ConvertNode(be.Right, subst), be.IsLiftedToNull, be.Method);
}
default:
throw new NotSupportedException(node.NodeType.ToString());
}
}
}
Now I'm calling it like
public CustomBo FindBy(Expression<Func<CustomBo, bool>> predicateId)
{
var newPredicate = predicateId.ConvertExpression<CustomBo, Entity>();
}
Still if anyone know how to do it by automapper then plz let me know.
Thanks

Looks like this was added after your asked your question: Expression Translation (UseAsDataSource)
Now all you have to do is dbSet.UseAsDataSource().For<CustomBo>().Where(expression).ToList();. Much nicer!

ANTLR 4: Dynamic Token

Given:
grammar Hbs;
var: START_DELIM ID END_DELIM;
START_DELIM: '{{';
END_DELIM: '}}';
I'd like to know how to change START_DELIM and END_DELIM at runtime to be for example <% and %>.
Does anyone know how to do this in ANTLR 4?
Thanks.

There's a way, but you'll need to tie your grammar to a target language (as of now, the only target is Java).
Here's a quick demo (I included some comments to clarify things):
grammar T;
#lexer::members {
// Some default values
private String start = "<<";
private String end = ">>";
public TLexer(CharStream input, String start, String end) {
this(input);
this.start = start;
this.end = end;
}
boolean tryToken(String text) {
// See if `text` is ahead in the CharStream.
for(int i = 0; i < text.length(); i++) {
if(_input.LA(i + 1) != text.charAt(i)) {
// Nope, we didn't find `text`.
return false;
}
}
// Since we found the text, increase the CharStream's index.
_input.seek(_input.index() + text.length() - 1);
return true;
}
}
parse
: START ID END
;
START
: {tryToken(start)}? .
// The `.` is needed because a lexer rule must match at least 1 char.
;
END
: {tryToken(end)}? .
;
ID
: [a-zA-Z]+
;
SPACE
: [ \t\r\n] -> skip
;
The { ... }? is a semantic predicate. See: https://github.com/antlr/antlr4/blob/master/doc/predicates.md
Here's a small test class:
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
public class Main {
private static void test(TLexer lexer) throws Exception {
TParser parser = new TParser(new CommonTokenStream(lexer));
ParseTree tree = parser.parse();
System.out.println(tree.toStringTree(parser));
}
public static void main(String[] args) throws Exception {
// Test with the default START and END.
test(new TLexer(new ANTLRInputStream("<< foo >>")));
// Test with a custom START and END.
test(new TLexer(new ANTLRInputStream("<? foo ?>"), "<?", "?>"));
}
}
Run the demo as follows:
*nix
java -jar antlr-4.0-complete.jar T.g4
javac -cp .:antlr-4.0-complete.jar *.java
java -cp .:antlr-4.0-complete.jar Main
Windows
java -jar antlr-4.0-complete.jar T.g4
javac -cp .;antlr-4.0-complete.jar *.java
java -cp .;antlr-4.0-complete.jar Main
And you'll see the following being printed to the console:
(parse << foo >>)
(parse <? foo ?>)

Casting on run time using implicit con version

I have the following code which copies property values from one object to another objects by matching their property names:
public static void CopyProperties(object source, object target,bool caseSenstive=true)
{
PropertyInfo[] targetProperties = target.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
PropertyInfo[] sourceProperties = source.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
foreach (PropertyInfo tp in targetProperties)
{
var sourceProperty = sourceProperties.FirstOrDefault(p => p.Name == tp.Name);
if (sourceProperty == null && !caseSenstive)
{
sourceProperty = sourceProperties.FirstOrDefault(p => p.Name.ToUpper() == tp.Name.ToUpper());
}
// If source doesn't have this property, go for next one.
if(sourceProperty ==null)
{
continue;
}
// If target property is not writable then we can not set it;
// If source property is not readable then cannot check it's value
if (!tp.CanWrite || !sourceProperty.CanRead)
{
continue;
}
MethodInfo mget = sourceProperty.GetGetMethod(false);
MethodInfo mset = tp.GetSetMethod(false);
// Get and set methods have to be public
if (mget == null)
{
continue;
}
if (mset == null)
{
continue;
}
var sourcevalue = sourceProperty.GetValue(source, null);
tp.SetValue(target, sourcevalue, null);
}
}
This is working well when the type of properties on target and source are the same. But when there is a need for casting, the code doesn't work.
For example, I have the following object:
class MyDateTime
{
public static implicit operator DateTime?(MyDateTime myDateTime)
{
return myDateTime.DateTime;
}
public static implicit operator DateTime(MyDateTime myDateTime)
{
if (myDateTime.DateTime.HasValue)
{
return myDateTime.DateTime.Value;
}
else
{
return System.DateTime.MinValue;
}
}
public static implicit operator MyDateTime(DateTime? dateTime)
{
return FromDateTime(dateTime);
}
public static implicit operator MyDateTime(DateTime dateTime)
{
return FromDateTime(dateTime);
}
}
If I do the following, the implicit cast is called and everything works well:
MyDateTime x= DateTime.Now;
But when I have a two objects that one of them has a DateTime and the other has MyDateTime, and I am using the above code to copy properties from one object to other, it doesn't and generate an error saying that DateTime can not converted to MyTimeDate.
How can I fix this problem?

One ghastly approach which should work is to mix dynamic and reflection:
private static T ConvertValue<T>(dynamic value)
{
return value; // This will perform conversion automatically
}
Then:
var sourceValue = sourceProperty.GetValue(source, null);
if (sourceProperty.PropertyType != tp.PropertyType)
{
var method = typeof(PropertyCopier).GetMethod("ConvertValue",
BindingFlags.Static | BindingFlags.NonPublic);
method = method.MakeGenericMethod(new[] { tp.PropertyType };
sourceValue = method.Invoke(null, new[] { sourceValue });
}
tp.SetValue(target, sourceValue, null);
We need to use reflection to invoke the generic method with the right type argument, but dynamic typing will use the right conversion operator for you.
Oh, and one final request: please don't include my name anywhere near this code, whether it's in comments, commit logs. Aargh.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Semantically disambiguating an ambiguous syntax - antlr4

Related

ANTLR Visitor of a rule with alternatives

Creation of custom comparator for map in groovy

Auto Mapper : how to map Expressions

ANTLR 4: Dynamic Token

Casting on run time using implicit con version

Categories

Resources