I am totally new to IBM ODM and have been given a set of rules to design in the IBM ODM Rule Designer.
I managed to write simple if-else and conditional rules, but I am stuck on how to write rules involving regular expressions in IBM ODM. Can someone please help?
I have a member variable of my XOM class which is a String, and I need to validate that it contains only numbers and is exactly 8 characters long.

As a long-time user of ODM/JRules, it's my opinion that this is not a high-value use of business rules and that in the long run these rules will not be worthwhile.
Having said that, it should be easy enough to write a couple of BOM or XOM methods to do what you want.
boolean containsOnlyNumbers(String string) {}
Verbalization: "{0} contains only numbers"
boolean hasLength(String string, int length) {}
Verbalization: "{0} is {1} characters long"
Define these methods as static, on any class you want, perhaps a Utility class created just for them. Fill in the body of the methods with Java code to do the obvious things. Then verbalize them so your rule reads nicely:
If X contains only numbers and X is 8 characters long then
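For illustration, the two method bodies could be filled in roughly as follows. This is only a sketch: the class name StringChecks and the method name hasLength are made up here, "numbers" is assumed to mean the ASCII digits 0-9, and null is treated as failing both checks. In a real XOM the class and methods would need to be public so they can be imported into the BOM.

```java
// Hypothetical utility class for the XOM; the names are illustrative only.
final class StringChecks {

    private StringChecks() {
        // static helpers only, no instances
    }

    /** True when the value is non-null, non-empty and made up only of the digits 0-9. */
    static boolean containsOnlyNumbers(String string) {
        // String.matches tests the whole string against the regular expression
        return string != null && string.matches("[0-9]+");
    }

    /** True when the value is non-null and exactly {@code length} characters long. */
    static boolean hasLength(String string, int length) {
        return string != null && string.length() == length;
    }
}
```

Both methods return boolean so that each verbalization can be used as a condition in the rule.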


Exclude some characters in Unicode category

I'm trying to implement a rule along the lines of "all characters in the Letter and Symbol Unicode categories except a few reserved characters." From the lexer rules, I know I can use \p{___} to match against Unicode categories, but I am unsure of how to handle excluding certain characters.
Looking at example grammars, I am led a few different directions. For example, the Java 9 grammar seems to use predicates in order to directly use Java's built in isJavaIdentifier() while others manually define every valid character.
How can I achieve this functionality?
Without target-specific code, you will have to define the ranges yourself so that the chars you want to exclude are not part of these ranges. You cannot use \p{...} and then exclude certain characters from it.
With target-specific code, you can do as in the Java 9 grammar:
@lexer::members {
  boolean aCustomMethod(int character) {
    // Your logic to see if 'character' is valid. You're sure
    // that it's at least a char from \p{Letter} or \p{Symbol}
    return true;
  }
}

// TOKEN is a placeholder rule name
TOKEN
 : [\p{Letter}\p{Symbol}] {aCustomMethod(_input.LA(-1))}?
 ;
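As a rough idea of what the body of such a custom method might look like, here is a small Java sketch. The class name IdentifierChars and the reserved characters ($, =, <, >) are assumptions chosen purely for the example. The category test is repeated inside the method so the class can be used and tested on its own.

```java
// Hypothetical helper; the reserved characters below are placeholders.
final class IdentifierChars {

    // Characters to exclude even though they fall in the Symbol category.
    private static final String RESERVED = "$=<>";

    private IdentifierChars() {
    }

    /** True for code points in the Unicode Letter (L*) or Symbol (S*) categories. */
    static boolean isLetterOrSymbol(int codePoint) {
        if (Character.isLetter(codePoint)) {
            return true;
        }
        switch (Character.getType(codePoint)) {
            case Character.MATH_SYMBOL:
            case Character.CURRENCY_SYMBOL:
            case Character.MODIFIER_SYMBOL:
            case Character.OTHER_SYMBOL:
                return true;
            default:
                return false;
        }
    }

    /** True when the code point is a letter or symbol and is not reserved. */
    static boolean isAllowed(int codePoint) {
        return isLetterOrSymbol(codePoint) && RESERVED.indexOf(codePoint) < 0;
    }
}
```

Inside the grammar, the predicate would then read something like {IdentifierChars.isAllowed(_input.LA(-1))}?.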

ANTLR get first production

I'm using ANTLR4 and, in particular, the C grammar available in their repo (grammar). The grammar doesn't seem to have a designated initial rule, so I was wondering how to get it. Once the parser is initialized, I attach my listener, but I get syntax errors when I try to parse two files with different code instructions:
int a;
int foo() { return 0; }
In my example I call the parser with "parser.primaryExpression();", which is the first production of the "g4" file. Is it possible to avoid calling the first production explicitly and have ANTLR pick it automatically instead?
In addition to @GRosenberg's answer:
Also the rule enum (in the generated parser) contains entries for each rule in the order they appear in the grammar and the first rule has the value 0. However, just because it's the first rule in the grammar doesn't mean that it is the main entry point. Only the grammar author knows what the real entry is and sometimes you might even want to parse only with a subrule, which makes this decision even harder.
ANTLR provides no API to obtain the first rule. However, in the parser as generated, the field
public static final String[] ruleNames = ....;
lists the rule names in the order of occurrence in the grammar. With reflection, you can then look up and invoke the parser method that has the same name as the first entry.
Beware: nothing in the ANTLR 'spec' defines this ordering. It has simply been true to date.
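A minimal sketch of that reflection approach is shown below. DemoParser is only a stand-in so the example runs without the ANTLR runtime. A real generated parser would take its place. The helper name invokeFirstRule is made up, and the sketch assumes the first rule takes no arguments.

```java
import java.lang.reflect.Method;

final class FirstRule {

    /** Stand-in for a generated parser; a real one is produced by the ANTLR tool. */
    public static final class DemoParser {
        public static final String[] ruleNames = { "primaryExpression", "compilationUnit" };

        public String primaryExpression() { return "primaryExpression"; }

        public String compilationUnit() { return "compilationUnit"; }
    }

    private FirstRule() {
    }

    /** Reads the static ruleNames field and invokes the method named after its first entry. */
    static Object invokeFirstRule(Object parser) {
        try {
            Class<?> type = parser.getClass();
            String[] ruleNames = (String[]) type.getField("ruleNames").get(null);
            Method rule = type.getMethod(ruleNames[0]);
            return rule.invoke(parser);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not invoke the first rule", e);
        }
    }
}
```

With a real parser the returned object would be the rule's context. As noted above, the first rule is not necessarily the intended entry point.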

Types for strings escaped/encoded differently

Recently I have been dealing with escaping/encoding issues. I have a bunch of APIs that receive and return Strings encoded/escaped differently. In order to clean up the mess I'd like to introduce new types XmlEscapedString, HtmlEscapedString, UrlEncodedString, etc. and use them instead of Strings.
The problem is that the compiler cannot check the encoding/escaping and I'll have runtime errors.
I can also provide "conversion" functions that escape/encode input as necessary. Does it make sense?
The compiler can enforce that you pass the types through your encoding/decoding functions; this should be enough, provided you get things right at the boundaries (if you have a correctly encoded XmlEscapedString and convert it to a UrlEncodedString, the result is always going to be correctly encoded, no?). You could use constructors or conversion methods that check the escaping initially, though you might pay a performance penalty for doing so.
(Theoretically it might be possible to check a string's escaping at compile time using type-level programming, but this would be exceedingly difficult and only work on literals anyway, when it sounds like the problem is Strings coming in from other APIs).
My own compromise position would probably be to use tagged types (using Scalaz tags) and have the conversion from untagged String to tagged string perform the checking, i.e.:
import scalaz._, Scalaz._
sealed trait XmlEscaped
def xmlEscape(rawString: String): String @@ XmlEscaped = {
  //perform escaping, guaranteed to return a correctly-escaped String
  Tag[String, XmlEscaped](escapedString)
}
def castToXmlEscaped(escapedStringFromJavaApi: String) = {
  require(...) //confirm that string is properly escaped
  Tag[String, XmlEscaped](escapedStringFromJavaApi)
}
def someMethodThatRequiresAnEscapedString(string: String @@ XmlEscaped)
Then we use castToXmlEscaped for Strings that are already supposed to be XML-escaped, so we check there, but we only have to check once; the rest of the time we pass it around as a String @@ XmlEscaped, and the compiler will enforce that we never pass a non-escaped string to a method that expects one.
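The same "check once at the boundary" idea also works without Scalaz, using a plain wrapper type. The sketch below is in Java, since the strings here come from Java APIs. The names XmlEscapedString.escape and XmlEscapedString.checked are made up, and the check only recognizes the five predefined XML entities plus numeric character references.

```java
import java.util.regex.Pattern;

final class XmlEscapedString {

    // A raw '<', or an '&' that does not start a predefined or numeric entity.
    private static final Pattern UNESCAPED =
            Pattern.compile("<|&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    private final String value;

    // Private constructor: instances can only come from the two factories below.
    private XmlEscapedString(String value) {
        this.value = value;
    }

    /** Escapes raw text; the result is correctly escaped by construction. */
    static XmlEscapedString escape(String raw) {
        String escaped = raw.replace("&", "&amp;") // must run first
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
        return new XmlEscapedString(escaped);
    }

    /** Wraps text that is supposed to be escaped already, verifying it once. */
    static XmlEscapedString checked(String alreadyEscaped) {
        if (UNESCAPED.matcher(alreadyEscaped).find()) {
            throw new IllegalArgumentException("Not XML-escaped: " + alreadyEscaped);
        }
        return new XmlEscapedString(alreadyEscaped);
    }

    String value() {
        return value;
    }
}
```

A method that declares a parameter of type XmlEscapedString can then never receive a plain String by accident, which is the property the tagged type gives you in Scala.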

Can I put one check on a lexical element instead of on a number of parser rules?

I'm trying to use ANTLR4 with the IDL.g4 grammar to implement some checks that our IDL files must follow. One rule is about names. The rules are:
ID contains only letters, digits and single underscores,
ID begins with a letter,
ID ends with a letter or digit,
ID is not a reserved word in Ada, C, C++, Java or IDL.
One way to do this check is to write a function that checks a string for these properties and call it in the exit listeners for every rule that has an ID, e.g. (referring to IDL.g4) in exitConst_decl(), exitInit_decl(), exitSimple_declarator() and many more places. Maybe that is the correct way to do it, but I was thinking about putting the check directly on the lexical element ID. I don't know how to do that, or whether it is possible at all.
Validating this type of constraint in the lexer would make it significantly more difficult to provide usable error messages for invalid identifiers. However, you can create a new parser rule identifier, and replace all references to ID in various parser rules to reference identifier instead.
identifier
    : ID
    ;
You can then place your identifier validation logic inside of the single method enterIdentifier instead of all of the various rules that currently reference ID.
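The validation logic itself could be a small helper that the listener method calls with the text of the ID token (for example ctx.getText()). The sketch below makes a few assumptions: the class name IdlNameCheck is made up, "letters" means ASCII letters, reserved words are compared case-insensitively, and the reserved-word set is only a tiny sample of what a real list for Ada, C, C++, Java and IDL would contain.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

final class IdlNameCheck {

    // Starts with a letter; afterwards every underscore must be followed by a
    // letter or digit, which rules out doubled and trailing underscores.
    private static final Pattern SHAPE = Pattern.compile("[A-Za-z](?:_?[A-Za-z0-9])*");

    // Illustrative sample only, not a complete reserved-word list.
    private static final Set<String> RESERVED = new HashSet<>(
            Arrays.asList("begin", "class", "interface", "struct", "while"));

    private IdlNameCheck() {
    }

    /** True when the name has the required shape and is not a reserved word. */
    static boolean isValid(String id) {
        return SHAPE.matcher(id).matches()
                && !RESERVED.contains(id.toLowerCase(Locale.ROOT));
    }
}
```

Keeping the check in one plain method also makes it easy to report a specific error message from the listener when it returns false.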

Get argument names in String Interpolation in Scala 2.10

As of Scala 2.10, the following interpolation is possible.
val name = "someName"
val interpolated = s"Hello world, my name is $name"
It is now also possible to define custom string interpolations, as described in the "Advanced usage" section of the Scala documentation.
My question is: is there a way to obtain the original string, before interpolation, including any interpolated variable names, from inside the implicit class that defines the new interpolation for strings?
In other words, I want to be able to define an interpolation x in such a way that when I call
x"My interpolated string has a $name"
I can obtain the string exactly as seen above, without the $name part being replaced, inside the interpolation.
Edit: the reason I want to do this is that I want to take the original string, replace it with another, internationalized string, and then substitute the variable values. This is the main reason I want to get the original string with no interpolation performed on it.
Thanks in advance.
Since Scala's string interpolation can handle arbitrary expressions within ${} it has to evaluate the arguments before passing them to the formatting function. Thus, direct access to the variable names is not possible by design. As pointed out by Eugene, it is possible to get the name of a plain variable by using macros. I don't think this is a very scalable solution, though. After all, you'll lose the possibility to evaluate arbitrary expressions. What, for instance, will happen in this case:
x"My interpolated string has a ${"Mr. " + name}"
You might be able to extract the variable name by using macros, but it might get complicated for arbitrary expressions. My suggestion would be: if the name of your variable should be meaningful within the string interpolation, make it a part of the data structure. For example, you can do the following:
case class NamedValue(variableName: String, value: Any)
val name = NamedValue("name", "Some Name")
x"My interpolated string has a $name"
The objects are passed as Any* to the x. Thus, you now can match for NamedValue within x and you can do specific things depending on the "variable name", which now is part of your data structure. Instead of storing the variable name explicitly you could also exploit a type hierarchy, for instance:
sealed trait InterpolationType
case class InterpolationTypeName(name: String) extends InterpolationType
case class InterpolationTypeDate(date: String) extends InterpolationType
val name = InterpolationTypeName("Someone")
val date = InterpolationTypeDate("2013-02-13")
x"$name is born on $date"
Again, within x you can match for the InterpolationType subtype and handle things according to the type.
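Once the names travel with the values, the internationalization step mentioned in the question reduces to a lookup by name in a translated template. The interpolator itself has to be written in Scala. The substitution it would delegate to is sketched here in Java, under stated assumptions: the class name Templates, the method render and the {name} placeholder syntax are all made up for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Templates {

    // Placeholders look like {name}; this syntax is an assumption of the sketch.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    private Templates() {
    }

    /** Replaces each {name} with the value stored under that name, if any. */
    static String render(String template, Map<String, ?> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = values.get(m.group(1));
            // Unknown names are left untouched so the gap stays visible.
            String replacement = value == null ? m.group() : String.valueOf(value);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The map passed to render would be built from the NamedValue arguments (variable name to value), and the template would be the translated string looked up for the current locale.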
It seems that's not possible. String interpolation is a compile-time feature that rewrites the example to:
StringContext("My interpolated string has a ", "").x(name)
As you can see, the $name part is already gone. It became really clear to me when I looked at the source code of StringContext.
If you define x as a macro, then you will be able to see the tree of the desugaring produced by the compiler (as shown by @EECOLOR). In that tree, the "name" argument will be seen as Ident(newTermName("name")), so you'll be able to extract a name from there. Be sure to take a look at the macro and reflection guides to learn how to write macros and work with trees.
