ArchUnit: how can one silence references to arrays of basic types for mayOnlyAccessLayers(...)?

For a big GWT-based application I am defining an architecture check to prevent our junior programmers from accidentally referencing classes that are not GWT-serializable; objects of such classes cannot make it "over the wire" and throw exceptions at runtime.
For this I am using a layer-check like so:
@ArchTest
void checkGWTAccess(JavaClasses classes) {
    LayeredArchitecture layers = Architectures.layeredArchitecture()
        .layer("Client").definedBy("..client..")
        .layer("Shared").definedBy("..shared..")
        ... <many entries omitted here> ...
        ;
    ArchRule rule = layers
        .whereLayer("Client").mayOnlyAccessLayers("Shared", "DomDtos", "DomBase", "DomConfig",
            ... <many entries omitted here> ...
        );
    rule.check(classes);
}
"In principle" this works ok EXCEPT that I am still getting errors for all fields that are arrays of basic types or collections of basic types (i.e. int[] digits;, char[] buffer;, List<Integer> allowedValues;) as well as methods returning arrays or collections of basic types (i.e. int[] methodA(...) {...}, String[] methodB(...) {...}, List<Character> methodC(...) {...}, etc.).
These error read e.g.:
Field <com.example.client.ui.components.converter.TimeFieldConverter.timeNumberDigits> has type <[I> in (TimeFieldConverter.java:0)
IMHO this is a bug in ArchUnit. I cannot imagine how such arrays of basic types could possibly violate any layer-check rule. But I guess that's another discussion...
For now my question is: how can one exclude these types from a <layer>.mayOnlyAccessLayers(...) check?
Or, alternatively, can one add these types to the list of allowed layers (if it is possible to identify them and/or declare them as a "layer")?
The purpose is simply to avoid getting any rule violations for these references, so that the tests run green.
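To make it concrete, the kind of escape hatch I am hoping for would look roughly like the sketch below, built on LayeredArchitecture.ignoreDependency(...). This is only a sketch: whether the predicate-based overload, DescribedPredicate.describe(...) and JavaClass.isArray() are all available depends on the ArchUnit version in use.

// Sketch only - verify against your ArchUnit version.
// needs: com.tngtech.archunit.base.DescribedPredicate
//        com.tngtech.archunit.core.domain.JavaClass
layers = layers.ignoreDependency(
    DescribedPredicate.alwaysTrue(),                 // from any origin class...
    DescribedPredicate.describe("is an array type",  // ...onto any array type
        (JavaClass target) -> target.isArray()));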

Related

Ways of keeping ANTLR4 grammar target independent

I'm writing a grammar for a C++ target, but I'd like to keep it working with Java as well, since ANTLR comes with great tools that work for grammars with a Java target. The book ("The Definitive ANTLR 4 Reference") says that the way to achieve target independence is to use listeners and/or visitors. There is one problem, though: any predicate, local variable, custom constructor, custom token class, etc. that I might need introduces a target-language dependence that cannot be removed, at least according to the book. Since the book might be outdated, here are the questions:
Is there a way of declaring primitive variables in language independent way, something like:
item[$bool hasAttr]
:
type ( { $hasAttr }? attr | ) ID
;
where $bool would be translated to bool in C++ but to boolean in Java (a workaround would be to use int in that case, but most likely that would not work in all potential targets)
Is there a way of declaring certain code fragments to be for a specific target only, something like:
parser grammar testParser;
options
{
tokenVocab=testLexer;
}
@header
<lang=Cpp>{
#include "utils/helper.h"
}
<lang=Java>{
import test.utils.THelper;
}
@members
<lang=Cpp>{
public:
testParser(antlr4::TokenStream *input, utils::THelper *helper);
private:
utils::THelper *Helper;
public:
}
<lang=Java>{
public testParser(TokenStream input, THelper helper) {
this(input);
Helper = helper;
}
private THelper Helper;
}
start
:
(
<lang=Cpp>{ Helper->OnUnitStart(this); }
<lang=Java>{ Helper.OnUnitStart(this); }
unit
<lang=Cpp>{ _localctx = Helper->OnUnitEnd(this); }
<lang=Java>{ _localctx = Helper.OnUnitEnd(this); }
)*
EOF
;
...
For the time being I'm keeping two separate grammars, changing the Java one and merging the changes into the C++ one once I'm happy with the results, but if possible I'd rather keep it all in one file.
This target dependency is a real nuisance, and I've been thinking for a while about how to get rid of it in a good way. I haven't found anything fully usable yet.
What you can do is stay with syntax that both Java and C++ understand, e.g. write a predicate like a function call: a: { isValid() }? b c; and implement such functions in a base class from which you derive your parser (ANTLR lets you specify such a base class via the grammar option superClass).
The C++ target also has a number of additional named actions which you can use for C++-specific code only.
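A minimal sketch of that superClass approach (the names testParserBase and isValid are made up for illustration):

parser grammar testParser;
options {
    tokenVocab = testLexer;
    superClass = testParserBase;
}
// the predicate reads as a plain function call in both Java and C++ actions:
a : { isValid() }? b c ;

The Java side of the base class could then look like this (the C++ target would get an equivalent hand-written header/source pair):

import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.TokenStream;

public abstract class testParserBase extends Parser {
    public testParserBase(TokenStream input) { super(input); }

    // target-specific logic lives here, outside the grammar
    protected boolean isValid() { return true; }
}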

MPS way of attaching additional attributes to a concept's properties/references

I have a set of concepts that represent types of entities.
Sample concepts:
Loop with children loopCount: IntegerProperty[1]
HttpRequest with children url: StringProperty[1], hostName: StringProperty[1]
Both concepts extend the AbstractTestElement concept (it defines common properties like name, comment, etc.).
I want Loop and HttpRequest to be generated to baseLanguage as follows:
Loop:
Loop e = new Loop();
e.setProperty(new IntegerProperty("loopCount", node.loopCount));
HttpRequest:
HttpRequest e = new HttpRequest();
e.setProperty(new StringProperty("url", node.url));
e.setProperty(new IntegerProperty("host", node.hostName));
What I want is to have some common generator template that covers this common logic for setProperty so it is not repeated for different kinds of test elements.
Well, some properties require test-element-specific treatment; often, however, properties are translated one-to-one, hence the wish for a common template.
Here's the question: how can I attach metadata to the Loop/HttpRequest concept configuration?
What is the MPS-idiomatic way of doing that?
1) While I could use the names of the properties as the names put into the new XXXProperty, ideally I would use HttpRequest.HOST_PROPERTY_NAME-style references, so plain property names are not sufficient.
2) I could probably invent annotations and annotate the properties of my concepts, but it looks like MPS itself does not use that approach.
3) (Ab)using a concept's behavior to return <quotation new StringProperty("url", node.url) > looks even more awkward.
I would rather not use 2. and 3., because both approaches add generator behavior into aspects of your language which shouldn't be aware of how things will be generated. It basically tightly couples your structure to your generator.
If you go for 1., you can still use the static class approach: create a new root node in the generator which is a Java class containing all your fields, and then have a generic generator template that reduces the IntegerProperty and so on. If the properties have a common super concept, it should be fairly easy to do. You just have to make sure that the property is generated before the containing concept; that way you can still access its role in the parent and use that information to generate the field access.
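If it helps to picture it, such a generated constants root node could look roughly like this plain-Java sketch (class and field names are made up for illustration):

public final class HttpRequestProperties {
    // one constant per property, generated from the concept structure
    public static final String URL_PROPERTY_NAME = "url";
    public static final String HOST_PROPERTY_NAME = "host";

    private HttpRequestProperties() {}
}
// the generic template can then emit e.g.:
// e.setProperty(new StringProperty(HttpRequestProperties.HOST_PROPERTY_NAME, node.hostName));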

Can one change/influence JAXB's code generation?

I was wondering whether one can influence the "style" of the code that JAXB generates from XML schema (.xsd) files. E.g. I would like to:
emit a comment inside newly generated classes, specifically if the class is empty, since that triggers warnings in my environment.
change all setter-methods to return the object instead of "void", so one can do call-chaining like:
X someMethod() {
    return new X().setFoo(5).setBar("something");
}
instead of the tedious:
X someMethod() {
    X x = new X();
    x.setFoo(5);
    x.setBar("something");
    return x;
}
Is there some "template" anywhere that JAXB uses and that one could tweak, to achieve such things? Or is that all hard-coded?
There is no template for modifying the generated code easily.
There are, however, a number of plugins. For instance: https://java.net/projects/jaxb2-commons/pages/Fluent-api which does just what you want for your 2nd bullet.
There are other plugins, e.g. for adding annotations that suppress warnings, which may help with the 1st bullet.
As an extra, I'd like to mention that not generating Java classes from an XML schema but writing them by hand (plus annotations, of course) is a plausible alternative, provided the XML schema isn't too complex. It may have other advantages besides solving #1 and #2.
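To illustrate the 2nd bullet: a fluent-api plugin generates chainable with...() variants next to the normal setters, roughly like the hand-written sketch below (the method naming and the -Xfluent-api activation flag are from memory, so check the plugin's documentation):

// xjc -extension -Xfluent-api schema.xsd   (activation flag from memory)
public class X {
    private int foo;
    private String bar;

    public void setFoo(int value) { this.foo = value; }
    public void setBar(String value) { this.bar = value; }

    // generated chainable variants return 'this' instead of void:
    public X withFoo(int value) { setFoo(value); return this; }
    public X withBar(String value) { setBar(value); return this; }
}
// which enables exactly the call-chaining from the question:
// return new X().withFoo(5).withBar("something");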

How to compare 2 xsd schema files for equivalent functionality

I would like to compare two XSD schemas A and B to determine that all instance documents valid against schema A would also be valid against schema B. I hope to use this to prove that even though schemas A and B are "different", they are effectively the same. An example of a difference this should not flag: schema A uses named types while schema B declares all of its elements inline.
I have found lots of people talking about "smart" diff-type tools, but these would claim the two files are different because they have different text, even though the resulting structure is the same. I found some references to XSOM, but I'm not sure whether that will help.
Any thoughts on how to proceed?
Membrane SOA Model - Java API for WSDL and XML Schema
package sample.schema;

import java.util.List;
import com.predic8.schema.Schema;
import com.predic8.schema.SchemaParser;
import com.predic8.schema.diff.SchemaDiffGenerator;
import com.predic8.soamodel.Difference;

public class CompareSchema {

    public static void main(String[] args) {
        compare();
    }

    private static void compare() {
        SchemaParser parser = new SchemaParser();
        // parse the two schema versions to compare
        Schema schema1 = parser.parse("resources/diff/1/common.xsd");
        Schema schema2 = parser.parse("resources/diff/2/common.xsd");
        // compute the list of differences between them
        SchemaDiffGenerator diffGen = new SchemaDiffGenerator(schema1, schema2);
        List<Difference> lst = diffGen.compare();
        for (Difference diff : lst) {
            dumpDiff(diff, "");
        }
    }

    // print a difference, then recurse into its nested differences, indenting each level
    private static void dumpDiff(Difference diff, String level) {
        System.out.println(level + diff.getDescription());
        for (Difference localDiff : diff.getDiffs()) {
            dumpDiff(localDiff, level + "  ");
        }
    }
}
After executing it you get the output shown below: a list of the differences between the two schema documents.
ComplexType PersonType has changed: Sequence has changed:
Element id has changed:
The type of element id has changed from xsd:string to tns:IdentifierType.
http://www.service-repository.com/ offers an online XML Schema Version Comparator tool that displays a report of the differences between two XSD that appears to be produced from the Membrane SOA Model.
My approach to this was to canonicalize the representation of the XML Schema.
Unfortunately, I can also tell you that, unlike canonicalization of XML documents (used, for example, to calculate a digital signature), it is neither simple nor standardized.
So basically, you have to transform both XML Schemas to a "canonical form" (whatever the tool you build or use considers that form to be) and then do the comparison.
My approach was to create an XML Schema set (possibly more than one file if you have multiple namespaces) for each root element I needed, since I found it easier to compare XSDs authored in the Russian Doll style, starting from the PSVI model.
I then used options such as automatically matching substitution group members coupled with replacing substitution groups with a choice; removing "superfluous" XML Schema sequences; collapsing single-option choices or moving minOccurs/maxOccurs around for single-item compositors; etc.
Depending on the features of the XSD-aware comparison tool you use or build, you might also have to rearrange particles under compositors such as xsd:choice or xsd:all, etc.
Anyway, what I learned from all of this was that it is extremely hard to build a tool that works well for all the "cool" XSD features out there... One test case I remember fondly was dealing with various xsd:any content.
I do wonder though if things have changed since...

Are string constants overrated?

It's easy to lose track of what magic numbers like 0, 1, or 5 mean. I used to be very strict about this when I wrote low-level C code. As I work more with all the string literals involved with XML and SQL, I find myself often breaking the rule against embedding constants in code, at least when it comes to string literals. (I'm still good about numeric constants.)
Strings aren't the same as numbers. It feels tedious and a little silly to create a compile-time constant that has the same name as its value (e.g. const string NameField = "Name";). And although repeating the same string literal in many locations seems risky, there's little chance of a typo thanks to copying and pasting, and when I refactor I'm usually doing a global search that involves changing more than just the name of the thing, such as how it's treated functionally in relation to the things around it.
So, let's say you don't have a good XML serializer (or aren't in the mood to set one up). Which of these would you personally use (if you weren't trying to bow to peer pressure in some code review):
static void Main(string[] args)
{
    // ...other code...
    XmlNode node = ...;
    Console.WriteLine(node["Name"].InnerText);
    Console.WriteLine(node["Color"].InnerText);
    Console.WriteLine(node["Taste"].InnerText);
    // ...other code...
}
or:
class Fruit
{
    private readonly XmlNode xml_node;

    public Fruit(XmlNode xml_node)
    {
        this.xml_node = xml_node;
    }

    public string Name
    { get { return xml_node["Name"].InnerText; } }

    public string Color
    { get { return xml_node["Color"].InnerText; } }

    public string Taste
    { get { return xml_node["Taste"].InnerText; } }
}

static void Main(string[] args)
{
    // ...other code...
    XmlNode node = ...;
    Fruit fruit_node = new Fruit(node);
    Console.WriteLine(fruit_node.Name);
    Console.WriteLine(fruit_node.Color);
    Console.WriteLine(fruit_node.Taste);
    // ...other code...
}
A defined constant is easier to refactor. If "Name" ends up being used three times and you change it to "FullName", changing the constant is one change instead of three.
For something like that it depends on how often the constant is used. If it's just in one place as per your example, then hard-coding is fine. If it's used in many different places, definitely use a constant. One typo could lead to hours of debugging if you're not careful, because your compiler isn't going to notice that you typed "Tsate" instead of "Taste", while it WILL notice that you typed fruit_node.Tsate instead of fruit_node.Taste.
Edit:
I see now that you mentioned copying and pasting, but if you're doing that you may be losing the time you saved by not creating a constant in the first place. With IntelliSense and auto-completion, you could have the constant out there in a few keystrokes, instead of going through the trouble of copy/paste.
As you probably guessed, the answer is: it depends on the context.
It depends on what the example code is part of. If it's just part of a small throw-away system, then hard-coding the constants may be acceptable.
If it's part of a large, complex system and the constants will be used in multiple files, I'd be more drawn to the second option.
As in many matters of programming, this is a matter of taste. The "laws" of proper programming were created from experience -- many people have been burned by global variables causing namespace or clarity problems, so Global Variables Are Evil. Many have used magic numbers, only to later discover that the number was wrong or needed changing. Text search is ill-suited to changing these values, so Constants In Code Are Evil.
But both are permitted, because sometimes they aren't evil. You need to make the decision yourself -- which leads to clearer code? Which is going to be better for maintainers? Does the reasoning behind the original rule apply to my situation? If I had to read or maintain this code later, how would I rather that it were written?
There is no absolute law of good coding style, because no two programmers' minds work exactly alike. The rule is to write the clearest, cleanest code that you can.
Personally, I'd load the fruit from the XML file in advance - something like:
public class Fruit
{
    public Fruit(string name, Color color, string taste)
    {
        this.Name = name; this.Color = color; this.Taste = taste;
    }

    public string Name { get; private set; }
    public Color Color { get; private set; }
    public string Taste { get; private set; }
}

// ... In your data access handling class...
public static Fruit FruitFromXml(XmlNode node)
{
    // create fruit from xml node with validation here
}
That way, the "fruit" isn't really tied to the storage.
I'd go with the constants. It is a little more work, but there is no performance impact. And even if you usually copy/paste the values, I've certainly had instances where I changed code when I typed and didn't realize that Visual Studio had focus. I'd much prefer these resulted in compile errors.
For the example given, where the strings are used as keys to a map or dictionary, I would lean toward using an enum (or other object) instead. You can often do much more with an enum than with a constant string. In addition, if some code is commented out, IDEs will often miss it when doing a refactor. Also, references to a string constant that appear in comments may or may not be included in a refactor.
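For example, a Java sketch of that enum-as-key idea (the names are illustrative, and the XML access call is assumed, not taken from the original C# example):

enum FruitField {
    NAME("Name"), COLOR("Color"), TASTE("Taste");

    private final String tag;
    FruitField(String tag) { this.tag = tag; }
    String tag() { return tag; }
}
// usage: node.get(FruitField.NAME.tag()) instead of node.get("Name");
// a typo like FruitField.TSATE fails to compile, while "Tsate" would not.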
I will make a constant for a string when the string will be used in many locations, the string is long or complicated (such as a regex), or when a properly-named constant will make the code more obvious.
I prefer my typos, incomplete refactorings, and other bugs of this sort to fail to compile rather than to just fail to operate properly.
Like many other refactorings, it's an arguably optional additional step that leaves you with code that's less risky to maintain and is more easily grokked by the "next guy". If you're in a situation that rewards that kind of thing (most that I'm in do), go for it.
Yeah, pretty much.
I think developers in statically typed languages have an unhealthy fear of anything at all dynamic. Pretty much every line of code in a dynamically typed language is effectively a string literal, and they've been fine for years. For instance, in JavaScript technically this:
var x = myObject.prop1.prop2;
Is equivalent to this:
var x = window["myObject"]["prop1"]["prop2"]; // assuming global scope
But it is definitely not a standard practice in JavaScript to do this:
var OBJ_NAME = "myObject";
var PROP1_NAME = "prop1";
var PROP2_NAME = "prop2";
var x = window[OBJ_NAME][PROP1_NAME][PROP2_NAME];
That would just be ridiculous.
It still depends, though: if a string is used in numerous places and is cumbersome or ugly to type ("name" vs. "my-custom-property-name-x"), it's probably worth making a constant, even within a single class (at which point it's probably good to be internally consistent within the class and make all the other strings constants too).
Also, if you actually intend for other external users to interact with your library using these constants, then it's also a good idea to define publicly accessible constants and document that users should use those to interact with your library. However, a library which interacts via magic string constants is usually a bad practice and you should consider designing your library in such a way that you don't need to use magic constants to interact with it in the first place.
I think in the specific example you gave, where the strings are relatively simple to type and there are presumably no external users of your API who would expect to work with it using those string values (i.e. they're just for internal data manipulation), readable code is far more valuable than refactorable code, so I would just put the literals directly inline. Again, this is assuming I understand your exact use case specifically.
One thing nobody seemed to notice is that as soon as you define a constant, its scope becomes something to maintain and think about. This actually does have a cost, it's not free like everyone seems to think. Consider this:
Should it be private or public in my class? What if some other namespace/package has a need for the same value, should I now extract the constant to some global static class of constants? What if I now need it in other assemblies/modules, do I extract it further? All these things make the code less and less readable, harder to maintain, less pleasant to work with, and more complicated. All in the name of refactorability?
Usually, these "great refactorings" never occur, and when they do they require a complete rewrite anyway, with all new strings. And if you had been using some shared module before this great refactoring (as in the above paragraph) which didn't have these new strings which you now need, what then? Do you add them to the same shared module of constants (what if you don't have access to the code for this shared module)? Or do you keep them local to you, in which case there are now multiple scattered repositories of string constants, all at different levels, running the risk of duplicated constants all over the code? Once you get to this point (and believe me I've seen it), refactoring becomes moot, because while you'll get all your usages of your constants, you'll miss other people's usages of their constants, even though these constants have the same logical value as your constants and you're actually trying to change all of them.
