XSD: Enumeration duplicates

Recently, I discovered that one of our xs:enumeration types included the same value twice:
<xs:simpleType name="typ-TypeCodeRequest">
<xs:restriction base="xs:string">
<xs:enumeration value="B1"/>
<xs:enumeration value="B2"/>
<xs:enumeration value="B2"/>
<xs:enumeration value="B3"/>
<xs:enumeration value="B4"/>
<xs:enumeration value="B5"/>
</xs:restriction>
</xs:simpleType>
Now an external partner has complained about it, claiming that "that cannot ever work". This confused me somewhat, since my attempts to find out whether duplicate entries in enumerations are allowed, even if pointless, were fruitless.
This was not flagged as wrong by any validation, and it did not cause any problems when the schema was generated into code and used with the Apache CXF framework. Are we handling this issue too laxly, or is the external partner too strict?

Strictly speaking, the gist of your problem is clarifying the context in which that person said it cannot ever work.
In terms of the XSD spec, your fragment is valid, so that person is wrong. Duplicate enumeration values are annoying to read and most likely indicate a bug, such as a typo that leaves out one of the intended values... still, perfectly valid.
The XML Schema spec, in both 1.0 and 1.1 (Part 2: Datatypes, section 4.3.5), places no restriction on the uniqueness of the enumerated values. The only constraint it states is: "It is an ·error· if any member of {value} is not in the ·value space· of {base type definition}."
Interestingly enough, both specs could've placed constraints in the "schema for schema" to ensure uniqueness... but none did.
To expand on this... It is easy to write redundant constraints; a set of enumerated values could also be expressed using a regex pattern. For example:
<?xml version="1.0" encoding="utf-8" ?>
<!-- XML Schema generated by QTAssistant/XSD Module (http://www.paschidev.com) -->
<xsd:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" xmlns="http://tempuri.org/XMLSchema.xsd" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="test" type="test"/>
<xsd:simpleType name="test">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="y"/>
<xsd:enumeration value="n"/>
<xsd:pattern value="y|n"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>
In this case, the pattern adds nothing... yet the XSD spec does not flag it as wrong (even though it is superfluous: a value must satisfy both facets, and the enumeration already restricts it to y or n).
Maybe that person's problem is caused by some program that binds XSD to something else... and that program ends up generating duplicate entries because it assumes that enumerated values are unique (which is a wrong assumption).
If I were you, I would simply fix the XSD, and make sure you're using some XSD static analysis tooling to check that it doesn't happen in your releases (even though it is valid).
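As a rough illustration of the kind of static check meant above, here is a minimal sketch in Python using only the standard library. The function name duplicate_enumerations is made up for this example, and the check compares the literal value attributes only, so it would not notice two different spellings of the same value (for instance "1" and "01" under xs:integer).

```python
import xml.etree.ElementTree as ET
from collections import Counter

XS = "{http://www.w3.org/2001/XMLSchema}"


def duplicate_enumerations(xsd_text):
    """Return {simpleType name: [enumeration values listed more than once]}."""
    root = ET.fromstring(xsd_text)
    report = {}
    # Visit every xs:simpleType (named or anonymous) and its direct restriction.
    for simple_type in root.iter(XS + "simpleType"):
        for restriction in simple_type.findall(XS + "restriction"):
            values = [e.get("value") for e in restriction.findall(XS + "enumeration")]
            dupes = sorted(v for v, count in Counter(values).items() if count > 1)
            if dupes:
                report[simple_type.get("name", "(anonymous)")] = dupes
    return report


# The schema fragment from the question, wrapped in a schema element.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="typ-TypeCodeRequest">
    <xs:restriction base="xs:string">
      <xs:enumeration value="B1"/>
      <xs:enumeration value="B2"/>
      <xs:enumeration value="B2"/>
      <xs:enumeration value="B3"/>
      <xs:enumeration value="B4"/>
      <xs:enumeration value="B5"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

print(duplicate_enumerations(SCHEMA))
```

A check like this could run as part of a release build and fail it whenever the returned report is not empty.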

I would suggest that their complaint is more politically than technically motivated.
However, it is clearly incorrect, and if you did try to convert this enumeration to a type in, say, C#, you'd be unable to do it cleanly.
For example, this won't compile:
enum Color
{
White = 0,
Black = 1,
Orange = 2,
Orange = 3
}
So if it's clearly incorrect why not create a new version of your schema?

Related

Making an xsd schema extensible with a typed element

Given a schema that defines an element of a certain type, is it possible to allow that type to be extended, but still have that extension element be strongly typed? In other words, add some kind of extension point that can be used from an external schema to add elements that can only be used in this location?
Let's say the schema looks kinda like:
<xs:schema …>
<xs:element name="Match" type="tns:TNodeConstraint" />
<xs:complexType name="TNodeConstraint">
<xs:group ref="tns:Expression" />
</xs:complexType>
<xs:group name="Expression">
<xs:choice>
<xs:element name="And">
<xs:complexType … />
</xs:element>
<xs:element name="Or">
<xs:complexType … />
</xs:element>
<xs:element name="IsAbstract" />
<xs:element name="IsExtern" />
<!-- Some kind of extension point? -->
</xs:choice>
</xs:group>
</xs:schema>
Is it possible to extend the Expression group so that a second, external schema could say that I can accept IsMyCustomConstraint here, but not IsMyCustomSortOrder? So this will be valid:
<Match>
<IsAbstract />
<IsExtern />
<IsMyCustomConstraint />
</Match>
But this would be invalid?
<Match>
<IsAbstract />
<IsExtern />
<IsMyCustomSortOrder />
</Match>
I don't want to use xs:any as that would allow putting a "sort order" where a constraint can go.
I can modify the original schema
I'm in control of what the namespaces of IsMyCustomConstraint and IsMyCustomSortOrder would be, and it's not important if they match the original schema or not.
is it possible to allow that type to be extended, but still have that extension element be strongly typed?
Definitely - this is described in detail, with examples, here: https://www.w3.org/TR/2000/WD-xmlschema-0-20000225/#DerivExt. As far as I can tell, you just need to declare one or more complex types that are extensions of your base type 'TNodeConstraint'.
XML schema has a rich set of facilities to support type inheritance including:
abstract base types (base type must be extended or restricted before use)
extension (new type allows more values than the base type)
restriction (new type allows fewer values than the base type)
control of whether further extensions/restrictions are allowed (final/block attributes)
I don't see any need to use a separate XSD for the extensions, although you can if you want to. You may find it useful to know about xsi:type, abstract types and the block/final attributes; all are described in the XML Schema Part 0 - Primer mentioned above.

XSD Class generators: keeping track of element order

I have the following complex type in my XSD schema:
<xs:complexType name="structure" mixed="true">
<xs:choice maxOccurs="unbounded">
<xs:element type="b" name="b" />
<xs:element type="a" name="a" />
</xs:choice>
</xs:complexType>
which allows me to state XML definitions like this:
<structure>
Hello <b>World</b>
Hello 2 <b>World 2</b>
<a>Hello3</a> <b>World3</b>
</structure>
I then tried to generate classes from my schema, using both XSD.exe and XSD2Code. They both generate something like:
class structure {
List<a> a;
List<b> b;
List<string> text;
}
My problem is that I need to keep track of the order in which those elements were defined within the XML content of structure. Referring to the above example, I would like to know that the inner text "Hello" comes right before the first occurrence of the b element.
As this would obviously require a more specialized generator strategy, maybe I'm expecting too much, but: is there any XSD generator that can handle the object order or do I have to write my own classes?
Thank you in advance
I have never seen an XSD-to-code binding tool that does what you need here, certainly not on the .NET platform, which you seem to imply is the target. This is one of those cases where round-tripping the XML is not possible without loss of fidelity (deserialize, serialize, then compare: it fails). Just for completeness, the /order option of xsd.exe wouldn't help, simply because, in terms of the XSD you defined, there is no order really. It is also a limitation of what XSD can describe, which is inevitably reflected in tool implementations.
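If you do end up writing your own reader, the core idea is to walk the children of structure in document order and keep the text runs between them, instead of sorting everything into per-type lists. Below is a minimal sketch of that idea in Python rather than .NET, purely as an illustration; ordered_content is a made-up helper name, and whitespace-only text runs are dropped, which is an assumption about what you want.

```python
import xml.etree.ElementTree as ET


def ordered_content(element):
    """Return the text runs and child elements of `element` in document order."""
    items = []
    # Text that appears before the first child element.
    if element.text and element.text.strip():
        items.append(("text", element.text.strip()))
    for child in element:
        items.append((child.tag, (child.text or "").strip()))
        # Text that follows this child, up to the next child or the end tag.
        if child.tail and child.tail.strip():
            items.append(("text", child.tail.strip()))
    return items


DOC = """<structure>
Hello <b>World</b>
Hello 2 <b>World 2</b>
<a>Hello3</a> <b>World3</b>
</structure>"""

for kind, value in ordered_content(ET.fromstring(DOC)):
    print(kind, value)
```

The resulting list keeps "Hello" immediately before the first b entry, which is exactly the information the generated classes throw away.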

Troubles converting XSD to Java using JAXB

I'm trying to convert an XSD I have no control over to Java classes using JAXB. The errors I'm getting are:
[ERROR] cvc-pattern-valid: Value 'true' is not facet-valid with respect to pattern '0|1' for type 'BooleanType'.
line 139 of http://neon/meaweb/schema/common/meta/MXMeta.xsd
[ERROR] a-props-correct.2: Invalid value constraint value '1' in attribute 'mxencrypted'.
line 139 of http://neon/meaweb/schema/common/meta/MXMeta.xsd
The code in the XSD that contains the error is in:
<xsd:complexType name="MXCryptoType">
<xsd:simpleContent>
<xsd:extension base="xsd:base64Binary">
<xsd:attribute name="changed" type="ChangeIndicatorType" use="optional" />
<xsd:attribute name="mxencrypted" type="BooleanType" use="optional" default="1" />
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
Specifically it's the attribute mxencrypted using the BooleanType. BooleanType is defined as
<xsd:simpleType name="BooleanType">
<xsd:restriction base="xsd:boolean">
<xsd:pattern value="0|1" />
</xsd:restriction>
</xsd:simpleType>
From searching around this seems to be a somewhat common case. From what I can tell the default in the mxencrypted line shouldn't be a 1? When I load the XSD into Liquid XML, the schema doesn't report errors. Validating the XSD here (http://www.utilities-online.info/xsdvalidation/#.UV3zkL_EW0s) reports the same errors as JAXB.
Is there a way to tell JAXB to ignore this problem and just generate the class ignoring the default?
Your question is similar to this one (and I've just updated it with relevant information). I am not aware of a way to tell JAXB to ignore it, since this error happens in the XSD schema processor (before JAXB's xjc starts to do its work actually).
The only way may be to filter out the default attributes; however, in this case it is obvious that the XSD designer intended to have a default value of true, which would not be the case with your generated code (Java defaults to false).
This could cause unwanted trouble; my recommendation would be to work with the XSD provider to get it fixed.
As a sidebar, I personally consider the use of defaults in XSDs an interoperability monster: any XML processor that does not rely on the XSD will behave differently from one that does.
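For what it's worth, the "filter out the default attributes" workaround could be done as a pre-processing step on a local copy of the schema before handing it to xjc. The following is only a sketch in Python: strip_attribute_defaults is a hypothetical helper, it removes every default on every attribute declaration indiscriminately, and, as noted above, the generated classes then no longer know the intended default, so it would have to be re-applied by hand.

```python
from xml.dom import minidom

XSD_NS = "http://www.w3.org/2001/XMLSchema"


def strip_attribute_defaults(xsd_text):
    """Return (patched schema text, [(attribute name, removed default), ...])."""
    doc = minidom.parseString(xsd_text)
    removed = []
    # Every attribute declaration in the XSD namespace, whatever prefix is used.
    for decl in doc.getElementsByTagNameNS(XSD_NS, "attribute"):
        if decl.hasAttribute("default"):
            removed.append((decl.getAttribute("name"), decl.getAttribute("default")))
            decl.removeAttribute("default")
    return doc.toxml(), removed


# A cut-down version of the fragment from the question.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="MXCryptoType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:base64Binary">
        <xsd:attribute name="mxencrypted" type="BooleanType" use="optional" default="1"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:schema>"""

patched, removed = strip_attribute_defaults(SCHEMA)
print(removed)
```

Keeping the list of removed defaults makes it possible to log what was dropped, so the difference from the provider's schema stays visible.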

Is it possible for XML to have valid schema but no XML document?

I am wondering: are there schemas that are themselves valid but for which no valid XML document exists?
If there are, could you please give me some examples?
Yes, it's possible to define schemas for which the set of valid documents is the empty set -- at least, in every schema language I know, and given a reasonable definition of "valid document".
In XSD, RNG, and DTDs, perhaps the simplest such schema is one which declares no elements. In XSD, this could be expressed this way:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
A simple non-vacuous schema can be unsatisfiable by declaring elements with unsatisfiable types:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="unsatisfiable">
<xs:complexType>
<xs:choice/>
</xs:complexType>
</xs:element>
</xs:schema>
Since xs:choice requires that at least one child of the choice be matched by the input, a choice with no children is unsatisfiable. And if the choice is required, as it is here, then the type as a whole is unsatisfiable.
Empty choices can also be used in Relax NG, though not in DTDs.
In Relax NG, it's also possible to declare satisfiable elements in a schema with no valid instances, as long as the root element or at least one required descendant of the root element is unsatisfiable. In XSD, by contrast, once you have any satisfiable element declarations or type definitions you no longer have an empty language: XSD provides no way to say, in the schema, what the outermost element must be at validation time.
In XSD, RNG, and DTDs it is also possible to make an element unsatisfiable by requiring that it contain undeclared elements. In DTD notation:
<!ELEMENT unsatisfiable (undeclared) >
Also, in any of these languages, it's possible to define schemas which are satisfiable only by infinite documents:
<!ELEMENT e (e) >
In XSD (and in Relax NG using XSD datatypes) it's possible to define empty simple types, too:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="unsatisfiable">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minExclusive value="0"/>
<xs:maxExclusive value="1"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:schema>
Some methods of defining empty types are forbidden by XSD: setting the minimum and maximum exclusive values to the same value, for example, will raise an XSD error. (The rationale here is that the majority in the WG thought that an empty type made no sense and also suffered from the illusion that they could effectively prevent the definition of empty types, at least in cases not involving regular-expression patterns. As the example shows, they were wrong.) In XSD 1.1, perhaps the cleanest and most obvious way to define a non-satisfiable simple type is to define an empty union type, but an even simpler way is to use the predefined xs:error type (which itself is defined as an empty union). This is not possible in XSD 1.0, which requires that unions have at least one member type.

Creating a valid XSD that is open using <all> and <any> elements

I need to specify an XSD for validating XML documents. The XSD will be used to generate Java bindings with JAXB.
My problem is specifying optional elements which I do not know the names of and which I in general am not interested in parsing.
The structure of the XML documents is like:
<TRADE>
<TIME>12:12</TIME>
<MJELLO>12345</MJELLO>
<OPTIONAL>12:12</OPTIONAL>
<DATE>25-10-2011</DATE>
<HELLO>hello should be ignored</HELLO>
</TRADE>
The important thing is, that:
I cannot assume any order, and the next XML document instance might have tags in a different order
I am only interested in parsing some of the tags, some are mandatory and some are optional
The XML documents can be extended with new elements which I am not interested in parsing
The structure of my XSD is like (not a valid xsd):
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- *********************************************** -->
<!-- Trade element definitions for the XML Documents -->
<!-- *********************************************** -->
<xs:complexType name="Trade">
<!-- Using the all construction ensures that the order does not matter -->
<xs:all>
<xs:element name="DATE" type="xs:string" minOccurs="1" maxOccurs="1" />
<xs:element name="TIME" type="xs:string" minOccurs="1" maxOccurs="1" />
<xs:element name="OPTIONAL" type="xs:string" minOccurs="0" maxOccurs="1" />
<xs:any minOccurs="0"/>
</xs:all>
</xs:complexType>
<!-- TRADE is the mandatory top-level tag -->
<xs:element name="TRADE" type="Trade"/>
</xs:schema>
So, in this example: DATE and TIME are mandatory (they must be in the XML exactly once), OPTIONAL might be present once and then I would like to specify, that all other tags are allowed. The order does not matter.
How do I specify a valid XSD for this?
This is a classic parser problem.
Basically, your BNF is:
Trade = whatever whatever*
whatever = "DATE" | "TIME" | anything
anything = a-z a-z*
But this is ambiguous. The string "DATE" can be accepted under the whatever rule both as "DATE" and as anything.
So if you have
<TRADE>
<TIME>12:12</TIME>
<DATE>25-10-2011</DATE>
<DATE>25-12-2011</DATE>
</TRADE>
it is unclear whether that should be accepted or not.
It could be interpreted as any one of
"TIME", "DATE", anything
anything, anything, "DATE"
anything, anything, anything
"TIME", "DATE", anything
"TIME", "DATE", "DATE"
etc.
It all boils down to: if you have a wildcard combined with elements in arbitrary order, you cannot meaningfully decide which token matches which rule.
It especially does not make sense to have optional elements together with a wildcard.
You have two options:
use xs:sequence instead of xs:all
do not use wildcard
As I understand it, both options are in conflict with your wishes.
Perhaps you can construct a wildcard that matches everything except DATE, TIME etc.
Is it a hard requirement to have JAXB bindings to your "known" elements?
If not, you can basically have just <any maxOccurs="unbounded" processContents="skip"/> as your xsd, and then pick out the elements you are interested in from the DOM tree.
(See here how to use JAXB without data binding.)
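To make the "pick out the elements you are interested in" approach concrete, here is a minimal sketch in Python rather than Java, for illustration only. The names read_trade, REQUIRED and OPTIONAL are invented for this example; the cardinality rules (DATE and TIME exactly once, OPTIONAL at most once, everything else ignored) are taken from the question.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("DATE", "TIME")   # must occur exactly once
OPTIONAL = ("OPTIONAL",)      # may occur at most once


def read_trade(xml_text):
    """Extract the known children of TRADE in any order; ignore unknown ones."""
    root = ET.fromstring(xml_text)
    if root.tag != "TRADE":
        raise ValueError("expected a TRADE root element")
    trade = {}
    for name in REQUIRED + OPTIONAL:
        matches = root.findall(name)
        if len(matches) > 1:
            raise ValueError("%s occurs %d times" % (name, len(matches)))
        if not matches and name in REQUIRED:
            raise ValueError("missing mandatory element %s" % name)
        if matches:
            trade[name] = matches[0].text
    return trade


DOC = """<TRADE>
  <TIME>12:12</TIME>
  <MJELLO>12345</MJELLO>
  <OPTIONAL>12:12</OPTIONAL>
  <DATE>25-10-2011</DATE>
  <HELLO>hello should be ignored</HELLO>
</TRADE>"""

print(read_trade(DOC))
```

Because the known names are looked up explicitly, a second DATE is rejected rather than silently treated as "anything", which sidesteps the ambiguity described above.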
