xs:assert always fails with Xerces validation, but works in XMLSpy

I'm setting up a schema for our XML input/output and have run into an issue: XMLSpy validates the document, but Xerces fails on one of the xs:asserts.
I'm using the latest Xerces, xerces-2_12_0-xml-schema-1.1.
I have included all the .jar files from that distribution (except xercesSamples.jar).
The test code is:
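import java.io.File;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.*;

// Request the XSD 1.1 schema factory provided by the Xerces xml-schema-1.1 distribution
SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/XML/XMLSchema/v1.1");
factory.setFeature("http://apache.org/xml/features/validation/cta-full-xpath-checking", true);
Schema schema = factory.newSchema(new File("C:/Imports/Test.xsd"));
Validator validator = schema.newValidator();
validator.validate(new StreamSource("C:/Imports/Test.xml"));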
I've trimmed the xsd file down to this:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:lit="http://www.w3schools.com" xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" targetNamespace="http://www.w3schools.com" elementFormDefault="qualified" attributeFormDefault="unqualified" vc:minVersion="1.1">
<xs:element name="MetrixXML">
<xs:complexType>
<xs:all>
<xs:element ref="lit:Page" minOccurs="1" maxOccurs="unbounded"/>
</xs:all>
<xs:attribute name="SchemaVersion" type="xs:float" use="required"/>
</xs:complexType>
</xs:element>
<xs:element name="Page">
<xs:complexType>
<xs:attribute name="ContentPositionRule" type="xs:string"/>
<xs:attribute name="FilePageNum" type="xs:nonNegativeInteger"/>
<xs:assert test="(//@SchemaVersion ge 2.1) or ((//@SchemaVersion lt 2.1) and not (@ContentPositionRule))"/>
</xs:complexType>
</xs:element>
</xs:schema>
The xml is:
<?xml version="1.0" encoding="UTF-8"?>
<MetrixXML xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com Test.xsd" SchemaVersion="2.1" >
<Page FilePageNum="1"/>
<Page ContentPositionRule="CenterEachPage"/>
</MetrixXML>
The error I get is:
org.xml.sax.SAXParseException: cvc-assertion: Assertion evaluation ('(//@SchemaVersion ge 2.1) or ((//@SchemaVersion lt 2.1) and not (@ContentPositionRule))') for element 'Page' on schema type '#AnonType_Page' did not succeed.
In XMLSpy, if I set SchemaVersion to 2.0, the assert fails. If I set it to 2.1, the assert succeeds.
Is there some Feature flag that I need to set?
Update:
Apparently XMLSpy is allowing things it shouldn't allow.
So, the desired test is that
if (SchemaVersion < 2.1) AND any element contains a "ContentPositionRule" attribute THEN it should fail.

Move the assertion up a level in the hierarchy and ensure that it references only descendants of the associated element:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:lit="http://www.w3schools.com"
xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning"
targetNamespace="http://www.w3schools.com"
elementFormDefault="qualified"
attributeFormDefault="unqualified" vc:minVersion="1.1">
<xs:element name="MetrixXML">
<xs:complexType>
<xs:all>
<xs:element ref="lit:Page" minOccurs="1" maxOccurs="unbounded"/>
</xs:all>
<xs:attribute name="SchemaVersion" type="xs:float" use="required"/>
<xs:assert test="(@SchemaVersion ge 2.1) or
((@SchemaVersion lt 2.1) and
not(lit:Page/@ContentPositionRule))"/>
</xs:complexType>
</xs:element>
<xs:element name="Page">
<xs:complexType>
<xs:attribute name="ContentPositionRule" type="xs:string"/>
<xs:attribute name="FilePageNum" type="xs:nonNegativeInteger"/>
</xs:complexType>
</xs:element>
</xs:schema>
An assertion is only allowed to reference the element on which it appears and that element's descendants, not its ancestors, siblings, etc.
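To illustrate, here is a hypothetical instance document (not taken from the original question) that the relocated assertion should reject, assuming the corrected schema above is the one being used. SchemaVersion is below 2.1, yet one Page carries a ContentPositionRule attribute:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical test instance: SchemaVersion is 2.0, so no Page may carry ContentPositionRule -->
<MetrixXML xmlns="http://www.w3schools.com" SchemaVersion="2.0">
  <Page FilePageNum="1"/>
  <!-- This attribute makes lit:Page/@ContentPositionRule non-empty,
       so the assertion evaluated on MetrixXML is false -->
  <Page ContentPositionRule="CenterEachPage"/>
</MetrixXML>
```

Changing SchemaVersion to 2.1, or removing the ContentPositionRule attribute, should make the assertion true again. Both parts of the test now look only at MetrixXML's own attribute and its Page children, which is exactly what the data model instance built for the assertion contains.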
See also:
How to access parent element in XSD assertion XPath?
XMLSpy's observed behavior
Although it's technically (albeit unhelpfully) conformant to provide no diagnostic assistance for assertions over siblings or ancestors of the element on which an assertion appears, XMLSpy should not be reporting differing validation results depending upon sibling or ancestor state.
W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures
Validation Rule: Assertion Satisfied
[...]
1.3 From the "partial" ·post-schema-validation infoset·, a data model instance is constructed as described in [XDM]. The root node of the
[XDM] instance is constructed from E; the data model instance contains
only that node and nodes constructed from the [attributes],
[children], and descendants of E.
Note: It is a consequence of this construction that attempts to refer, in an assertion, to the siblings or ancestors of E, or to any
part of the input document outside of E itself, will be unsuccessful.
Such attempted references are not in themselves errors, but the data
model instance used to evaluate them does not include any
representation of any parts of the document outside of E, so they
cannot be referred to.
[Emphasis added.]

Related

Use of the schema-element node test in XSD 1.1's assert test

I am trying to design an XML schema where a certain element may alternatively hold either a single element belonging to a substitution group or a collection of certain elements which I want to be free-order (like in "all").
Due to the limitations on the "all" type of groups I cannot nest it into a "choice", so I tried a design which is similar to the following:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" elementFormDefault="qualified" attributeFormDefault="unqualified" vc:minVersion="1.1">
<xs:element name="X" abstract="true"/>
<xs:element name="X1" substitutionGroup="X"/>
<xs:element name="X2" substitutionGroup="X"/>
<xs:element name="Y">
<xs:complexType>
<xs:all>
<xs:element ref="X" minOccurs="0"/>
<xs:element name="Z1" minOccurs="0" type="xs:string"/>
<xs:element name="Z2" minOccurs="0" type="xs:string"/>
<xs:element name="Z3" minOccurs="0" type="xs:string"/>
</xs:all>
<xs:assert test="not(schema-element(X)) or not(Z1 or Z2 or Z3)"/>
</xs:complexType>
</xs:element>
</xs:schema>
When the schema file is validated, I get the following error:
File C:\whereabouts\xsd-assertion-problem.xsd is not valid.
Assertion 'not(schema-element(X)) or not(Z)' is no valid XPath 2.0 expression.
Error location: xs:schema / xs:element / xs:complexType / xs:assert
Details
XPST0008: Element name not found in static context's in-scope element declarations
as-props-correct.2: Assertion 'not(schema-element(X)) or not(Z)' is no valid XPath 2.0 expression.
The question is: what is wrong here, and how do I fix it? When I read the description of the XPath error condition (https://www.w3.org/TR/xpath-31/#ERRXPST0008), it explicitly excludes "an ElementName in an ElementTest", which should be the case here, so static analysis should not fail. Or am I wrong?
Note that the substitution group for X is open for extension, and finding all locations where references to X are made may be difficult; that's why I strongly prefer a schema-element-based test.
On the other hand, while writing Z1 or Z2 or Z3 is also cumbersome, these elements are local, so this solution is more or less acceptable. Of course, if there are better ideas, they are welcome!
Just in case, I rely on the Altova engine.
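One possible workaround, offered as an untested sketch rather than a confirmed fix for the Altova error: the content model of Y admits only members of X's substitution group plus Z1, Z2 and Z3, so any child that is not one of the three local elements must be a substitute for X. Under that assumption the assertion can avoid schema-element altogether while still not enumerating the substitution group:

```xml
<xs:element name="Y" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType>
    <xs:all>
      <xs:element ref="X" minOccurs="0"/>
      <xs:element name="Z1" minOccurs="0" type="xs:string"/>
      <xs:element name="Z2" minOccurs="0" type="xs:string"/>
      <xs:element name="Z3" minOccurs="0" type="xs:string"/>
    </xs:all>
    <!-- Sketch: "a child other than Z1/Z2/Z3" stands in for "a member of X's substitution group".
         Either no such child exists, or none of Z1, Z2, Z3 is present. -->
    <xs:assert test="not(*[not(self::Z1 or self::Z2 or self::Z3)]) or not(Z1 or Z2 or Z3)"/>
  </xs:complexType>
</xs:element>
```

This still lists the local Z elements, which the question already accepts as tolerable, but new members added to the substitution group would not require touching the assertion.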

How can I define an XML schema element that allows either base64 content or an xop:Include element?

I have an XML schema that defines an element that may be either base64 text or an xop:Include element. Currently, this is defined as a base64Binary type:
<xs:element name="PackageBinary" type="xs:base64Binary" minOccurs="1" maxOccurs="1"/>
When I insert the xop:Include element instead, it looks like this:
<PackageBinary>
<xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" href="http://google.com/data.bin" />
</PackageBinary>
But this gives an XML validation error (I'm using .NET validator):
The element 'mds:xml-schema:soap11:PackageBinary' cannot contain child
element 'http://www.w3.org/2004/08/xop/include:Include' because the
parent element's content model is text only.
This makes sense because it's not base64 content, but I thought this was common practice...? Is there any way to support this in the schema? (We have an existing product that supports this syntax, but we are adding validation now.)
The best I could come up with was to create a complex type that allowed any tags but was also tagged as "mixed" so it allowed text. This doesn't explicitly declare the content as base64, but it does let it pass validation.
<xs:complexType name="PackageBinaryInner" mixed="true">
<xs:sequence>
<xs:any minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
<xs:element name="PackageBinary" type="PackageBinaryInner" minOccurs="1" maxOccurs="1"/>
The solution I've found is like this:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://example.org"
elementFormDefault="qualified"
xmlns="http://example.org"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xop="http://www.w3.org/2004/08/xop/include">
<xs:import namespace="http://www.w3.org/2004/08/xop/include"
schemaLocation="http://www.w3.org/2004/08/xop/include"/>
<xs:complexType name="PackageBinary" mixed="true">
<xs:all>
<xs:element ref="xop:Include"/>
</xs:all>
</xs:complexType>
</xs:schema>
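For completeness, a fuller sketch of the same idea follows. Two things in it are my assumptions rather than part of the answer above: the minOccurs="0" on xop:Include (so that plain text content without an Include element also validates) and the PackageBinary element declaration that uses the type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://example.org"
           elementFormDefault="qualified"
           xmlns="http://example.org"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xop="http://www.w3.org/2004/08/xop/include">
  <xs:import namespace="http://www.w3.org/2004/08/xop/include"
             schemaLocation="http://www.w3.org/2004/08/xop/include"/>
  <!-- mixed="true" permits character data around the optional child element -->
  <xs:complexType name="PackageBinary" mixed="true">
    <xs:all>
      <!-- Assumption: optional, so text-only content remains valid -->
      <xs:element ref="xop:Include" minOccurs="0"/>
    </xs:all>
  </xs:complexType>
  <!-- Hypothetical element declaration; type and element names live in separate symbol spaces -->
  <xs:element name="PackageBinary" type="PackageBinary"/>
</xs:schema>
```

The trade-off is the same one noted in the question: a mixed type cannot check that the text is valid base64, and it also accepts text and an xop:Include together.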
I saw this in an xml document that appeared to allow validation - basically the attribute xmlns:xop="..." did the trick:
<SomeElement xmlns:xop="http://www.w3.org/2004/08/xop/include/" id="465390" type="html">
<SomeElementSummaryURL>https://file.someurl.com/SomeImage.html</SomeElementSummaryURL>
<xop:Include href="cid:1111111@someurl.com"/>
</SomeElement>

Why does the validation of keyref depend on the ordering of the key element?

My document contains A elements with IDs and B Elements which reference the As, like this:
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:\\\refissue.xsd">
<A id="x"/>
<A id="y"/>
<B><Aref idref="x" /></B>
</root>
When I validate against my simple schema (see below) I get the following error:
cvc-identity-constraint.4.3: Key 'ref' with value 'x' not found for identity constraint of element 'root'.
If I change the ordering of the A element to
<A id="y"/>
<A id="x"/>
the document validates without any errors.
Why does the validation result depend on the ordering of the elements?
Is this a bug in the validator or in my schema?
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="A">
<xs:complexType>
<xs:attribute name="id" type="xs:ID" />
</xs:complexType>
<xs:key name="A.KEY">
<xs:selector xpath="." />
<xs:field xpath="@id" />
</xs:key>
</xs:element>
<xs:element maxOccurs="unbounded" name="B">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="0" maxOccurs="1" name="Aref">
<xs:complexType>
<xs:attribute name="idref" type="xs:IDREF" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:keyref name="ref" refer="A.KEY">
<xs:selector xpath="B/Aref" />
<xs:field xpath="@idref" />
</xs:keyref>
</xs:element>
</xs:schema>
I tried the validation with Eclipse (which uses xerces, I think), xerces-c 3.1.1, xmlstarlet 1.5.0 and libxml2 2.7.8 and I get the error only with eclipse and xerces.
You're right, validity against an identity constraint should not depend on the order of elements in the input.
Here I think the problem is that the schema is not quite right, and Xerces is having trouble generating a useful diagnosis of the problem. (The fact that libxml doesn't report an error is just a consequence of its incomplete coverage of XSD.)
Your key constraint should be defined on the scope of the element within which the key values need to be unique -- so on the root element, not on the A element. (As defined, your A.KEY constraint requires that the string value of each A element be unique within that A element, which will always be the case. The fact that the id attribute is declared as being of type xs:ID does require uniqueness, of course. And similarly, the fact that the Aref idref attribute is declared as being of type xs:IDREF means that your key and keyref declarations are not actually doing much work here that's not already being done by ID and IDREF.)
Once you move the declaration of A.KEY to the declaration of the root element, Xerces and Saxon agree that the schema is OK and the document is valid.
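A sketch of what that move might look like, based on the schema in the question. The selector xpath="A" is my reading of the advice above, not something the answer spells out:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" name="A">
          <xs:complexType>
            <xs:attribute name="id" type="xs:ID"/>
          </xs:complexType>
        </xs:element>
        <xs:element maxOccurs="unbounded" name="B">
          <xs:complexType>
            <xs:sequence>
              <xs:element minOccurs="0" maxOccurs="1" name="Aref">
                <xs:complexType>
                  <xs:attribute name="idref" type="xs:IDREF"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- Key is now scoped to root: it selects every A child and keys on its id attribute -->
    <xs:key name="A.KEY">
      <xs:selector xpath="A"/>
      <xs:field xpath="@id"/>
    </xs:key>
    <!-- Keyref is declared in the same scope, so it can see all A.KEY values -->
    <xs:keyref name="ref" refer="A.KEY">
      <xs:selector xpath="B/Aref"/>
      <xs:field xpath="@idref"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

With the key and keyref both declared on root, the set of key values covers every A element regardless of document order.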
I had a similar problem in Eclipse until the xs:key and the xs:keyref were both explicitly set to the same type. In my case I set both to xs:string (I was also using xs:unique and a keyref referencing the unique, but it seems to work the same way for key and keyref pairs).
So for example if the key is based on an element that looks like this:
<xs:complexType name="elementTypeWithKey">
<xs:attribute name="theKey" type="xs:string"/>
</xs:complexType>
and the theKey attribute is explicitly xs:string, make sure that the attribute used as a keyRef is also explicitly xs:string:
<xs:complexType name="elementTypeWithKeyRef">
<xs:attribute name="theKeyRef" type="xs:string"/>
</xs:complexType>

Is mixed inherited when a complexType is extended?

I have the following in a schema:
<xs:element name="td">
<xs:complexType>
<xs:complexContent>
<xs:extension base="cell.type"/>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:complexType name="cell.type" mixed="true">
<xs:sequence minOccurs="0" maxOccurs="unbounded">
<xs:element ref="p"/>
</xs:sequence>
</xs:complexType>
Some parsers allow PCDATA directly in the element, while others don't. There's something in the XSD recommendation (3.4.2) that says when a complex type has complex content, and neither has a mixed attribute, the effective mixed is false. That means the only way mixed content could be in effect is if the extension of cell.type causes the mixed="true" to be inherited.
Could someone more familiar with schemas comment on the correct interpretation?
(BTW: if I had control of the schema I would move the mixed="true" to the element definition, but that's not my call.)
Anyone reading my question might want to read this thread also (by Damien). It seems my answer isn't entirely right: parsers/validators don't handle mixed attribute declarations on base/derived elements the same way.
Concerning extended complex types, sub-section 1.4.3.2.2.1 of section 3.4.6 in part 1 of W3C's XML Schema specification says that
Both [derived and base] {content type}s must be mixed or both must be element-only.
So yes, it is inherited (or rather, you cannot override it, which amounts to the same thing in the end).
Basically, what you've described is the desired (and, as far as I'm concerned, the most logical) behavior.
I've created a simple schema to run a little test with Eclipse's XML tools.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="c">
<xs:complexType>
<xs:complexContent mixed="false">
<xs:extension base="a"/>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:complexType name="a" mixed="true">
<xs:sequence minOccurs="0" maxOccurs="unbounded">
<xs:element name="b"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
The above schema is valid, in the sense that neither Eclipse's nor W3C's "official" XML Schema validator notices any issues with it.
The following XML passes validation against the aforementioned schema.
<?xml version="1.0" encoding="UTF-8"?>
<c xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="test.xsd">
x
<b/>
y
</c>
So basically you cannot overwrite the mixedness of a complex base type. To support this statement further, try swapping the base and derived types' mixedness. In that case the XML fails to validate, because the derived type won't be mixed as it (yet again) cannot overwrite the base's mixedness.
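The swapped variant described above would look roughly like this. It is meant as an experiment to run against your own validator, not a guaranteed outcome: given the rule quoted earlier (both content types must be mixed or both element-only), a validator might reject the text in the instance, or it might object to the schema itself.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="c">
    <xs:complexType>
      <!-- Swapped: the derived type now asks for mixed content -->
      <xs:complexContent mixed="true">
        <xs:extension base="a"/>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
  <!-- Swapped: the base type is element-only (no mixed attribute, so it defaults to false) -->
  <xs:complexType name="a">
    <xs:sequence minOccurs="0" maxOccurs="unbounded">
      <xs:element name="b"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Validating the same test document (text "x" and "y" around the b element) against this schema shows how a given tool resolves the conflict.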
You've also said that
Some parsers allow PCDATA directly in the element, while others don't
It couldn't hurt to clarify which parsers you are talking about. A good parser shouldn't fail when it encounters mixed content. A validating parser, given the proper schema, will fail if it encounters mixed content when the schema does not allow it.

What is wrong with extending an XML Schema type using xs:all?

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns="http://tempuri.org/ServiceDescription.xsd" xmlns:mstns="http://tempuri.org/ServiceDescription.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://tempuri.org/ServiceDescription.xsd" elementFormDefault="qualified" id="ServiceDescription">
<xs:element name="Template">
<xs:complexType>
<xs:complexContent>
<xs:extension base="ServiceType">
<xs:all>
<xs:element name="TemplateCode" type="xs:string"/>
</xs:all>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:complexType name="ServiceType">
<xs:all>
<xs:element name="ServiceCode" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:schema>
When I try to save this in XMLSpy it tells me
An 'all' model group is neither allowed in complex type definition 'mstns:ServiceType' nor in its extension '{anonymous}'.
Clicking Details gives a link to a paragraph in XML Schema specification which I do not understand.
Added: Ah, yes, forgot to mention - the line of error is this one:
<xs:element name="TemplateCode" type="xs:string"/>
The problem is that you can't use all when you're extending another type. As far as XML Schema knows, the parent type might have a sequence model, and since XML Schema forbids putting an all group inside a sequence group (that would destroy the sequence group's ordering), it also forbids putting an all group in an extension of a complex type. (This is the XSD 1.0 rule; XSD 1.1 relaxes it for an all group that extends another all group.) You could use sequence instead of all for both, though, and you'd be fine.
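Applied to the schema from the question, the sequence-based version might look like the sketch below. Only the two model groups are changed; everything else is copied from the question.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns="http://tempuri.org/ServiceDescription.xsd"
           xmlns:mstns="http://tempuri.org/ServiceDescription.xsd"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://tempuri.org/ServiceDescription.xsd"
           elementFormDefault="qualified" id="ServiceDescription">
  <xs:element name="Template">
    <xs:complexType>
      <xs:complexContent>
        <xs:extension base="ServiceType">
          <!-- sequence instead of all: the extension's content is appended after the base content -->
          <xs:sequence>
            <xs:element name="TemplateCode" type="xs:string"/>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="ServiceType">
    <!-- sequence instead of all in the base type as well -->
    <xs:sequence>
      <xs:element name="ServiceCode" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

The cost of this design is a fixed order: in an instance, ServiceCode must come before TemplateCode, because an extension's content model is the base content followed by the added content.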

Resources