Multiple references to same xsd element/group - xsd

I have to describe a choice between multiple region types which all contain "coordinates". Unfortunately, it seems that multiple xsd elements with the same name are not allowed; it does not matter whether they are defined multiple times or just referenced multiple times.
<xs:group name="Region">
<xs:choice>
<xs:group ref="tns:CircularRegion" />
<xs:group ref="tns:RectangularRegion" />
<xs:group ref="tns:PolygonalRegion" />
</xs:choice>
</xs:group>
With the referenced groups:
<xs:group name="Coordinates">
<xs:sequence>
<xs:element name="Latitude" type="xs:integer" />
<xs:element name="Longitude" type="xs:integer" />
</xs:sequence>
</xs:group>
<xs:group name="CircularRegion">
<xs:sequence>
<xs:group ref="tns:Coordinates" />
<xs:element name="Radius" type="xs:integer" />
</xs:sequence>
</xs:group>
<xs:group name="RectangularRegion">
<xs:sequence>
<xs:group ref="tns:Coordinates" />
<xs:group ref="tns:Coordinates" />
</xs:sequence>
</xs:group>
<xs:group name="PolygonalRegion">
<xs:sequence>
<xs:group minOccurs="3" maxOccurs="12" ref="tns:Coordinates" />
</xs:sequence>
</xs:group>
As "Latitude" and "Longitude" are referenced multiple times, validation fails with an error (multiple definitions...).
Any idea how to solve this?
EDIT The error message (German) from "Liquid XML Studio 2012" validator:
Error Mehrere Definitionen des Elements 'Psid' verursachen ein
mehrdeutiges Inhaltsmodell. Ein Inhaltsmodell muss so gebildet werden,
dass während der Validierung einer Elementinformationssequenz das
darin direkt, indirekt oder implizit enthaltene Partikel, mit dem
versucht wird, jedes Element in der Sequenz zu validieren, wiederum
eindeutig bestimmt werden kann, ohne den Inhalt oder die Attribute
dieses Elements zu untersuchen und ohne dass beliebige Informationen
zu den Elementen im Rest der Sequenz benötigt werden.
In English (translation):
Error: Multiple definitions of element 'Psid' cause an ambiguous content
model. A content model must be formed such that, during validation of
an element information item sequence, the particle contained directly,
indirectly or implicitly therein with which to attempt to validate each
item in the sequence can in turn be uniquely determined without
examining the content or attributes of that item, and without any
information about the items in the remainder of the sequence.

The problem isn't multiple references to the Coordinates group - the problem is a violation of the Unique Particle Attribution rule (described as deterministic in the XML spec; the description there is easier to understand).
This is because you have a choice between CircularRegion and RectangularRegion, but both begin with the same <Latitude> element (from Coordinates).
If you imagine trying to parse an XML document that has a <Latitude> element in it, the parser can't tell if it's from a CircularRegion group or a RectangularRegion group just by looking at that element. (It could if it looked further ahead in the XML, but that's not allowed by the UPA rule.) It's a specific kind of ambiguity: more than one particle (part of the schema) can be attributed to that element, so it's not a unique particle attribution.
The clearest solution to this is to wrap each of your choices in a unique element (e.g. <CircularRegion>, <RectangularRegion> and <PolygonalRegion>), by using complexTypes instead of groups.
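A minimal sketch of that wrapper-element approach might look as follows. The Coordinates group is taken from the question; the target namespace `urn:example:regions` and the `...Type` names are assumptions made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="urn:example:regions"
           targetNamespace="urn:example:regions"
           elementFormDefault="qualified">

  <!-- Unchanged from the question -->
  <xs:group name="Coordinates">
    <xs:sequence>
      <xs:element name="Latitude" type="xs:integer"/>
      <xs:element name="Longitude" type="xs:integer"/>
    </xs:sequence>
  </xs:group>

  <!-- Each alternative now starts with a distinct element name,
       so the choice can be decided from the first element seen -->
  <xs:group name="Region">
    <xs:choice>
      <xs:element name="CircularRegion" type="tns:CircularRegionType"/>
      <xs:element name="RectangularRegion" type="tns:RectangularRegionType"/>
      <xs:element name="PolygonalRegion" type="tns:PolygonalRegionType"/>
    </xs:choice>
  </xs:group>

  <!-- Hypothetical type names; the content models mirror the original groups -->
  <xs:complexType name="CircularRegionType">
    <xs:sequence>
      <xs:group ref="tns:Coordinates"/>
      <xs:element name="Radius" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="RectangularRegionType">
    <xs:sequence>
      <xs:group ref="tns:Coordinates"/>
      <xs:group ref="tns:Coordinates"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="PolygonalRegionType">
    <xs:sequence>
      <xs:group ref="tns:Coordinates" minOccurs="3" maxOccurs="12"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

The trade-off is that the instance documents change: every region is wrapped in an extra element naming its kind.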
However, I get the impression that you want the XML that your XSD describes (or would describe if it were allowed). A simple way to do that is to factor out the common prefix, e.g.
<xs:group name="Region">
<xs:sequence>
<xs:group ref="tns:Coordinates"/> <!-- common prefix -->
<xs:choice>
<xs:element name="Radius" type="xs:integer" /> <!-- Circular -->
<xs:group minOccurs="1" maxOccurs="11" ref="tns:Coordinates" />
<!-- Rect and Poly -->
</xs:choice>
</xs:sequence>
</xs:group>
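To make the factored model concrete, here is a sketch of instance content it should accept. The `Regions` and `Region` wrapper elements are hypothetical (the question does not show where the Region group is used), namespace declarations are omitted, and the values are invented.

```xml
<Regions>
  <!-- Circular: one coordinate pair, then Radius -->
  <Region>
    <Latitude>52</Latitude>
    <Longitude>13</Longitude>
    <Radius>5</Radius>
  </Region>

  <!-- Rectangular: the common pair plus one more pair -->
  <Region>
    <Latitude>52</Latitude>
    <Longitude>13</Longitude>
    <Latitude>53</Latitude>
    <Longitude>14</Longitude>
  </Region>

  <!-- Polygonal: the common pair plus 2 to 11 more pairs -->
  <Region>
    <Latitude>52</Latitude>
    <Longitude>13</Longitude>
    <Latitude>53</Latitude>
    <Longitude>14</Longitude>
    <Latitude>54</Latitude>
    <Longitude>13</Longitude>
  </Region>
</Regions>
```

After the first pair, the next element is either Radius or Latitude, so the validator never has to look ahead to pick a branch.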
BTW: I tested your original XSD with my XSD parser (xmllint), and it worked fine, parsing XML matching each choice. It did not flag the UPA problem, which is odd. Despite this evidence, I'm positive it does violate the UPA rule, and xmllint is at fault. Can someone confirm or refute this please?
I also tested my solution, and it also works.
EDIT removed the second level of ambiguity that @SebastianMauthofer pointed out in the comments.

Related

XSD with no order and selective restriction

I need to validate an XML document that contains elements in random order; some of them must exist, and some of them may occur only once. Also, some elements can be nested recursively.
For example, there is a room that should contain one door and any number of boxes and elements. Boxes can contain other boxes and/or elements.
Example XML:
<Room>
<Element />
<Box>
<Box>
<Element />
<Box></Box>
<Element />
</Box>
<Element />
</Box>
<Door />
<Element />
</Room>
This example is very simple, but in my case there are a lot of elements that can be in <Room>. Recursion is not a problem. The problem is to make <Door> required while allowing it to appear in any order among siblings that are not required.
UPD: the question is about XSD 1.0 because I use .NET and there is no free library for XSD 1.1.
From what I'm reading, I think you might need to use schema (XSD) indicators.
Check following link for more information: Schema indicators
Random Order
From your question:
I need to validate an XML that contains element in random order
Possible answer:
Using the All indicator (see link) you can specify that the elements may appear in random order.
All Indicator
The indicator specifies that the child elements can appear in any order, and that each child element must occur only once:
Occurrences
From your question:
some of them must exist and some of them only once
Possible answer:
If I'm understanding it correctly, you want to specify the number of times an element must exist or can be used. This is called occurrence and is covered at the same link. You'll have to set minOccurs and maxOccurs according to your requirements.
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
The "maxOccurs" indicator specifies the maximum number of times an element can occur:
The "minOccurs" indicator specifies the minimum number of times an element can occur:
Everything including examples can be found back on the XSD/Schema indicators.
Your XSD (XML schema) will probably look something like this:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Room" type="Room_T"/>
<xs:complexType name="Room_T">
<xs:all>
<xs:element name="Element" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="Box" type="Box_T" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="Door" type="xs:string" minOccurs="1" maxOccurs="1"/>
</xs:all>
</xs:complexType>
<xs:complexType name="Box_T">
<xs:all>
<xs:element name="Element" type="xs:string" minOccurs="1" maxOccurs="unbounded"/>
<xs:element name="Box" type="Box_T" minOccurs="0" maxOccurs="1"/>
</xs:all>
</xs:complexType>
</xs:schema>
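I didn't check whether the schema above is valid, but I think it could get you started. One caveat: XSD 1.0 does not allow maxOccurs greater than 1 on the children of xs:all, so the unbounded occurrences above need an XSD 1.1 processor.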
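For XSD 1.0, where xs:all cannot contain repeating children, one possible sketch uses only sequence and choice: any number of Element or Box items, then the required Door, then any number of items again. The group name `Items_G` is made up for illustration, and the xs:string types are carried over from the schema above.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Room" type="Room_T"/>

  <!-- Hypothetical helper group: one optional-sibling item -->
  <xs:group name="Items_G">
    <xs:choice>
      <xs:element name="Element" type="xs:string"/>
      <xs:element name="Box" type="Box_T"/>
    </xs:choice>
  </xs:group>

  <!-- Items may appear before and after Door, so Door can sit anywhere,
       but it must occur exactly once -->
  <xs:complexType name="Room_T">
    <xs:sequence>
      <xs:group ref="Items_G" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="Door" type="xs:string"/>
      <xs:group ref="Items_G" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Boxes hold any mix of Element and nested Box, including none -->
  <xs:complexType name="Box_T">
    <xs:group ref="Items_G" minOccurs="0" maxOccurs="unbounded"/>
  </xs:complexType>
</xs:schema>
```

This stays deterministic because an Element or Box seen before Door can only belong to the first repetition, and one seen after Door only to the second. With several required elements the same pattern grows quickly, so it suits cases with one or two mandatory children.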

Why does the validation of keyref depend on the ordering of the key element?

My document contains A elements with IDs and B Elements which reference the As, like this:
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:\\\refissue.xsd">
<A id="x"/>
<A id="y"/>
<B><Aref idref="x" /></B>
</root>
When I validate against my simple schema (see below) I get the following error:
cvc-identity-constraint.4.3: Key 'ref' with value 'x' not found for identity constraint of element 'root'.
If I change the ordering of the A element to
<A id="y"/>
<A id="x"/>
the document validates without any errors.
Why does the validation result depend on the ordering of the elements?
Is this a bug in the validator or in my schema?
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="A">
<xs:complexType>
<xs:attribute name="id" type="xs:ID" />
</xs:complexType>
<xs:key name="A.KEY">
<xs:selector xpath="." />
<xs:field xpath="@id" />
</xs:key>
</xs:element>
<xs:element maxOccurs="unbounded" name="B">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="0" maxOccurs="1" name="Aref">
<xs:complexType>
<xs:attribute name="idref" type="xs:IDREF" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:keyref name="ref" refer="A.KEY">
<xs:selector xpath="B/Aref" />
<xs:field xpath="@idref" />
</xs:keyref>
</xs:element>
</xs:schema>
I tried the validation with Eclipse (which uses xerces, I think), xerces-c 3.1.1, xmlstarlet 1.5.0 and libxml2 2.7.8 and I get the error only with eclipse and xerces.
You're right, validity against an identity constraint should not depend on the order of elements in the input.
Here I think the problem is that the schema is not quite right, and Xerces is having trouble generating a useful diagnosis of the problem. (The fact that libxml doesn't report an error is just a consequence of its incomplete coverage of XSD.)
Your key constraint should be defined on the scope of the element within which the key values need to be unique -- so on the root element, not on the A element. (As defined, your A.KEY constraint requires that the string value of each A element be unique within that A element, which will always be the case. The fact that the id attribute is declared as being of type xs:ID does require uniqueness, of course. And similarly, the fact that the Aref idref attribute is declared as being of type xs:IDREF means that your key and keyref declarations are not actually doing much work here that's not already being done by ID and IDREF.)
Once you move the declaration of A.KEY to the declaration of the root element, Xerces and Saxon agree that the schema is OK and the document is valid.
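As a sketch of that change, the question's schema with A.KEY moved up to the root element could look like this. Only the position of xs:key and its selector differ from the original.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" name="A">
          <xs:complexType>
            <xs:attribute name="id" type="xs:ID"/>
          </xs:complexType>
        </xs:element>
        <xs:element maxOccurs="unbounded" name="B">
          <xs:complexType>
            <xs:sequence>
              <xs:element minOccurs="0" maxOccurs="1" name="Aref">
                <xs:complexType>
                  <xs:attribute name="idref" type="xs:IDREF"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- Key is now scoped to root: A/@id values must be unique within root -->
    <xs:key name="A.KEY">
      <xs:selector xpath="A"/>
      <xs:field xpath="@id"/>
    </xs:key>
    <!-- Every B/Aref/@idref must match some A.KEY value in the same root -->
    <xs:keyref name="ref" refer="A.KEY">
      <xs:selector xpath="B/Aref"/>
      <xs:field xpath="@idref"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

Declaring the key and the keyref on the same element keeps both constraints in the same scope, which is what lets the keyref find the key values regardless of element order.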
I had a similar problem in Eclipse until the xs:key and the xs:keyref were both explicitly set to the same type. In my case I set both to xs:string. (I was also using xs:unique with a keyref referring to the unique constraint, but it seems to work the same way for key and keyref pairs.)
So for example if the key is based on an element that looks like this:
<xs:complexType name="elementTypeWithKey">
<xs:attribute name="theKey" type="xs:string"/>
</xs:complexType>
and the theKey attribute is explicitly xs:string, make sure that the attribute used as a keyref is also explicitly xs:string:
<xs:complexType name="elementTypeWithKeyRef">
<xs:attribute name="theKeyRef" type="xs:string"/>
</xs:complexType>

xml schema maxOccurs = unbounded within xs:all

Is it possible to have a combination of xs:all and xs:sequence?
I have an XML structure with an element probenode which consists of the elements name, id, url, tags, priority, status_raw and active, plus a combination of device and group.
device and group can occur zero or more times.
The solution below doesn't work because maxOccurs="unbounded" is not allowed for an element within an all group.
<xs:complexType name="probenodetype">
<xs:all>
<xs:element name="name" type="xs:string" />
<xs:element name="id" type="xs:unsignedInt" />
<xs:element name="url" type="xs:string" />
<xs:element name="tags" />
<xs:element name="priority" type="xs:unsignedInt" />
<xs:element name="status_raw" type="xs:unsignedInt" />
<xs:element name="active" type="xs:boolean" />
<xs:element name="device" type="devicetype" minOccurs="0" maxOccurs="unbounded">
<!-- see devicetype -->
</xs:element>
<xs:element name="group" type="grouptype" minOccurs="0" maxOccurs="unbounded">
<!-- see grouptype -->
</xs:element>
</xs:all>
<xs:attribute name="noaccess" type="xs:integer" use="optional" />
</xs:complexType>
In XSD 1.0, the children of xs:all must have maxOccurs set to 1.
In XSD 1.1 this constraint is lifted.
So your alternatives appear to be:
Use an XSD 1.1 processor (Saxon or Xerces-J).
Use XSD 1.0 and impose an order on the children of probenodetype. This is a problem if the order in which the children appear carries information (so id followed by url is different from url followed by id ...).
In some simple cases it's feasible to write a content model that accepts precisely what you suggest you want, using only choice and sequence, but with seven required elements the resulting content model is likely to be too long and complex to be useful.
At this point some users give up and write a complex type with a repeatable OR-group and move the responsibility for checking that name, id, url, etc. all occur at least once and at most once into the application; that allows the generator of the XML not to have to worry about a fixed order (and opens a side channel for information leakage, which matters to some people) but also renders the schema somewhat less useful as documentation of the contract between data provider and data consumer.
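A minimal sketch of that last fallback, a repeatable choice in XSD 1.0, might look like this. The empty `devicetype` and `grouptype` definitions are placeholders, since the question does not show the real ones.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Placeholders: the real devicetype and grouptype are not shown in the question -->
  <xs:complexType name="devicetype"/>
  <xs:complexType name="grouptype"/>

  <xs:complexType name="probenodetype">
    <!-- Any child, in any order, any number of times.
         The schema no longer enforces "exactly once" for name, id, url, etc.;
         the application has to check that. -->
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="name" type="xs:string"/>
      <xs:element name="id" type="xs:unsignedInt"/>
      <xs:element name="url" type="xs:string"/>
      <xs:element name="tags"/>
      <xs:element name="priority" type="xs:unsignedInt"/>
      <xs:element name="status_raw" type="xs:unsignedInt"/>
      <xs:element name="active" type="xs:boolean"/>
      <xs:element name="device" type="devicetype"/>
      <xs:element name="group" type="grouptype"/>
    </xs:choice>
    <xs:attribute name="noaccess" type="xs:integer" use="optional"/>
  </xs:complexType>
</xs:schema>
```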

xml schema: restricting occurrences to sibling element sequence cardinality

Given:
<xs:complexType name="SymbolsList" final="">
<xs:sequence>
<xs:element name="symbol" maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="ComboList">
<xs:sequence>
<xs:element name="combo" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="symbol" maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="comboName" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:element name="symbolsList" type="SymbolsList">
<xs:unique name="uniqueSymbol">
<xs:selector xpath="./symbol" />
<xs:field xpath="@name" />
</xs:unique>
</xs:element>
<xs:element name="combosList" type="ComboList">
<xs:unique name="uniqueCombo">
<xs:selector xpath="./combo" />
<xs:field xpath="@comboName" />
</xs:unique>
</xs:element>
I believe this defines a list of symbols and a list of combinations of those symbols.
Each entry in the list of symbols must have a unique name, and each entry in the list of combos must have a unique comboName.
What I'd like to know is if there is a way for me to restrict the number of allowed occurrences in the combosList sequence to at least the number of symbols defined in the symbol list.
I guess I'm asking whether a cardinality restriction can be variable and, if so, how to express its limit.
I also want to make it so that the combosList elements (a single combo) can only use names of symbols defined in the symbolsList element.
I think I can pull off that last part. I can't find anything anywhere that talks about constraining the cardinality of one element sequence to be greater than or equal to that of another.
Perhaps it's not possible.
XSD requires cardinality constraints to be specified literally in the declaration; the kind of dynamic calculation you have in mind is not in XSD's design space.
In XSD 1.1 you can add an assertion to some common ancestor of the symbolsList and combosList elements that requires
count(combosList/combo) ge count(symbolsList/symbol)
XSD 1.1 is supported by Saxon EE and by Xerces J (in the latter case you have to look for the 1.1 distribution, or did last I looked). (One caveat: Note that Xerces J does not support all of XPath 2.0 in assertions, and I haven't actually checked to see whether this assertion is covered by the minimal subset of XPath XSD requires of conforming 1.1 implementations. Investigate further before sinking a lot of time here.)
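As a sketch, such an assertion could sit on a wrapper element that contains both lists. This fragment is meant to be added to the question's schema and needs an XSD 1.1 processor. The wrapper name `game` is hypothetical, and the paths assume that symbolsList and combosList are its direct children.

```xml
<!-- Hypothetical common parent; "game" does not appear in the question -->
<xs:element name="game">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="symbolsList"/>
      <xs:element ref="combosList"/>
    </xs:sequence>
    <!-- XSD 1.1 only: at least as many combos as symbols -->
    <xs:assert test="count(combosList/combo) ge count(symbolsList/symbol)"/>
  </xs:complexType>
</xs:element>
```

The assertion is evaluated with the `game` element as context, which is why it has to live on an ancestor of both lists rather than on either list alone.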

how to define xsd element with multiple option?

I have a scenario where I have to use the same XSD element for different purposes, so that my resulting XML can contain one or more of the following kinds of p tags, but not necessarily all of them.
<p>some paragraph here </p>
<p>
<img src = "....." alt="......"/>
</p>
<p> <b> some text here </b> </p>
<p> ...... <g1> ........ <g2>.......<g3>........</p>
I am new to XML Schema, Thanks in advance.
The assumption I am making is that you're trying to define the p tag, by showing its different content models. The first thing is that by taking in text, you have to define its content as mixed. From there, you could use a repeating choice that lists all other elements, such as img, b, g1, g2, etc.
I am showing an excerpt from the XHTML XSD:
<xs:element name="p">
<xs:complexType mixed="true">
<xs:complexContent>
<xs:extension base="Inline">
<xs:attributeGroup ref="attrs" />
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:complexType name="Inline" mixed="true">
<xs:annotation>
<xs:documentation>
"Inline" covers inline or "text-level" elements
</xs:documentation>
</xs:annotation>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:group ref="inline" />
<xs:group ref="misc.inline" />
</xs:choice>
</xs:complexType>
etc.
A good way to learn might be to look at the XHTML XSD. You could use an XSD editor to investigate the structures associated with the p tag.
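A much smaller sketch of the same idea, without the XHTML machinery, is a mixed-content p with a repeating choice. The content of `b`, `g1`, `g2` and `g3` is not specified in the question, so xs:string is used here purely as a placeholder, and the `img` attributes are assumed from the sample.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="p">
    <!-- mixed="true" allows text between the child elements -->
    <xs:complexType mixed="true">
      <!-- Any of the listed children, in any order, any number of times -->
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="img">
          <xs:complexType>
            <xs:attribute name="src" type="xs:anyURI" use="required"/>
            <xs:attribute name="alt" type="xs:string"/>
          </xs:complexType>
        </xs:element>
        <!-- Placeholder types: the real content models are not given -->
        <xs:element name="b" type="xs:string"/>
        <xs:element name="g1" type="xs:string"/>
        <xs:element name="g2" type="xs:string"/>
        <xs:element name="g3" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```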

Resources