defining different sets of child nodes by attribute value - xsd

I'm trying to define a schema for some xml-based database exchange like this:
<table name="foo">
<row>
<fooid>15</fooid>
<fooname>some entry</fooname>
</row>
<row>
<fooid>28</fooid>
<fooname>something else</fooname>
</row>
</table>
<table name="bar">
<row>
<barid>19</barid>
<barcounter>93</barcounter>
</row>
</table>
So I have several of these tables, and each table should contain only the fields that belong to it. For example, barid should not appear in table foo.
Is there any way to define this?

Yes, there are two ways. One is simple (and relies on some human intuition and documentation), and the other is more expressive (but inevitably also a bit more complicated.)
The simple way is to replace the names 'table' and 'row' with names that indicate what table we are talking about:
<table-foo>
<row-foo>
<fooid>28</fooid>
<fooname>something</fooname>
</row-foo>
...
</table-foo>
<table-bar>
<row-bar>
<barid>19</barid>
<barcounter>93</barcounter>
</row-bar>
...
</table-bar>
XSD validation (like validation using DTDs and Relax NG) is based principally on the element names used. If you want two different kinds of row to contain different things, give them two different names. So table-foo and its descendants can be declared thus:
<xs:element name="table-foo">
<xs:complexType>
<xs:sequence>
<xs:element ref="tns:row-foo" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="row-foo">
<xs:complexType>
<xs:sequence>
<xs:element ref="tns:fooid"/>
<xs:element ref="tns:fooname"/>
</xs:sequence>
</xs:complexType>
</xs:element>
And similarly for table-bar and row-bar.
Sometimes, however, we absolutely must, or really want to, capture the fact that both 'row-foo' and 'row-bar' have something crucial in common. They are both 'rows' in some abstract ontology, and that may matter to us. In such cases, you can use abstract elements to capture the regularity.
For example, here is a simple abstraction for tables, rows, and cells:
<xs:element name="table"
abstract="true"
type="tns:table"/>
<xs:element name="row"
abstract="true"
type="tns:row"/>
<xs:element name="cell"
abstract="true"
type="xs:anySimpleType"/>
The types for table and row are straightforward:
<xs:complexType name="table">
<xs:sequence>
<xs:element ref="tns:row" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="row">
<xs:sequence>
<xs:element ref="tns:cell" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
Now, the declarations for table-foo etc. become slightly more complicated, because for each declaration we have to establish a relation to the abstraction we have just defined. Element table-foo is an instantiation of the table abstraction, and its type is a restriction of the abstract table type:
<xs:element name="table-foo"
substitutionGroup="tns:table">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="tns:table">
<xs:sequence>
<xs:element ref="tns:row-foo" maxOccurs="unbounded"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
Element row-foo is similar: we specify that it's a "row" by using the substitutionGroup attribute, and we derive its complex type by restriction from the abstract row type:
<xs:element name="row-foo" substitutionGroup="tns:row">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="tns:row">
<xs:sequence>
<xs:element ref="tns:fooid"/>
<xs:element ref="tns:fooname"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
Note that we don't allow arbitrary cells to appear here, just the two cell types we want for rows from table foo. And to close off the pattern, we declare that the elements fooid and fooname are cells, using (again) substitutionGroup.
<xs:element name="fooid" type="xs:integer"
substitutionGroup="tns:cell"/>
<xs:element name="fooname" type="xs:string"
substitutionGroup="tns:cell"/>
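To see how the substitution groups behave in an instance document, here is a minimal sketch. The database wrapper element and the namespace URI http://example.com/tables are assumptions made for illustration only; neither appears in the question.

```xml
<!-- Schema side: a hypothetical wrapper that accepts any member of the
     table substitution group -->
<xs:element name="database">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="tns:table" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<!-- Instance side: tns:table is abstract, so only its substitutes
     (here table-foo) can appear in the document -->
<database xmlns="http://example.com/tables">
  <table-foo>
    <row-foo>
      <fooid>15</fooid>
      <fooname>some entry</fooname>
    </row-foo>
  </table-foo>
</database>
```

A barid element placed inside row-foo should be rejected, because the restricted type of row-foo allows only fooid and fooname.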
The same patterns can be used to declare a different set of legal cells for table bar:
<xs:element name="barid" type="xs:positiveInteger"
substitutionGroup="tns:cell"/>
<xs:element name="barcounter" type="xs:double"
substitutionGroup="tns:cell"/>
<xs:element name="table-bar" substitutionGroup="tns:table">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="tns:table">
<xs:sequence>
<xs:element ref="tns:row-bar" maxOccurs="unbounded"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:element name="row-bar" substitutionGroup="tns:row">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="tns:row">
<xs:sequence>
<xs:element ref="tns:barid"/>
<xs:element ref="tns:barcounter"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
The situation you describe is one of the use cases for which abstract elements and substitution groups were designed. Other techniques which could also be used here (but which I won't illustrate in detail) include:
Declared subtypes, use of xsi:type (declare foo-table and bar-table as restrictions or extensions of type table, use <table xsi:type="tns:foo-table">...</table> or <table xsi:type="tns:bar-table">...</table> to guide validation)
Assertions (declare foo-table and bar-table types which extend the generic table type by adding assertions about the grandchildren -- this is an XSD 1.1 feature not available in 1.0).
Conditional type assignment (declare that table gets one type if it has name="foo" and a different type if it has name="bar" -- also an XSD 1.1 feature not available in 1.0).
There may be other ways to do it, too.
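For readers who can use an XSD 1.1 processor, conditional type assignment lets the original markup (table with a name attribute, plain row children) stay as it is. The following is only a sketch under that assumption; the type names tableType, fooTable, fooRow, barTable and barRow are made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The type of table is chosen from the value of its name attribute -->
  <xs:element name="table" type="tableType">
    <xs:alternative test="@name = 'foo'" type="fooTable"/>
    <xs:alternative test="@name = 'bar'" type="barTable"/>
    <!-- Any other table name is rejected -->
    <xs:alternative type="xs:error"/>
  </xs:element>

  <!-- Generic table: rows with arbitrary content -->
  <xs:complexType name="tableType">
    <xs:sequence>
      <xs:element name="row" type="xs:anyType" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="name" type="xs:string" use="required"/>
  </xs:complexType>

  <!-- Table foo: restricts the generic table to foo rows;
       the name attribute is inherited from tableType -->
  <xs:complexType name="fooTable">
    <xs:complexContent>
      <xs:restriction base="tableType">
        <xs:sequence>
          <xs:element name="row" type="fooRow" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="fooRow">
    <xs:sequence>
      <xs:element name="fooid" type="xs:integer"/>
      <xs:element name="fooname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Table bar: same pattern with its own cells -->
  <xs:complexType name="barTable">
    <xs:complexContent>
      <xs:restriction base="tableType">
        <xs:sequence>
          <xs:element name="row" type="barRow" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="barRow">
    <xs:sequence>
      <xs:element name="barid" type="xs:positiveInteger"/>
      <xs:element name="barcounter" type="xs:double"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

The alternatives are tried in order, so the final xs:error alternative only applies when neither test matches. An XSD 1.0 processor will not accept this schema.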

Related

Unique constraint on a complexType instead of an element

I have the following XSD structure:
<xs:schema xmlns:ns="http://abc/">
...
<xs:element name="abc">
<xs:complexType>
<xs:sequence>
<xs:element ref="map"/>
</xs:sequence>
</xs:complexType>
</xs:element>
...
<xs:element name="map">
<xs:complexType>
<xs:sequence>
<xs:element name="entry" type="ns:MapEntryType" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:unique name="entry">
<xs:selector xpath="entry"/>
<xs:field xpath="key"/>
</xs:unique>
</xs:element>
<xs:complexType name="MapEntryType">
<xs:sequence>
<xs:element name="key" type="xs:string"/>
<xs:element name="value" type="xs:anyType"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
This does its job.
However, the map element now has to be named differently depending on its wrapper element: sometimes map, sometimes properties, sometimes options, etc.
Therefore I want to make the map element generic.
I tried the following:
1. Making map an xs:complexType and changing ref to type. This failed because xs:unique is not accepted inside a complex type.
2. Making map an xs:complexType, changing ref to type, and moving the xs:unique constraint to the element declarations. This worked, but it leaves the XSD with many repeated xs:unique constraints.
Is there a way to declare the structure once, including the uniqueness of its entries, without repeating the unique constraint everywhere?
As Petru Gardea said in his answer:
"Both XSD 1.0 and 1.1 place the identity constraints under an element."
So you have to add xs:unique to every element. If you are using XSD 1.1, however, you can define the complete xs:unique only once and then use xs:unique ref="name" in the remaining elements. This does not help you, since you are using XSD 1.0, but I leave it here for future XSD 1.1 users who find this question.
Example (namespaces removed for clarity):
<xs:element name="map">
<xs:complexType>
<xs:sequence>
<xs:element name="entry" type="MapEntryType" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<!-- Only completely defined once -->
<xs:unique name="uniqueEntry">
<xs:selector xpath="entry"/>
<xs:field xpath="key"/>
</xs:unique>
</xs:element>
<xs:element name="hashMap">
<xs:complexType>
<xs:sequence>
<xs:element name="entry" type="MapEntryType" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<!-- Referenced here and every other time -->
<xs:unique ref="uniqueEntry"/>
</xs:element>
Short answer, it is not possible. Both XSD 1.0 and 1.1 place the identity constraints under an element; a constraint cannot be globally defined, therefore there is no "reuse" per se, other than that of the enclosing element. Given your scenario (different element names for different needs) it is not possible to reuse.
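For XSD 1.0, the second attempt from the question can at least be kept compact: the content model is shared through a named type, and only the short xs:unique block is repeated per element. This is a sketch, not the only layout; MapType and the constraint names are hypothetical. The constraint names differ because identity-constraint names must be unique within the schema.

```xml
<!-- Shared content model, declared once -->
<xs:complexType name="MapType">
  <xs:sequence>
    <xs:element name="entry" type="ns:MapEntryType" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

<!-- Each differently named element reuses the type
     and repeats only the identity constraint -->
<xs:element name="map" type="ns:MapType">
  <xs:unique name="uniqueMapEntry">
    <xs:selector xpath="entry"/>
    <xs:field xpath="key"/>
  </xs:unique>
</xs:element>
<xs:element name="properties" type="ns:MapType">
  <xs:unique name="uniquePropertiesEntry">
    <xs:selector xpath="entry"/>
    <xs:field xpath="key"/>
  </xs:unique>
</xs:element>
```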

Reference another XML parent complexType?

How should I reference another complexType in xml, as element or as attribute over my own defined Key? What is the correct approach to model the following self-reference? Is the first approach even possible, or does it lead to infinite self-referencing?
<xs:complexType name="Category">
<xs:sequence>
<xs:element name="ParentCategory" type="Category" minOccurs="1" maxOccurs="1"></xs:element>
<xs:element name="ChildCategory" type="Category" minOccurs="0" maxOccurs="unbounded"></xs:element>
</xs:sequence>
<xs:attribute name="CategoryName" type="xs:string"></xs:attribute>
</xs:complexType>
or
<xs:complexType name="Category">
<xs:sequence>
<xs:element name="ChildCategory" type="Category" minOccurs="0" maxOccurs="unbounded"></xs:element>
</xs:sequence>
<xs:attribute name="CategoryName" type="xs:string"></xs:attribute>
<xs:attribute name="ParentCategory" type="xs:string"></xs:attribute>
</xs:complexType>
I'm a bit confused: I want to be object-oriented, but I'm not sure what this would look like in XML. Wouldn't declaring ParentCategory as a Category type require me to write another Category in the XML that itself has a ParentCategory child element, and so on, leading to infinite type referencing?
There's no issue with referencing an element of the same type inside the type definition, so the recursion in your first example is fine in itself. The required ParentCategory (minOccurs="1") is a problem, though: every Category would have to contain another complete Category, so no finite document could ever be valid. Referencing the parent is also a bit odd; you shouldn't really need to do this, since XML is hierarchical after all.
<xs:complexType name="Category">
<xs:sequence>
<xs:element maxOccurs="unbounded" minOccurs="0" name="ChildCategory" type="Category"/>
</xs:sequence>
<xs:attribute name="CategoryName" type="xs:string"/>
</xs:complexType>
The Category type references itself recursively, allowing for 0 or more ChildCategory elements. This should do what you need (there's nothing wrong with recursive type referencing in the XML Schema).
If you need to refer to the parent Category in your document, it's easy enough to chain to the parent node in any DOM implementation or with XPath.
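As an illustration, an instance of the recursive type might look like the sketch below. The root element declaration is an assumption (the question only shows the type), and the category names are invented.

```xml
<!-- Assumed root declaration, not part of the question:
     <xs:element name="Category" type="Category"/> -->
<Category CategoryName="Electronics">
  <ChildCategory CategoryName="Computers">
    <ChildCategory CategoryName="Laptops"/>
  </ChildCategory>
  <ChildCategory CategoryName="Phones"/>
</Category>
<!-- The parent is implied by nesting. In XPath, the expression
     //ChildCategory[@CategoryName='Laptops']/../@CategoryName
     selects the attribute whose value is "Computers" -->
```

The nesting ends wherever a category has no ChildCategory elements, so the recursion in the type never forces an infinite document.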

Is it possible to have multiple names for a single complex type defined in a XSD?

I am creating an XSD where I defined a complex type:
<xs:complexType name="TimeBasicComponents">
<xs:sequence>
<xs:element name="Hours" type="xs:int"></xs:element>
<xs:element name="Minutes" type="xs:int"></xs:element>
<xs:element name="Seconds" type="xs:int"></xs:element>
<xs:element name="MilliSeconds" type="xs:int"></xs:element>
</xs:sequence>
</xs:complexType>
I defined another complex type:
<xs:complexType name="TimeOfDay">
<xs:sequence>
<xs:element name="BasicComponents" type="TimeBasicComponents"></xs:element>
<xs:element name="Zone" type="xs:string"></xs:element>
</xs:sequence>
</xs:complexType>
Now I want to have another complex type for a time duration. However, there is actually no need to define another complex type for this, since it would be exactly the same as "TimeBasicComponents". So I was wondering if there is a way to define multiple names for a single complex type in XSD?
-Sandeep
Are you saying that you want to use TimeBasicComponents as a duration as well? To my knowledge you can't have aliases for a complexType, but you can achieve something very similar using the <xs:extension> construct:
<xs:complexType name="TimeDuration">
<xs:complexContent>
<xs:extension base="TimeBasicComponents" />
</xs:complexContent>
</xs:complexType>
That way you will effectively have an alias without having to redefine the TimeBasicComponents complex-type.
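A short usage sketch, assuming a schema without a target namespace; the element names StartTime and Elapsed are hypothetical:

```xml
<!-- Two element declarations, two type names, one shared content model -->
<xs:element name="StartTime" type="TimeOfDay"/>
<xs:element name="Elapsed" type="TimeDuration"/>

<!-- An Elapsed instance has exactly the children of TimeBasicComponents -->
<Elapsed>
  <Hours>1</Hours>
  <Minutes>30</Minutes>
  <Seconds>0</Seconds>
  <MilliSeconds>0</MilliSeconds>
</Elapsed>
```

If a distinct type name is not really needed, declaring Elapsed directly with type="TimeBasicComponents" gives the same instance structure.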

Difference of mixed="true" and xs:extension in XML Schema

What is the practical difference between these two:
<xs:element name="A">
<xs:complexType mixed="true">
<xs:attribute name="att" type="xs:boolean"/>
</xs:complexType>
</xs:element>
<xs:element name="B">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="att" type="xs:boolean"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
The two are different. Your first example uses mixed="true", which denotes mixed content, i.e. character data mixed in with child elements, whereas your second example restricts the element content to the xs:string type. Both declare the attribute att.
With your example, both are practically the same, because A declares no child elements. However, if you do not plan on having mixed content, i.e. you do not plan to add child elements, the second version is much clearer.
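A small sketch of where the two start to differ in practice. The element C is a hypothetical variant of B, added only to show that simpleContent can constrain the text itself, which mixed content cannot.

```xml
<!-- Accepted by both declarations from the question -->
<A att="true">some text</A>
<B att="true">some text</B>

<!-- Hypothetical variant of B whose text must be an integer -->
<xs:element name="C">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:integer">
        <xs:attribute name="att" type="xs:boolean"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
<!-- <C att="true">42</C> is valid; <C att="true">some text</C> is not -->
```

Conversely, only the mixed type of A could later be given child elements alongside its text.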

Distinguishing xs:choice branches in XSD by using fixed values for an element with an enumeration type

Is it possible to distinguish the branches of an xs:choice in XSD by using fixed values? I have a simple type:
<xs:simpleType name="datatypeCategory">
<xs:restriction base="xs:string">
<xs:enumeration value="SIMPLE"/>
<xs:enumeration value="COMPLEX"/>
<xs:enumeration value="COLLECTION"/>
</xs:restriction>
</xs:simpleType>
And what I want to achieve is
<xs:element name="datatype">
<xs:complexType>
<xs:choice>
<xs:sequence>
<xs:element id="category" type="datatypeCategory" fixed="SIMPLE"/>
<!-- some fields specific for SIMPLE -->
</xs:sequence>
<xs:sequence>
<xs:element id="category" type="datatypeCategory" fixed="COMPLEX"/>
<!-- some fields specific for COMPLEX -->
</xs:sequence>
<xs:sequence>
<xs:element id="category" type="datatypeCategory" fixed="COLLECTION"/>
<!-- some fields specific for COLLECTION -->
</xs:sequence>
</xs:choice>
</xs:complexType>
</xs:element>
When I do this my XMLSpy tells me:
# The content model of complex type definition '{anonymous}' is ambiguous.
# Details: cos-nonambig: <xs:element name='category'> makes the content model non-deterministic against <xs:element name='category'>. Possible causes: name equality, overlapping occurrence or substitution groups.
You can't do exactly that. The error is because a simple validator that sees a <category> element won't immediately know which branch of the choice to take, and XML Schema 1.0 supports such simple validators.
An alternative would be to name each element according to the category.
<xs:element name="datatype">
<xs:complexType>
<xs:choice>
<xs:sequence>
<xs:element name="simpleCategory" type="empty"/>
<!-- some fields specific for SIMPLE -->
</xs:sequence>
<xs:sequence>
<xs:element name="complexCategory" type="empty"/>
<!-- some fields specific for COMPLEX -->
</xs:sequence>
<xs:sequence>
<xs:element name="collectionCategory" type="empty"/>
<!-- some fields specific for COLLECTION -->
</xs:sequence>
</xs:choice>
</xs:complexType>
</xs:element>
where empty is defined as an empty type. Or give them complex types to hold the "specific fields". There are other alternatives depending on your constraints, such as using substitution groups or derived complex types.
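One possible definition of the empty type, together with an instance that takes the SIMPLE branch. This is a sketch; any type with empty content would serve equally well.

```xml
<!-- A complex type with no content model and no attributes:
     elements of this type must be empty -->
<xs:complexType name="empty"/>

<!-- Instance for the SIMPLE branch: the marker element's name,
     not its value, tells the validator which branch applies -->
<datatype>
  <simpleCategory/>
  <!-- fields specific for SIMPLE would follow here -->
</datatype>
```

Because each branch now starts with a differently named element, the content model is no longer ambiguous.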
In general though, XML Schema 1.0 is not good for constraints based on interrelated values. For that, you have to go to XML Schema 1.1 or an external tool.
IDs must be unique within a document, so you can't use the same id value on more than one element:
http://www.w3.org/TR/2006/REC-xml11-20060816/#id

Resources