Using a single element declaration in multiple files - xsd

Suppose we have some .xml files containing, amongst other things, MIDI note data.
Since MIDI note values must be bounded integers (they cannot be negative and must be less than or equal to some maximum value, e.g. 108) we want to set up some .xsd files to help validate the files while enforcing our bounded integer rule.
Is there any mechanism that would let me enforce the bounds of 0 and 108, or perhaps even define a MIDI "type", in such a way that I only have to type it out once?
Including the code snippet below for every MIDI element in every schema file is bad for all the obvious reasons - it's tiresome, error-prone, difficult to maintain, etc.
<xs:element name="note">
<xs:simpleType>
<xs:restriction base="xs:nonNegativeInteger">
<xs:maxInclusive value="108"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
I'm afraid I'm missing some basic understanding / terminology to be able to get an answer to this question from Mr Google.

Yes, declare a named type, then refer to it:
<xs:element name="note" type="NoteType"/>
<xs:simpleType name="NoteType">
<xs:restriction base="xs:nonNegativeInteger">
<xs:maxInclusive value="108"/>
</xs:restriction>
</xs:simpleType>
You can refer to NoteType as many times as you need.
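To share the named type across several schema files, one option is to keep it in its own schema document and pull it in with xs:include. The following is only a sketch: the file names midi-types.xsd and song.xsd are hypothetical, both documents are assumed to have no target namespace, and the bounds are taken as 0 to 108 inclusive, as described in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- midi-types.xsd (hypothetical file name): the one place the bounds are written -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="NoteType">
    <xs:restriction base="xs:nonNegativeInteger">
      <xs:maxInclusive value="108"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- song.xsd (hypothetical file name): reuses NoteType from the shared file -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="midi-types.xsd"/>
  <xs:element name="note" type="NoteType"/>
</xs:schema>
```

If the schemas use different target namespaces, xs:import with a prefixed type reference would be the mechanism to look at instead.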

Related

When should a complex type be declared by directly naming the element, as opposed to using the type attribute?

http://www.w3schools.com/schema/schema_complex.asp
In the following snippet, why should the first way ever be used over the second?
We can define a complex element in an XML Schema two different ways:
A. The "employee" element can be declared directly by naming the element, like this:
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
If you use the method described above, only the "employee" element can
use the specified complex type. Note that the child elements,
"firstname" and "lastname", are surrounded by the <sequence>
indicator. This means that the child elements must appear in the same
order as they are declared. You will learn more about indicators in
the XSD Indicators chapter.
B. The "employee" element can have a type attribute that refers to the name of the complex type to use:
<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
If you use the method described above, several elements can refer to
the same complex type, like this:
<xs:element name="employee" type="personinfo"/>
<xs:element name="student" type="personinfo"/>
<xs:element name="member" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
Why should the first way ever be used over the second?
Local types can be useful for elements which should not be reused outside of some specific context. It would make sense, for example, for elements representing table cells to be local to the types used for table rows, and for the declaration of table-row elements to be local to the type used for the element representing the table as a whole. (An element representing a table row does not -- on this account -- make any sense outside the context of a table. Making declarations local is a simple way of ensuring that elements which place particular demands on their contexts can only be used in those contexts.)
Local types in XSD can (like local types in other languages) also be useful in avoiding name collisions. If my vocabulary provides for letters to have a salutation tagged salutation, and also provides for database-like information about people in which their names, addresses, and the preferred form of address (tagged salutation) are recorded, the two elements named salutation are likely to be regarded as wholly unrelated to each other; making one or both of them local allows them both to exist within a vocabulary. (Namespaces can also be used for this purpose, but I have met few vocabulary designers who would want to put these two salutation elements into different namespaces, and even fewer XML users who would greet that prospect with anything but distaste.)
If you're not interested in preventing re-use, stressing the semantic dependency of an element on its parent, or avoiding name collisions, then there isn't much reason to use local elements. (That said, many people do use them quite a lot, and perhaps they have reasons I don't understand. From where I sit, it just seems that many people overuse local declarations for no good reason at all.)
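As a rough illustration of the name-collision point, here is a sketch in which two unrelated salutation elements coexist because each is declared locally. The element names letter, body, person and name, and the enumerated values, are made up for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- In a letter, salutation is free text such as "Dear Ms Smith," -->
  <xs:element name="letter">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="salutation" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- In a person record, salutation is a short code from a fixed list -->
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="salutation">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="Dr"/>
              <xs:enumeration value="Ms"/>
              <xs:enumeration value="Mr"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Had both salutation elements been declared at the top level, the schema would be in error, since two global element declarations cannot share a name within one namespace.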
Some GUI XSD editors only support declaring complex types directly inside the element, although you can quite often create named types by hand if you do want to be able to reuse them.
So in that situation I would declare reusable complex types only where a type is actually reused, simply because the inline form is easier to produce with those tools.

Vectors of a complex type

Is there a way to define the cardinality of a type at the place where that type is referenced?
<xs:complexType name="xyType">
<xs:choice maxOccurs="1" minOccurs="0">
<xs:element name="xy" maxOccurs="1">
<xs:complexType>
<xs:choice maxOccurs="unbounded" minOccurs="0">
...
</xs:choice>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
So for instance I have two types A and B that have elements that reference this type, but in one case I only allow one xy (like above) and in the other I would like to allow multiple xy (as if I changed the maxOccurs above for xy to "unbounded").
I don't want to have two completely separate complexType definitions for xyType (single) and xyType (unbounded), because in reality the definition for this type is very long and complex.
If possible I would also like to not define too many types (like separating the inner complexType from the body and having two types referencing that type). This would also be very complex in my specific scenario (I have a complex class hierarchy that I try to define with a schema, so everything is bloated already).
So basically I'm looking for something where the type that references this type takes care of the cardinality, if that makes sense at all.
I would suggest that you modularize the parts of xyType as best as possible for sharing across two types, say xyType_A that allows only one xy and xyType_B that allows an unbounded number of xys. (Of course choose semantically appropriate names rather than these stand-ins.)
For example, xyType_A and xyType_B could differ in their definitions of xy's cardinality yet share the complex machinery defined in commonType:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="xyType_A">
<xs:sequence>
<xs:element name="xy" type="commonType" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="xyType_B">
<xs:sequence>
<xs:element name="xy" type="commonType" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="commonType">
<xs:choice maxOccurs="1" minOccurs="0">
<xs:sequence>
<xs:choice maxOccurs="unbounded" minOccurs="0">
<!-- further complicated structures continue here -->
</xs:choice>
<!-- and here or wherever -->
</xs:sequence>
</xs:choice>
</xs:complexType>
</xs:schema>
The principle (if not the magnitude of opportunity) would be the same if the elements of varying cardinality are deeper in the definitional hierarchy: Factor as much of the common definitional components as possible, and reuse those in the distinctly defined types.
This wouldn't work in XSD 1.0. You could use Schematron (on top of the XSD 1.0); it would work with no issues.
It is possible in XSD 1.1. It would require a bit of work, at least based on my understanding. The solution is to use assertions; however, they seem to be supported for complex and simple types only, which means you may still need to introduce two new types specific to element A and B; however, they would simply be extending xyType (100% reuse), for the purpose of providing a place to define the assertion specific to A and B.
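A minimal sketch of that XSD 1.1 idea, under some assumptions: xyType is simplified here to a sequence of xy elements of type xs:string (the real content is far more complex), the name xyTypeSingle is hypothetical, and the element using the unbounded case refers to the base type directly rather than to a second derived type. It needs an XSD 1.1 processor; an XSD 1.0 processor will reject xs:assert.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Base type: the long, complex definition is written once and allows many xy -->
  <xs:complexType name="xyType">
    <xs:sequence>
      <xs:element name="xy" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <!-- Derived type: adds no content, only an assertion limiting xy to at most one -->
  <xs:complexType name="xyTypeSingle">
    <xs:complexContent>
      <xs:extension base="xyType">
        <xs:assert test="count(xy) le 1"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="A" type="xyTypeSingle"/>
  <xs:element name="B" type="xyType"/>
</xs:schema>
```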
If you're interested in either alternative, tag the question appropriately.

Why can extensions only be placed in simpleContent and complexContent containers?

I'm having difficulty understanding some of the nuances of the format for defining type extensions and restrictions in XSD. According to the W3Schools reference:
simpleContent defines "extensions or restrictions on a text-only complex type or on a simple type as content and contains no elements"
complexContent defines "extensions or restrictions on a complex type that contains mixed content or elements only"
What isn't clear to me is why XSD requires extensions and restrictions to be contained in one of these containers, and furthermore, why only extensions and restrictions require it. It would make a little more sense to me if all 'content' had to be defined in the container, but this is not the case - with base types, the content (sequences, etc.) is defined as direct children of the complexType container.
Take this example, which to me seems overly verbose:
<xs:complexType name="fullpersoninfo">
<xs:complexContent>
<xs:extension base="personinfo">
<xs:sequence>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Why is it not possible to write it like this instead?
<xs:complexType name="fullpersoninfo">
<xs:extension base="personinfo">
<xs:sequence>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexType>
Or even like this?
<xs:complexType name="fullpersoninfo" extends="personinfo">
<xs:sequence>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
I'm assuming there must be some reason it was defined the way it was, but I can't find any clues as to why.
I don't think you're going to find any useful design rationales for the XML syntax of complex types. Suffice it to say that those designing the XML syntax managed, by means of the elements you mention, to solve some technical difficulties, and that no obviously better syntax commanded consensus in the working group. You may wonder what technical difficulties are solved by simpleContent and complexContent, and that's a reasonable question, but I doubt anyone is going to be willing to undertake the excursion into the design records of XSD that would be necessary to answer it.
One simple observation: the legal children of extension and restriction vary depending on whether the parent is simpleContent or complexContent. That is accomplished using declarations local to the types of simpleContent and complexContent and would not be possible without them -- at least, not without a very thorough redesign of the XML syntax.
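To make that observation concrete, here is a small sketch showing the two containers side by side. The price type and its currency attribute are invented for the example; personinfo and fullpersoninfo follow the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Inside simpleContent, an extension may add attributes only; no child elements are allowed -->
  <xs:complexType name="price">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <!-- Inside complexContent, an extension may add a model group (here a sequence) as well as attributes -->
  <xs:complexType name="fullpersoninfo">
    <xs:complexContent>
      <xs:extension base="personinfo">
        <xs:sequence>
          <xs:element name="address" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

The same element name, extension, therefore has two different content models, selected by its parent.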
To build on C. M. Sperberg-McQueen's answer, I would think that some of it (if not more) had to do with the limitations of the language (I guess that is the "technical difficulties" reference); since most grammars try to prove that they're good enough to define themselves, imagine how little could actually have been done in the "schema for schema", considering the limitations we still "enjoy" today in version 1.0.
Many people believe that they could truly validate an XSD by validating the XML that is XSD against the XMLSchema.xsd - it is not the case.
Many XML Schema designs raise the same question as yours; the answer is typically that the author wanted to maximize the constraining capabilities of their schema spec by working around limitations in the language.
Somehow I believe that if the features in 1.0 had been similar to those in 1.1, the syntax would've been different; the spec wouldn't have been easier to understand...
To make this richer, I would also explore other schema language specifications, such as RELAX NG or Schematron; maybe some argumentative discussions... A good read is probably Rick Jelliffe's take on XSD.

Is it preferred to define a separate plural complexType for multiple singular elements

Is there any established standard for inlining trivial plural complexTypes vs. defining them separately?
In detail: When defining some XML schemas I frequently encounter cases where I want one element to contain multiple child elements of the same single type. For example a schema which describes a table in a database has a fields element which can contain one or more field elements. I can either create an inline complexType within the definition of the plural fields element:
<xs:element name="fields" minOccurs="1" maxOccurs="1">
<xs:complexType>
<xs:sequence>
<xs:element name="field" type="table-field"
minOccurs="1" maxOccurs="unbounded" />
</xs:sequence>
</xs:complexType>
</xs:element>
Or I can separately define a trivial fields type and use that:
<xs:element name="fields" type="table-field-collection" minOccurs="1" maxOccurs="1"/>
<!-- Elsewhere: -->
<xs:complexType name="table-field-collection">
<xs:sequence>
<xs:element name="field" type="table-field" minOccurs="1" maxOccurs="unbounded" />
</xs:sequence>
</xs:complexType>
The first approach creates slightly messier markup with anonymous types, while the second creates lots of extra trivial complexTypes. Is there a consensus on which approach is preferred?
There isn't really an established standard for this. There are really three choices:
1. "fields" is defined as a named complex type and reused (table-field-collection above).
2. "fields" is an element with an anonymous sub-type.
3. There is no fields element. Instead, "field" simply repeats within the parent element.
I have specified modelling guidelines for a number of firms and used all of these patterns. More recently, I'm tending towards the third - the encapsulating fields element does not really have any semantic meaning, other than making a nice grouping when viewing documents in some graphical tools. If you were to process this using something like JAXB, you'd probably be annoyed that fields is there - one more thing that can be null.
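For comparison with the two snippets in the question, the third pattern might look roughly like this. The parent element name table and the content of table-field are assumptions made only to keep the sketch self-contained.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Placeholder definition; the real table-field type is whatever the schema already uses -->
  <xs:complexType name="table-field">
    <xs:attribute name="name" type="xs:string" use="required"/>
  </xs:complexType>
  <!-- No wrapping fields element: field repeats directly inside its parent -->
  <xs:element name="table">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="field" type="table-field" minOccurs="1" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```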
If you want to ask yourself the one relevant question from a technical point of view, then it is this: do you want to be able to inherit from table-field-collection and override it using xsi:type, or reuse it? If yes, go for the complex type. If no, go for whatever you prefer style-wise.

how to define any element in step size in XML schema

I want to define an element in XML Schema whose minimum value is 0 and maximum value is 91800, in steps of 360, so that the possible values are 0, 360, 720 and so on, without using enumeration.
How can I define this?
I cannot think of any way to do it - you cannot do arithmetic in validation rules.
You're stuck with using enumeration (that in your case seems possible - it is 256 possible values if I am not mistaken).
Since a finite state automaton can recognize the set of numbers evenly divisible by 360, it's possible in principle to do this with a fiendishly complicated regular expression, but for the range you have in mind an enumeration would in fact be a lot easier to understand (and to write correctly).
So in XSD 1.0, it's not quite true that using an enumeration is the only way to define the type you want, but it is true that it's by far the simplest and best way.
In XSD 1.1, you can use an assertion expressed in XPath 2.0 to capture the arithmetic relation:
<xs:simpleType name="small-multiples-of-three-sixty">
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="91800"/>
<xs:assertion test="$value mod 360 eq 0"/>
</xs:restriction>
</xs:simpleType>
