How to use the xml schema group element - xsd

I am trying to design an XML structure to capture the output of a spreadsheet that contains a Customer Name column, many different amount columns, and a total row.
I have about four amount column definitions that I want to reuse as a group. So I declared a group called AmountsGroup and then referenced it by name through the 'ref' attribute inside my complex type definitions. Here is what it looks like:
<xsd:complexType name="AmountByCustomerType">
<xsd:sequence>
<xsd:element name="Customer" type="xsd:string" />
<xsd:group ref="AmountsGroup" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="AmountByCustomerTotalType">
<xsd:sequence>
<xsd:element name="Total" type="xsd:string" />
<xsd:group ref="AmountsGroup" />
</xsd:sequence>
</xsd:complexType>
<xsd:group name="AmountsGroup">
<xsd:sequence>
<xsd:element name="AmountByPeriod" type="AmountByPeriodType" maxOccurs="unbounded" />
<xsd:element name="NetAdjustments" type="xsd:decimal" />
<xsd:element name="OriginalSalesAmount" type="xsd:decimal" minOccurs="0"/>
<xsd:element name="RevisedAmount" type="xsd:decimal" />
</xsd:sequence>
</xsd:group>
Here are my questions:
1. I declared the group reference with maxOccurs="unbounded" in the first complexType, whereas in the second complexType I left it out, meaning the group must occur exactly once. Will this work correctly? I want many rows of customer amounts and only one total row.
2. The XML instance document will not need to contain the group name anywhere. Is that correct?
3. Is there a better way to structure the individual rows and the total row?
4. Is this good practice when using the Venetian Blinds pattern? I don't want to declare a complexType, since then I would have to declare an element that appears in the XML instance document, adding one more level to the XML object tree. Is there any way to use a named type without giving it an element of its own? I hope you understand what I am trying to do.
Any thoughts?

1. Correct, maxOccurs applies to the group as a whole.
2. Correct, the group name is in the schema only.
3. I was going to suggest introducing an element to encapsulate the group members, but I see from your 4th question you're trying to avoid that. I prefer it since it makes it easier for a parser to identify the start and end of each "row" and mirrors programming encapsulation.
4. Seems reasonable; you're still keeping with the Venetian Blinds spirit of reusable components without committing to a namespace for local elements.
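To illustrate the first two points, here is a sketch of an instance fragment that should fit AmountByCustomerType. The wrapper element name CustomerAmounts and all values are made up for illustration, and the content of AmountByPeriod is left as a placeholder because AmountByPeriodType is not shown in the question. AmountsGroup never appears as a tag. Note also that, as declared, the repeated group yields several sets of amounts under a single Customer; repeating whole customer rows would instead be expressed with maxOccurs on the element that uses AmountByCustomerType.

```xml
<!-- Hypothetical element declared with type="AmountByCustomerType" -->
<CustomerAmounts>
  <Customer>Acme Ltd</Customer>
  <!-- First occurrence of AmountsGroup: the group itself has no tag -->
  <AmountByPeriod><!-- content per AmountByPeriodType, not shown in the question --></AmountByPeriod>
  <NetAdjustments>-25.00</NetAdjustments>
  <OriginalSalesAmount>1000.00</OriginalSalesAmount>
  <RevisedAmount>975.00</RevisedAmount>
  <!-- Second occurrence: the optional OriginalSalesAmount is omitted -->
  <AmountByPeriod><!-- content per AmountByPeriodType --></AmountByPeriod>
  <NetAdjustments>0</NetAdjustments>
  <RevisedAmount>500.00</RevisedAmount>
</CustomerAmounts>
```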

Related

How are XSD elements combined in an xsd:extension pattern

I am working on transforming an XSD to a FrameMaker EDD, and I am stuck on the xsd:extension mechanism. As the W3C description of the XSD standard is really complex, I am hoping one of the XSD experts here can give me a hint about this.
Here are two of the definitions in my original XSD:
<xsd:complexType name="basehierarchy">
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="num"/>
<xsd:element ref="heading"/>
<xsd:element ref="subheading"/>
</xsd:choice>
</xsd:complexType>
<xsd:complexType name="docContainerType">
<xsd:complexContent>
<xsd:extension base="basehierarchy">
<xsd:choice>
<xsd:element ref="interstitial"/>
<xsd:element ref="toc"/>
<xsd:element ref="documentRef"/>
</xsd:choice>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
I need to resolve extensions before I can create my EDD (and accompanying DTD), but I am not sure what the above patterns should result in. I can imagine various options - one would be to inject the choices of the extension into the choice of the base:
<xsd:complexType name="docContainerType">
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="num"/>
<xsd:element ref="heading"/>
<xsd:element ref="subheading"/>
<xsd:element ref="interstitial"/>
<xsd:element ref="toc"/>
<xsd:element ref="documentRef"/>
</xsd:choice>
</xsd:complexType>
As a side effect, this would cause the @minOccurs and @maxOccurs of the base choice to be applied to the elements of the extension as well. Maybe that is OK, but I cannot find explicit information about this. Another option for correct extension of the base pattern would be to add the choice from the extension after the choice of the base:
<xsd:complexType name="docContainerType">
<xsd:sequence>
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="num"/>
<xsd:element ref="heading"/>
<xsd:element ref="subheading"/>
</xsd:choice>
<xsd:choice>
<xsd:element ref="interstitial"/>
<xsd:element ref="toc"/>
<xsd:element ref="documentRef"/>
</xsd:choice>
</xsd:sequence>
</xsd:complexType>
And if the second option is the correct one, should the extension come before or after the base elements?
Maybe the recommendation can give you a clue: XML Schema Part 0: Primer Second Edition, §4.2 Deriving Types by Extension, especially this part of the text:
When a complex type is derived by extension, its effective content model is the content model of the base type plus the content model specified in the type derivation. Furthermore, the two content models are treated as two children of a sequential group.
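Read against that passage, the second expansion in the question (a sequence of the base choice followed by the extension choice) is the matching one, with the base content coming first. In DTD terms the content model would be roughly ((num | heading | subheading)*, (interstitial | toc | documentRef)). A sketch of a conforming instance follows; the element name docContainer and the text content are assumptions, since the question does not show how num, heading, or toc are declared.

```xml
<!-- Hypothetical element declared with type="docContainerType" -->
<docContainer>
  <!-- Base content first: zero or more of num | heading | subheading, in any order -->
  <num>1</num>
  <heading>Introduction</heading>
  <subheading>Scope</subheading>
  <!-- Then the extension content: exactly one of interstitial | toc | documentRef -->
  <toc/>
</docContainer>
```

Placing toc before num, or adding a second extension element after it, would not match this content model.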

Vectors of a complex type

Is there a way to define the cardinality of a type at the place where that type is referenced?
<xs:complexType name="xyType">
<xs:choice maxOccurs="1" minOccurs="0">
<xs:element name="xy" maxOccurs="1">
<xs:complexType>
<xs:choice maxOccurs="unbounded" minOccurs="0">
...
</xs:choice>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
So for instance I have two types A and B that have elements that reference this type, but in one case I only allow one xy (like above) and another I would like to allow multiple xy (like if I change the maxOccurs above for xy to "unbounded").
I don't want to have to completely separate complexType definitions for xyType (single) and xyType (unbounded), because in reality the definition for this type is very long and complex.
If possible I would also like to not define too many types (like separating the inner complexType from the body and having two types referencing that type). This would also be very complex in my specific scenario (I have a complex class hierarchy that I try to define with a schema, so everything is bloated already).
So basically I'm looking for a way to let the type that references this type take care of the cardinality, if that makes sense at all.
I would suggest that you modularize the parts of xyType as best as possible for sharing across two types, say xyType_A that allows only one xy and xyType_B that allows an unbounded number of xys. (Of course choose semantically appropriate names rather than these stand-ins.)
For example, xyType_A and xyType_B could differ in their definitions of xy's cardinality yet share the complex machinery defined in commonType:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="xyType_A">
<xs:sequence>
<xs:element name="xy" type="commonType" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="xyType_B">
<xs:sequence>
<xs:element name="xy" type="commonType" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="commonType">
<xs:choice maxOccurs="1" minOccurs="0">
<xs:sequence>
<xs:choice maxOccurs="unbounded" minOccurs="0">
<!-- further complicated structures continue here -->
</xs:choice>
<!-- and here or wherever -->
</xs:sequence>
</xs:choice>
</xs:complexType>
</xs:schema>
The principle (if not the magnitude of opportunity) would be the same if the elements of varying cardinality are deeper in the definitional hierarchy: Factor as much of the common definitional components as possible, and reuse those in the distinctly defined types.
This wouldn't work in XSD 1.0. You could use Schematron (on top of the XSD 1.0); it would work with no issues.
It is possible in XSD 1.1. It would require a bit of work, at least based on my understanding. The solution is to use assertions; however, they seem to be supported for complex and simple types only, which means you may still need to introduce two new types specific to element A and B; however, they would simply be extending xyType (100% reuse), for the purpose of providing a place to define the assertion specific to A and B.
If you're interested in either alternative, tag the question appropriately.
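As a rough sketch of the XSD 1.1 assertion route described above: the shared type allows any number of xy children, and a thin derived type for A caps the count with xs:assert. The names xyTypeForA, A, and B are hypothetical, and xy is typed as xs:string only to keep the example short; it needs an XSD 1.1 processor and will not work with an XSD 1.0 validator.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Shared definition: the long, complex content would live here once -->
  <xs:complexType name="xyType">
    <xs:sequence>
      <xs:element name="xy" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical type for A: reuses xyType unchanged and only adds the assertion -->
  <xs:complexType name="xyTypeForA">
    <xs:complexContent>
      <xs:extension base="xyType">
        <!-- XSD 1.1 only: at most one xy child -->
        <xs:assert test="count(xy) le 1"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="A" type="xyTypeForA"/>
  <xs:element name="B" type="xyType"/>
</xs:schema>
```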

Is it preferred to define a separate plural complexType for multiple singular elements

Is there any established standard for inlining trivial plural complexTypes vs. defining them separately?
In detail: When defining some XML schemas I frequently encounter cases where I want one element to contain multiple child elements of the same single type. For example a schema which describes a table in a database has a fields element which can contain one or more field elements. I can either create an inline complexType within the definition of the plural fields element:
<xs:element name="fields" minOccurs="1" maxOccurs="1">
<xs:complexType>
<xs:sequence>
<xs:element name="field" type="table-field"
minOccurs="1" maxOccurs="unbounded" />
</xs:sequence>
</xs:complexType>
</xs:element>
Or I can separately define a trivial fields type and use that:
<xs:element name="fields" type="table-field-collection" minOccurs="1" maxOccurs="1" />
<!-- Elsewhere: -->
<xs:complexType name="table-field-collection">
<xs:sequence>
<xs:element name="field" type="table-field" minOccurs="1" maxOccurs="unbounded" />
</xs:sequence>
</xs:complexType>
The first approach creates slightly messier markup with anonymous types, while the second creates lots of extra trivial complexTypes. Is there a consensus on which approach is preferred?
There isn't really an established standard for this. There are really three choices:
"fields" must be defined as a complex type and reused (table-field-collection above)
"fields" is an element with an anonymous sub-type
There is no fields element. Instead, "field" simply repeats within the parent element.
I have specified modelling guidelines for a number of firms and used all of these patterns. More recently, I'm tending towards the third: the encapsulating fields element does not really have any semantic meaning, other than making a nice grouping when viewing documents in some graphical tools. If you were to process this using something like JAXB, you'd probably be annoyed that fields is there, as it is one more thing that can be null.
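For comparison with the two variants in the question, the third choice could look roughly like this. The parent element name table is an assumption (the question only says the schema describes a database table), and table-field is the type from the question.

```xml
<!-- Hypothetical parent element; "field" repeats directly, with no "fields" wrapper -->
<xs:element name="table">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="field" type="table-field" minOccurs="1" maxOccurs="unbounded" />
      <!-- other children of the table would follow here -->
    </xs:sequence>
  </xs:complexType>
</xs:element>
```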
If you want to ask yourself the one relevant question from a technical point of view, then it is this: do you want to be able to inherit from table-field-collection and override it using xsi:type, or reuse it? If yes, go for the complex type. If no, go for whatever you prefer style-wise.

Semantic difference between element and complexType in XSD

Given this XSD:
<xsd:element name="ServiceList">
<xsd:complexType>
<xsd:sequence>
...
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:complexType name="ServiceList">
<xsd:sequence>
...
</xsd:sequence>
</xsd:complexType>
What is the semantic difference between these two, i.e. between named elements and named complexTypes that are direct children of a schema?
The reason for me asking is that I tried doing this in an XSD:
<xsd:element name="AvailableServices" type="cm:ServiceList" />
<xsd:element name="ExistingServices" type="cm:ServiceList" />
<xsd:complexType name="ServiceList">
<xsd:sequence>
...
</xsd:sequence>
</xsd:complexType>
But when this was compiled into Java classes using the Maven JAXB plugin, I was only able to create a new ServiceList(). AvailableServices and ExistingServices don't even seem to exist among the generated classes. So, what's going on here?
Classes Correspond to Complex Types
In JAXB (JSR-222) Java classes correspond to complex types. Named complex types and anonymous complex types of global elements correspond to root level classes. Nested complex types by default are generated as static inner classes. You can change this default behaviour:
https://stackoverflow.com/a/13175419/383861
Global Elements
If a global element is uniquely associated with a complex type (a global element with an anonymous complex type), the generated class will be annotated with @XmlRootElement. Global elements that refer to global (named) types will correspond to @XmlElementDecl annotations in the ObjectFactory class.
For More Information
http://blog.bdoughan.com/2012/07/jaxb-and-root-elements.html

Should we declare a simple type explicitly even for a string type in venetian blinds pattern

I am using the venetian blinds pattern to design my XML schema and it requires that all the types are declared at the global level and all the elements use the types defined in the global scope.
My question is this:
If I want to declare two elements that are simple strings with no other restriction, should I declare simple types for them in the global scope and then use those? Or can I directly use the built-in string type on the element itself? Am I breaking Venetian Blinds in the second scenario listed below?
For example, I can do one of the two:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:simpleType name="ApplicantName">
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>
<xsd:simpleType name="ApplicantCountry">
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>
<xsd:element name="Application">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="ApplicantName" type="ApplicantName"/>
<xsd:element name="ApplicantCountry" type="ApplicantCountry"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
Or I can use this.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Application">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="ApplicantName" type="xsd:string"/>
<xsd:element name="ApplicantCountry" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
Well, why did you choose to follow this pattern? Which option provides the benefits that are promised by the pattern? Answer those questions and I think you have your answer.
It seems to me that the pattern calls for the first approach. Whether the pattern actually has value, or whether it should be followed so rigorously is for you to decide. At the heart of the matter is the question of what you are trying to achieve by using the pattern in the first place.
I'd say: it depends. The goal of Venetian Blinds is to reuse types, but unless some of your elements share a common restriction (for example, a field length imposed by a backend database), you won't gain anything from following this pattern religiously.
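A case where a global simple type does pay off might look like the sketch below, reusing the question's Application element. The type name ShortText and the limit of 50 characters are assumptions chosen purely for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical shared restriction, e.g. mirroring a database column length -->
  <xsd:simpleType name="ShortText">
    <xsd:restriction base="xsd:string">
      <xsd:maxLength value="50"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="Application">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Both elements reuse the same global type, so the limit is defined once -->
        <xsd:element name="ApplicantName" type="ShortText"/>
        <xsd:element name="ApplicantCountry" type="ShortText"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

If the limit later changes, only ShortText needs editing, which is the kind of reuse the pattern is meant to buy.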
