How are XSD elements combined in an xsd:extension pattern

I am working on transforming an XSD to a FrameMaker EDD, and I am stuck on the xsd:extension mechanism. As the W3C description of the XSD standard is really complex, I am hoping one of the XSD experts here can give me a hint about this.
Here are two of the definitions in my original XSD:
<xsd:complexType name="basehierarchy">
  <xsd:choice minOccurs="0" maxOccurs="unbounded">
    <xsd:element ref="num"/>
    <xsd:element ref="heading"/>
    <xsd:element ref="subheading"/>
  </xsd:choice>
</xsd:complexType>
<xsd:complexType name="docContainerType">
  <xsd:complexContent>
    <xsd:extension base="basehierarchy">
      <xsd:choice>
        <xsd:element ref="interstitial"/>
        <xsd:element ref="toc"/>
        <xsd:element ref="documentRef"/>
      </xsd:choice>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
I need to resolve extensions before I can create my EDD (and accompanying DTD), but I am not sure what the above patterns should result in. I can imagine various options - one would be to inject the choices of the extension into the choice of the base:
<xsd:complexType name="docContainerType">
  <xsd:choice minOccurs="0" maxOccurs="unbounded">
    <xsd:element ref="num"/>
    <xsd:element ref="heading"/>
    <xsd:element ref="subheading"/>
    <xsd:element ref="interstitial"/>
    <xsd:element ref="toc"/>
    <xsd:element ref="documentRef"/>
  </xsd:choice>
</xsd:complexType>
As a side effect, this would cause the minOccurs and maxOccurs attributes to be applied to the elements of the extension pattern. Maybe that is OK, but I cannot find explicit information about this. Another option for correctly extending the base pattern would be to add the choice from the extension after the choice of the base:
<xsd:complexType name="docContainerType">
  <xsd:sequence>
    <xsd:choice minOccurs="0" maxOccurs="unbounded">
      <xsd:element ref="num"/>
      <xsd:element ref="heading"/>
      <xsd:element ref="subheading"/>
    </xsd:choice>
    <xsd:choice>
      <xsd:element ref="interstitial"/>
      <xsd:element ref="toc"/>
      <xsd:element ref="documentRef"/>
    </xsd:choice>
  </xsd:sequence>
</xsd:complexType>
And if the second option is the correct one, should the extension come before or after the base elements?

Maybe the recommendation can give you a clue: XML Schema Part 0: Primer Second Edition, §4.2 Deriving Types by Extension, especially this part of the text:
When a complex type is derived by extension, its effective content model is the content model of the base type plus the content model specified in the type derivation. Furthermore, the two content models are treated as two children of a sequential group.
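Applied to the two types in the question, this matches the second option, with the base content first and the extension content after it. In DTD notation the content model would read ((num | heading | subheading)*, (interstitial | toc | documentRef)). Below is a minimal instance sketch. It assumes a hypothetical element docContainer declared with type docContainerType, and it assumes the referenced elements accept simple text or empty content; neither declaration appears in the question.

```xml
<!-- docContainer is a hypothetical element of type docContainerType -->
<docContainer>
  <!-- base part (basehierarchy): zero or more of num, heading, subheading, in any order -->
  <num>1</num>
  <heading>Introduction</heading>
  <subheading>Scope</subheading>
  <!-- extension part: exactly one of interstitial, toc, documentRef, after the base part -->
  <toc/>
</docContainer>
```

Under the same assumptions, moving the toc element in front of num, or leaving it out, should make the instance invalid, because the extension's choice keeps its default minOccurs="1" and maxOccurs="1".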

Related

Cannot figure out a way to create XML schema that matches random order items with conditions

We're trying to find a way to have a schema that would validate certain rules, but we've tried various combinations of xs:all, xs:choice, xs:group and xs:sequence with no success. The rules are basically this:
1. only one occurrence of the LICAPPIN01 element should occur
2. only one occurrence of the LICAPPIN99 element should occur
3. there should be the same number of LICAPPIN30 and LICAPPIN31
4. there should be the same number of LICAPPIN40 and LICAPPIN41
5. there needs to be at least one set of LICAPPIN30/31 or LICAPPIN40/41 (both can be there as well)
6. For all of the above, the order does not matter -- any order is acceptable
The simplest schema we tried is this:
<?xml version="1.0" standalone="yes"?>
<xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="NewDataSet">
    <xs:complexType>
      <xs:choice minOccurs="1" maxOccurs="unbounded">
        <xs:element name="LICAPPIN01" minOccurs="1" maxOccurs="1">
        </xs:element>
        <xs:element name="LICAPPIN30" minOccurs="1" maxOccurs="unbounded">
        </xs:element>
        <xs:element name="LICAPPIN31" minOccurs="1" maxOccurs="unbounded">
        </xs:element>
        <xs:element name="LICAPPIN40" minOccurs="1" maxOccurs="unbounded">
        </xs:element>
        <xs:element name="LICAPPIN41" minOccurs="1" maxOccurs="unbounded">
        </xs:element>
        <xs:element name="LICAPPIN99" minOccurs="1" maxOccurs="1">
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
This has a number of problems:
it allows multiple LICAPPIN01 and LICAPPIN99 (replacing with xs:all might fix this?)
it does not enforce rules 3 and 4
for rule 5, it seems to force both LICAPPIN30/31 and LICAPPIN40/41 when it should be possible to only have one of the two sets
We also tried a more complex approach with xs:group for LICAPPIN30/31 and for LICAPPIN40/41 but it broke rule 6.
Any idea whether it is even possible to meet all of our basic rules in a relatively simple schema? In the example above, I removed all of the details within each LICAPPINnn element -- they each contain complex types, and ideally we don't want to have to duplicate these in multiple places.
Thanks,
Denis
It's not easy to write a content model to meet all your requirements, but it's easy to meet all but the last.
If variation in the order of elements is essential to convey necessary information, then your best bet is to use assertions in XSD 1.1 or Schematron. If variation in the order of elements conveys no information, then you have the option of declaring that variation in order is not a requirement after all. The vocabulary design authorities I respect most highly say pretty consistently that if the sequence of children does not convey information, then there is no reason not to fix it.
Here is a content model that meets all the requirements you list except the last one:
<xs:complexType>
  <xs:sequence>
    <xs:element name="LICAPPIN01"/>
    <xs:choice maxOccurs="unbounded">
      <xs:sequence>
        <xs:element name="LICAPPIN30"/>
        <xs:element name="LICAPPIN31"/>
      </xs:sequence>
      <xs:sequence>
        <xs:element name="LICAPPIN40"/>
        <xs:element name="LICAPPIN41"/>
      </xs:sequence>
    </xs:choice>
    <xs:element name="LICAPPIN99"/>
  </xs:sequence>
</xs:complexType>
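If free ordering really is required, the XSD 1.1 assertion route mentioned above could look roughly like the sketch below. It is an illustration rather than a tested schema: it needs an XSD 1.1 processor (an XSD 1.0 validator will reject xs:assert), the element declarations are left without types as in the question, and the assertions check counts only, not which LICAPPIN30 belongs to which LICAPPIN31.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="NewDataSet">
    <xs:complexType>
      <!-- Any of the six elements, in any order, any number of times -->
      <xs:choice maxOccurs="unbounded">
        <xs:element name="LICAPPIN01"/>
        <xs:element name="LICAPPIN30"/>
        <xs:element name="LICAPPIN31"/>
        <xs:element name="LICAPPIN40"/>
        <xs:element name="LICAPPIN41"/>
        <xs:element name="LICAPPIN99"/>
      </xs:choice>
      <!-- XSD 1.1 assertions: the counting rules that the content model no longer enforces -->
      <xs:assert test="count(LICAPPIN01) eq 1"/>
      <xs:assert test="count(LICAPPIN99) eq 1"/>
      <xs:assert test="count(LICAPPIN30) eq count(LICAPPIN31)"/>
      <xs:assert test="count(LICAPPIN40) eq count(LICAPPIN41)"/>
      <!-- With the pair counts equal, one LICAPPIN30 or LICAPPIN40 implies one complete set -->
      <xs:assert test="count(LICAPPIN30) + count(LICAPPIN40) ge 1"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```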

Semantic difference between element and complexType in XSD

Given this XSD:
<xsd:element name="ServiceList">
  <xsd:complexType>
    <xsd:sequence>
      ...
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:complexType name="ServiceList">
  <xsd:sequence>
    ...
  </xsd:sequence>
</xsd:complexType>
What is the semantic difference between these two, i.e. between named elements and named complexTypes that are direct children of a schema?
The reason for me asking is that I tried doing this in an XSD:
<xsd:element name="AvailableServices" type="cm:ServiceList" />
<xsd:element name="ExistingServices" type="cm:ServiceList" />
<xsd:complexType name="ServiceList">
  <xsd:sequence>
    ...
  </xsd:sequence>
</xsd:complexType>
But when this was compiled into Java classes using the Maven JAXB plugin, I was only able to create a new ServiceList(). AvailableServices and ExistingServices don't even seem to exist among the generated classes. So, what's going on here?
Classes Correspond to Complex Types
In JAXB (JSR-222) Java classes correspond to complex types. Named complex types and anonymous complex types of global elements correspond to root level classes. Nested complex types by default are generated as static inner classes. You can change this default behaviour:
https://stackoverflow.com/a/13175419/383861
Global Elements
If a global element is uniquely associated with a complex type (a global element with an anonymous complex type), the generated class will be annotated with @XmlRootElement. Global elements that correspond to global types will correspond to @XmlElementDecl annotations in the ObjectFactory class.
For More Information
http://blog.bdoughan.com/2012/07/jaxb-and-root-elements.html

How to use the xml schema group element

I am trying to design an XML structure to capture the output from a spreadsheet which contains a Customer Name and many different amount columns. And there is a total row as well.
I have about 4 amount column definitions that I want to reuse as a group. So I declared a group called AmountsGroup and then referenced it by name through the 'ref' attribute inside my complex type definitions. Here is what it looks like:
<xsd:complexType name="AmountByCustomerType">
  <xsd:sequence>
    <xsd:element name="Customer" type="xsd:string" />
    <xsd:group ref="AmountsGroup" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="AmountByCustomerTotalType">
  <xsd:sequence>
    <xsd:element name="Total" type="xsd:string" />
    <xsd:group ref="AmountsGroup" />
  </xsd:sequence>
</xsd:complexType>
<xsd:group name="AmountsGroup">
  <xsd:sequence>
    <xsd:element name="AmountByPeriod" type="AmountByPeriodType" maxOccurs="unbounded" />
    <xsd:element name="NetAdjustments" type="xsd:decimal" />
    <xsd:element name="OriginalSalesAmount" type="xsd:decimal" minOccurs="0"/>
    <xsd:element name="RevisedAmount" type="xsd:decimal" />
  </xsd:sequence>
</xsd:group>
Here are my questions:
1. I have declared the group with maxOccurs="unbounded" in the first complexType, whereas in the second complexType I have left it out, meaning it has to occur exactly once. Will this work correctly? I want many rows of customer amounts and only one total amount row.
2. The XML instance document will not need to contain the name of this group anywhere. Is that correct?
3. Is there a better way to structure the individual rows and the total row?
4. Is this good practice when I use the Venetian Blinds pattern? I don't want to declare a complexType, since then I have to declare an element which will appear in the XML instance document, thus adding one more level to the XML object tree. Is there any way to use a named type without giving it an element of its own? I hope you understand what I am trying to do.
Any thoughts?
1. Correct, maxOccurs applies to the group as a whole.
2. Correct, the group name is in the schema only.
3. I was going to suggest introducing an element to encapsulate the group members, but I see from your 4th question you're trying to avoid that. I prefer it since it makes it easier for a parser to identify the start and end of each "row" and mirrors programming encapsulation.
4. Seems reasonable; you're still keeping with the Venetian Blinds spirit of reusable components without committing to a namespace for local elements.
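To illustrate the first two points, here is a sketch of an instance fragment. The element name AmountByCustomer is hypothetical (the question only shows the type AmountByCustomerType), the values are made up, and the content of AmountByPeriod is left out because AmountByPeriodType is not shown in the question.

```xml
<!-- AmountByCustomer is a hypothetical element declared with type AmountByCustomerType -->
<AmountByCustomer>
  <Customer>Example Customer</Customer>
  <!-- first occurrence of AmountsGroup: no wrapper element, the group name never appears -->
  <AmountByPeriod><!-- content defined by AmountByPeriodType --></AmountByPeriod>
  <NetAdjustments>0.00</NetAdjustments>
  <RevisedAmount>100.00</RevisedAmount>
  <!-- second occurrence: maxOccurs="unbounded" repeats the whole sequence -->
  <AmountByPeriod><!-- content defined by AmountByPeriodType --></AmountByPeriod>
  <NetAdjustments>5.00</NetAdjustments>
  <OriginalSalesAmount>95.00</OriginalSalesAmount>
  <RevisedAmount>100.00</RevisedAmount>
</AmountByCustomer>
```

Note that the repetition happens inside a single customer element. If the goal is many customer rows, each with one set of amounts, it may be more appropriate to put maxOccurs="unbounded" on the element that uses AmountByCustomerType rather than on the group reference.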

Should we declare a simple type explicitly even for a string type in venetian blinds pattern

I am using the venetian blinds pattern to design my XML schema and it requires that all the types are declared at the global level and all the elements use the types defined in the global scope.
My question is this:
If I want to declare 2 elements which are simple strings with no other restriction, should I declare their types in the global scope and then use them? Or can I directly use a built-in simple type inside the element itself? Am I breaking the Venetian Blinds pattern in the second scenario listed below?
For example, I can do one of the two:
<xsd:schema>
  <xsd:simpleType name="ApplicantName">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
  <xsd:simpleType name="ApplicantCountry">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
  <xsd:element name="Application">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="ApplicantName" type="ApplicantName"/>
        <xsd:element name="ApplicantCountry" type="ApplicantCountry"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Or I can use this.
<xsd:schema>
  <xsd:element name="Application">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="ApplicantName" type="xsd:string"/>
        <xsd:element name="ApplicantCountry" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Well, why did you choose to follow this pattern? Which option provides the benefits that are promised by the pattern? Answer those questions and I think you have your answer.
It seems to me that the pattern calls for the first approach. Whether the pattern actually has value, or whether it should be followed so rigorously is for you to decide. At the heart of the matter is the question of what you are trying to achieve by using the pattern in the first place.
I'd say: it depends. The goal of Venetian Blinds is to reuse types, but unless some of your elements share a common restriction (for example, a field length imposed by a backend database), you won't gain anything from following this pattern religiously.
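As a sketch of the case where a named simple type does pay off, suppose both elements share a length limit. The type name ShortText and the limit of 50 characters are assumptions made for illustration; they are not part of the question.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical shared restriction, e.g. a column width in a backend database -->
  <xsd:simpleType name="ShortText">
    <xsd:restriction base="xsd:string">
      <xsd:maxLength value="50"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="Application">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Both elements reuse the one global type; changing the limit is a one-line edit -->
        <xsd:element name="ApplicantName" type="ShortText"/>
        <xsd:element name="ApplicantCountry" type="ShortText"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Without such a shared facet, a global type that merely restricts xsd:string with no facets constrains nothing beyond what type="xsd:string" already does.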

XSD Design - One or more rule

I am designing a new XSD to capture points information from a business partner. For each transaction the partner must provide a value of points for at least one points type. I have the following:
<xs:element name="Points">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="SKUPointsQty" type="xs:int" minOccurs="0"/>
      <xs:element name="WelcomePointsQty" type="xs:int" minOccurs="0"/>
      <xs:element name="ManualPointsQty" type="xs:int" minOccurs="0"/>
      <xs:element name="GreenPointQty" type="xs:int" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
The business rules are:
a transaction must provide points from one or more of the points type
a transaction cannot provide more than one instance of the same points type
What I have so far is not ideal because it would be possible to provide an XML instance without any points. I can't use a choice element because it must be possible to provide an XML instance with more than one points type element. The same points type must not be repeated for a single transaction.
Is it possible to enforce this rule in the design of the XSD?
I have a copy of the excellent XML Schema Companion by Neil Bradley. I can't find the answer in there so I guess it's not possible but thought I'd share the puzzle!
Thanks
Rob.
I think this kind of constraint logic is beyond XSD. Here are three techniques for checking instance documents for constraints that are not expressible by XML Schemas.
* a transaction cannot provide more than one instance of the same points type
That's fairly easy - and you already have that, basically.
Since your "inner" elements like
<xs:element name="ManualPointsQty" type="xs:int" minOccurs="0"/>
are declared with minOccurs="0", they are optional, and because you didn't specify anything else, they also get the default maxOccurs="1".
So that half of the requirements should be taken care of.
* a transaction must provide points from one or more of the points type
That's the part where XML schema is not helping you much - you cannot express requirements like this in XSD. XSD only lends itself to "structural" modelling - things like "include this", "include 1 through 5 of these" - but you cannot express limitations that "span" more than one element like "if A is present, then B cannot be present", or "if A is present, then the value of B must be between 10 and 100". The "at least one of the four types must be present" also falls into that category, unfortunately :-( No luck there.
Since it's a sequence, could you have a choice of four forms, depending on the first element present?
<xs:element name="Points">
  <xs:complexType>
    <xs:choice>
      <xs:sequence>
        <xs:element name="a" type="xs:int" />
        <xs:element name="b" type="xs:int" minOccurs="0"/>
        <xs:element name="c" type="xs:int" minOccurs="0"/>
        <xs:element name="d" type="xs:int" minOccurs="0"/>
      </xs:sequence>
      <xs:sequence>
        <xs:element name="b" type="xs:int" />
        <xs:element name="c" type="xs:int" minOccurs="0"/>
        <xs:element name="d" type="xs:int" minOccurs="0"/>
      </xs:sequence>
      <xs:sequence>
        <xs:element name="c" type="xs:int" />
        <xs:element name="d" type="xs:int" minOccurs="0"/>
      </xs:sequence>
      <xs:sequence>
        <xs:element name="d" type="xs:int" />
      </xs:sequence>
    </xs:choice>
  </xs:complexType>
</xs:element>
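Mapped onto the element names from the question, the same idea could be sketched as follows. The mapping of a, b, c, d to the four points elements in their original order is an assumption. Each branch starts with a different required element, so a validator can always tell which branch applies.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Points">
    <xs:complexType>
      <xs:choice>
        <!-- SKUPointsQty is the first element present; the later ones stay optional -->
        <xs:sequence>
          <xs:element name="SKUPointsQty" type="xs:int"/>
          <xs:element name="WelcomePointsQty" type="xs:int" minOccurs="0"/>
          <xs:element name="ManualPointsQty" type="xs:int" minOccurs="0"/>
          <xs:element name="GreenPointQty" type="xs:int" minOccurs="0"/>
        </xs:sequence>
        <!-- WelcomePointsQty is the first element present -->
        <xs:sequence>
          <xs:element name="WelcomePointsQty" type="xs:int"/>
          <xs:element name="ManualPointsQty" type="xs:int" minOccurs="0"/>
          <xs:element name="GreenPointQty" type="xs:int" minOccurs="0"/>
        </xs:sequence>
        <!-- ManualPointsQty is the first element present -->
        <xs:sequence>
          <xs:element name="ManualPointsQty" type="xs:int"/>
          <xs:element name="GreenPointQty" type="xs:int" minOccurs="0"/>
        </xs:sequence>
        <!-- Only GreenPointQty is present -->
        <xs:sequence>
          <xs:element name="GreenPointQty" type="xs:int"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An empty Points element matches none of the four branches, and no element can repeat because every declaration keeps the default maxOccurs="1". The cost is that the elements must still appear in the fixed order of the original sequence.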
