XSD variable element names - xsd

Is it possible in XSD to name an element dynamically?
I have a complexType with a varying (but limited) number of elements (x-n), each of which has a complicated substructure. I can copy and paste one (x-1) and just change the number for the name of each of the copies (x-2, x-3, and so on), but it'd be cleaner if I didn't have to.
For example, as it is now:
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="r1">
<xs:complexType><xs:sequence>
<xs:element name="w-s">
<xs:complexType><xs:sequence maxOccurs="4">
<xs:element name="x-1" minOccurs="0">
<xs:complexType><xs:sequence>
<!-- long tedious substructure goes here -->
</xs:sequence></xs:complexType>
</xs:element>
<xs:element name="x-2" minOccurs="0">
<xs:complexType><xs:sequence>
<!-- long tedious substructure goes here -->
</xs:sequence></xs:complexType>
</xs:element>
<xs:element name="x-3" minOccurs="0">
<xs:complexType><xs:sequence>
<!-- long tedious substructure goes here -->
</xs:sequence></xs:complexType>
</xs:element>
<xs:element name="x-4" minOccurs="0">
<xs:complexType><xs:sequence>
<!-- long tedious substructure goes here -->
</xs:sequence></xs:complexType>
</xs:element>
</xs:sequence></xs:complexType>
</xs:element>
</xs:sequence></xs:complexType>
</xs:element>
</xs:schema>
Looking through w3schools now (https://www.w3schools.blog/xsd-xml-schema-definition-tutorial), the answer is not jumping out at me yet, and it's starting to look like the answer is "No.".

I can copy and paste one (x-1) and just change the number
If each element x-n has the same content then you should use the same tag name for all of these tags. That will require a change in the XML format, so I would also recommend that you stop using this style:
<element name='x-n' ...>
...and start using this instead:
<x index="n">
This will make your life much easier because XML schema expects the tag name to indicate the type of content.
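As a rough sketch of what the schema could look like with that style, the substructure is written once in a named type and the number moves into an attribute. XType and uniqueIndex are made-up names, and the 1 to 4 range on index is an assumption taken from the question's x-1 through x-4:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Shared type for every <x>; XType is a hypothetical name -->
  <xs:complexType name="XType">
    <xs:sequence>
      <!-- long tedious substructure goes here, written once -->
    </xs:sequence>
    <!-- index replaces the number that used to be part of the tag name -->
    <xs:attribute name="index" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:positiveInteger">
          <xs:maxInclusive value="4"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
  <xs:element name="r1">
    <xs:complexType><xs:sequence>
      <xs:element name="w-s">
        <xs:complexType><xs:sequence>
          <xs:element name="x" type="XType" minOccurs="0" maxOccurs="4"/>
        </xs:sequence></xs:complexType>
        <!-- Optional: no two <x> children may share the same index -->
        <xs:unique name="uniqueIndex">
          <xs:selector xpath="x"/>
          <xs:field xpath="@index"/>
        </xs:unique>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>
```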
I understand that you may not be able to change the XML format, but I think it's important to point out that your current style of XML is not best practice.
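If the element names really cannot change, a named complexType at least avoids the copy and paste. A minimal sketch, assuming all four elements share exactly the same content (XType is a hypothetical name, and the maxOccurs="4" on the sequence is left out here because each element is already optional):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Shared type: the substructure is written once (hypothetical name) -->
  <xs:complexType name="XType">
    <xs:sequence>
      <!-- long tedious substructure goes here -->
    </xs:sequence>
  </xs:complexType>
  <xs:element name="r1">
    <xs:complexType><xs:sequence>
      <xs:element name="w-s">
        <xs:complexType><xs:sequence>
          <!-- Names still have to be listed one by one, but each is a single line -->
          <xs:element name="x-1" type="XType" minOccurs="0"/>
          <xs:element name="x-2" type="XType" minOccurs="0"/>
          <xs:element name="x-3" type="XType" minOccurs="0"/>
          <xs:element name="x-4" type="XType" minOccurs="0"/>
        </xs:sequence></xs:complexType>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>
```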

Judging by these stackoverflows, the answer appears to be "Not only no, but also you're wrong for asking" :-p
xml schema list of incremental element name
Can't design xsd schema - because of a variable element name
XML variable tag names
The better solution is to restructure the XSD so that the x-n element has a sub-element indicating the value of n, like so:
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="r1">
<xs:complexType><xs:sequence>
<xs:element name="w-s">
<xs:complexType><xs:sequence maxOccurs="4">
<xs:element name="x" minOccurs="0">
<xs:complexType><xs:sequence>
<xs:element name="x-number" type="xs:integer"/>
<!-- long tedious substructure goes here -->
</xs:sequence></xs:complexType>
</xs:element>
</xs:sequence></xs:complexType>
</xs:element>
</xs:sequence></xs:complexType>
</xs:element>
</xs:schema>

Related

DFDL Schema for parsing delimited text message

I need a little help with DFDL. I need to parse the message below into something like an XML/tree structure. The elements are not fixed; they are dynamic, and sometimes other elements will appear.
The expected XML/tree output is something like this:
<root>
<CLIENT_ID>DESKTOPCLIENT</CLIENT_ID>
<LOCALE>en-US</LOCALE>
<ENCODE/>
</root>
Something like this is a possible solution, tested in Daffodil:
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
<xs:include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd" />
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format
ref="GeneralFormat"
lengthKind="delimited"
/>
</xs:appinfo>
</xs:annotation>
<xs:element name="root" dfdl:initiator="%ESC;" dfdl:terminator="%SUB;">
<xs:complexType>
<xs:sequence dfdl:separator="%CAN;" dfdl:separatorPosition="prefix" dfdl:sequenceKind="unordered">
<xs:element name="CLIENT_ID" type="xs:string" dfdl:initiator="CLIENT_ID%NAK;" />
<xs:element name="LOCALE" type="xs:string" dfdl:initiator="LOCALE%NAK;" />
<xs:element name="ENCODE" type="xs:string" dfdl:initiator="ENCODE%NAK;" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Note that this assumes fixed names for the individual elements, and that they all exist, though the order does not matter. If you know the fixed names, but they may or may not exist, you can add minOccurs="0" to the elements in the unordered sequence.
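For the optional case, the sequence from the schema above might be changed along these lines. This is only a sketch of the minOccurs="0" suggestion and has not been run through Daffodil or IBM DFDL; it also assumes the included general format supplies a suitable occursCountKind:

```xml
<!-- Drop-in replacement for the unordered sequence inside the root element -->
<xs:sequence dfdl:separator="%CAN;" dfdl:separatorPosition="prefix" dfdl:sequenceKind="unordered">
  <!-- minOccurs="0" lets each named field be missing from the message -->
  <xs:element name="CLIENT_ID" type="xs:string" minOccurs="0" dfdl:initiator="CLIENT_ID%NAK;" />
  <xs:element name="LOCALE" type="xs:string" minOccurs="0" dfdl:initiator="LOCALE%NAK;" />
  <xs:element name="ENCODE" type="xs:string" minOccurs="0" dfdl:initiator="ENCODE%NAK;" />
</xs:sequence>
```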
However, DFDL does not allow for dynamic element names, so if you don't know the names, you need a slightly different schema. Instead, you need to describe the data as an unbounded number of name/value pairs, where the name and value are separated by %NAK;, for example:
<xs:element name="root" dfdl:initiator="%ESC;" dfdl:terminator="%SUB;">
<xs:complexType>
<xs:sequence dfdl:separator="%CAN;" dfdl:separatorPosition="prefix">
<xs:element name="element" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence dfdl:separator="%NAK;" dfdl:separatorPosition="infix">
<xs:element name="name" type="xs:string" />
<xs:element name="value" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
This results in an infoset that looks something like this:
<root>
<element>
<name>CLIENT_ID</name>
<value>DESKTOPCLIENT</value>
</element>
<element>
<name>LOCALE</name>
<value>en-US</value>
</element>
<element>
<name>ENCODE</name>
<value></value>
</element>
</root>
If you need the XML tags to match the name fields like in your question, you would then need to transform the infoset. XSLT can do this kind of transformation without much trouble.
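A minimal XSLT 1.0 sketch of that transformation, assuming the infoset has no namespace and that every name value is a legal XML element name:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Keep the root wrapper and process each name/value pair -->
  <xsl:template match="/root">
    <root>
      <xsl:apply-templates select="element"/>
    </root>
  </xsl:template>

  <!-- Turn <element><name>N</name><value>V</value></element> into <N>V</N> -->
  <xsl:template match="element">
    <xsl:element name="{name}">
      <xsl:value-of select="value"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

An empty value simply produces an empty element, so the ENCODE pair comes out as an empty ENCODE tag.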
Edit: There seems to be an issue where IBM DFDL does not like the above solution. I'm not sure why, but it works with Apache Daffodil. Something about value being the empty string causes an issue. After some trial and error, I've found that IBM DFDL (and Apache Daffodil too) are okay with it if you specify that empty value elements should be treated as nil. So changing the value element to this works:
<xs:element name="value" type="xs:string" nillable="true"
dfdl:nilKind="literalValue" dfdl:nilValue="%ES;"
dfdl:useNilForDefault="no"/>
In that case, the infoset ends up with something like this:
<element>
<name>ENCODE</name>
<value xsi:nil="true"></value>
</element>
Edit2: The nillable properties are required because otherwise IBM DFDL treats an empty string value as absent rather than as present with an empty value, and being absent results in the error. Newer versions of the DFDL spec add a property, emptyElementParsePolicy, which controls whether an empty string is treated as absent or as an empty value. Daffodil implements this property as an extension and defaults to the treat-as-empty behavior, whereas IBM DFDL has the treat-as-absent behavior. Setting the property to treat as absent makes Daffodil behave like IBM DFDL.

XSD with no order and selective restriction

I need to validate an XML document that contains elements in random order; some of them must exist, and some of them may appear only once. Some elements can also be nested recursively.
For example, there is a room that should contain one door and any number of boxes and elements. Boxes can contain other boxes and/or elements.
Example XML:
<Room>
<Element />
<Box>
<Box>
<Element />
<Box></Box>
<Element />
</Box>
<Element />
</Box>
<Door />
<Element />
</Room>
This example is very simple, but in my case there are a lot of elements that can be in <Room>. Recursion is not a problem. The problem is making <Door> required while allowing it to appear in any order among siblings that are not required.
UPD: the question is about XSD 1.0 because I use .NET and there is no free library for XSD 1.1.
From what I'm reading, I think you might need to use schema (XSD) indicators.
Check the following link for more information: Schema indicators
Random Order
From your question:
I need to validate an XML that contains element in random order
Possible answer:
Using the all indicator (see link) you can specify that the elements may appear in any order.
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and that each child element must occur only once.
Occurrences
From your question:
some of them must exist and some of them only once
Possible answer:
If I'm understanding it correctly, you want to specify the number of times an element must or may appear. This is called occurrence, and it is covered at the same link. You'll have to set minOccurs and maxOccurs according to your requirements.
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
The "maxOccurs" indicator specifies the maximum number of times an element can occur.
The "minOccurs" indicator specifies the minimum number of times an element can occur.
Everything, including examples, can be found on the XSD/Schema indicators page.
What your XSD (XML schema) will probably look like:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Room" type="Room_T"/>
<xs:complexType name="Room_T">
<xs:all>
<xs:element name="Element" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="Box" type="Box_T" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="Door" type="xs:string" minOccurs="1" maxOccurs="1"/>
</xs:all>
</xs:complexType>
<xs:complexType name="Box_T">
<xs:all>
<xs:element name="Element" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="Box" type="Box_T" minOccurs="0" maxOccurs="1"/>
</xs:all>
</xs:complexType>
</xs:schema>
I didn't check if the code above is valid, but I think it could definitely get you started!
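One caveat for XSD 1.0: children of xs:all may only have maxOccurs of 0 or 1, so a 1.0 processor is likely to reject unbounded particles inside xs:all. A common 1.0 workaround is to put the required element between two repeating choices. The following is a sketch under that assumption, with Filler as a made-up group name:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Room" type="Room_T"/>

  <!-- The optional, repeatable children (hypothetical group name) -->
  <xs:group name="Filler">
    <xs:choice>
      <xs:element name="Element" type="xs:string"/>
      <xs:element name="Box" type="Box_T"/>
    </xs:choice>
  </xs:group>

  <!-- Any number of fillers, exactly one Door, any number of fillers -->
  <xs:complexType name="Room_T">
    <xs:sequence>
      <xs:group ref="Filler" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="Door" type="xs:string"/>
      <xs:group ref="Filler" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Boxes nest recursively and may be empty -->
  <xs:complexType name="Box_T">
    <xs:sequence>
      <xs:group ref="Filler" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

This works well for a single required element; with several required elements in arbitrary order the number of arrangements grows quickly, so it may not scale to every case.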

Why does the validation of keyref depend on the ordering of the key element?

My document contains A elements with IDs and B Elements which reference the As, like this:
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:\\\refissue.xsd">
<A id="x"/>
<A id="y"/>
<B><Aref idref="x" /></B>
</root>
When I validate against my simple schema (see below) I get the following error:
cvc-identity-constraint.4.3: Key 'ref' with value 'x' not found for identity constraint of element 'root'.
If I change the ordering of the A element to
<A id="y"/>
<A id="x"/>
the document validates without any errors.
Why does the validation result depend on the ordering of the elements?
Is this a bug in the validator or in my schema?
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="A">
<xs:complexType>
<xs:attribute name="id" type="xs:ID" />
</xs:complexType>
<xs:key name="A.KEY">
<xs:selector xpath="." />
<xs:field xpath="@id" />
</xs:key>
</xs:element>
<xs:element maxOccurs="unbounded" name="B">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="0" maxOccurs="1" name="Aref">
<xs:complexType>
<xs:attribute name="idref" type="xs:IDREF" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:keyref name="ref" refer="A.KEY">
<xs:selector xpath="B/Aref" />
<xs:field xpath="@idref" />
</xs:keyref>
</xs:element>
</xs:schema>
I tried the validation with Eclipse (which uses xerces, I think), xerces-c 3.1.1, xmlstarlet 1.5.0 and libxml2 2.7.8 and I get the error only with eclipse and xerces.
You're right, validity against an identity constraint should not depend on the order of elements in the input.
Here I think the problem is that the schema is not quite right, and Xerces is having trouble generating a useful diagnosis of the problem. (The fact that libxml doesn't report an error is just a consequence of its incomplete coverage of XSD.)
Your key constraint should be defined on the scope of the element within which the key values need to be unique -- so on the root element, not on the A element. (As defined, your A.KEY constraint requires that the string value of each A element be unique within that A element, which will always be the case. The fact that the id attribute is declared as being of type xs:ID does require uniqueness, of course. And similarly, the fact that the Aref idref attribute is declared as being of type xs:IDREF means that your key and keyref declarations are not actually doing much work here that's not already being done by ID and IDREF.)
Once you move the declaration of A.KEY to the declaration of the root element, Xerces and Saxon agree that the schema is OK and the document is valid.
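For reference, a sketch of the schema with A.KEY moved up to the root element. The only changes from the question are the position of xs:key and its selector, which now picks the A children of root; attributes are selected with @ in the field paths:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" name="A">
          <xs:complexType>
            <xs:attribute name="id" type="xs:ID" />
          </xs:complexType>
        </xs:element>
        <xs:element maxOccurs="unbounded" name="B">
          <xs:complexType>
            <xs:sequence>
              <xs:element minOccurs="0" maxOccurs="1" name="Aref">
                <xs:complexType>
                  <xs:attribute name="idref" type="xs:IDREF" />
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- Key scoped to root: every A under root needs a unique id -->
    <xs:key name="A.KEY">
      <xs:selector xpath="A" />
      <xs:field xpath="@id" />
    </xs:key>
    <!-- Each Aref must point at one of those ids -->
    <xs:keyref name="ref" refer="A.KEY">
      <xs:selector xpath="B/Aref" />
      <xs:field xpath="@idref" />
    </xs:keyref>
  </xs:element>
</xs:schema>
```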
I had a similar problem in Eclipse until the xs:key and the xs:keyref were both explicitly set to the same type. In my case I set both to xs:string. (I was also using xs:unique and a keyref reference to the unique, but it seems to work the same way for key and keyref pairs.)
So for example if the key is based on an element that looks like this:
<xs:complexType name="elementTypeWithKey">
<xs:attribute name="theKey" type="xs:string"/>
</xs:complexType>
and the theKey attribute is explicitly xs:string, make sure that the attribute used as a keyRef is also explicitly xs:string:
<xs:complexType name="elementTypeWithKeyRef">
<xs:attribute name="theKeyRef" type="xs:string"/>
</xs:complexType>

xml schema: restricting occurrences to sibling element sequence cardinality

Given:
<xs:complexType name="SymbolsList" final="">
<xs:sequence>
<xs:element name="symbol" maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="ComboList">
<xs:sequence>
<xs:element name="combo" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="symbol" maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="comboName" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:element name="symbolsList" type="SymbolsList">
<xs:unique name="uniqueSymbol">
<xs:selector xpath="./symbol" />
<xs:field xpath="@name" />
</xs:unique>
</xs:element>
<xs:element name="combosList" type="ComboList">
<xs:unique name="uniqueCombo">
<xs:selector xpath="./combo" />
<xs:field xpath="@comboName" />
</xs:unique>
</xs:element>
I believe this defines a list of symbols and a list of combinations of those symbols.
Each entry in the list of symbols must have a unique name, and each entry in the list of combos must have a unique comboName.
What I'd like to know is whether there is a way to restrict the number of allowed occurrences in the combosList sequence to at least the number of symbols defined in the symbols list.
I guess I'm asking whether a cardinality restriction can be variable and, if so, how to associate its limit with another sequence.
I also want to make it so that the combosList elements (a single combo) can only use names of symbols defined in the symbolsList element.
I think I can pull off that last part. I can't find anything anywhere that talks about constraining the cardinalities of disparate element sequences to be greater than or equal to one another.
Perhaps it's not possible.
XSD requires cardinality constraints to be specified literally in the declaration; the kind of dynamic calculation you have in mind is not in XSD's design space.
In XSD 1.1 you can add an assertion to some common ancestor of symbolsList and combosList that requires
count(combosList/combo) ge count(symbolsList/symbol)
XSD 1.1 is supported by Saxon EE and by Xerces J (in the latter case you have to look for the 1.1 distribution, or did last I looked). (One caveat: Note that Xerces J does not support all of XPath 2.0 in assertions, and I haven't actually checked to see whether this assertion is covered by the minimal subset of XPath XSD requires of conforming 1.1 implementations. Investigate further before sinking a lot of time here.)
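A sketch of where such an assertion could sit, assuming the two lists share a parent and the schema has no target namespace. The wrapper element symbolTable is a made-up name, and the fragment needs an XSD 1.1 processor:

```xml
<!-- Hypothetical common parent of the two lists from the question -->
<xs:element name="symbolTable">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="symbolsList"/>
      <xs:element ref="combosList"/>
    </xs:sequence>
    <!-- XSD 1.1 only: at least as many combos as symbols -->
    <xs:assert test="count(combosList/combo) ge count(symbolsList/symbol)"/>
  </xs:complexType>
</xs:element>
```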

Schema Issue: Can define element type OR add element attribute, but not both. I want both!

I've inherited the task of creating a schema for some XML which already exists - and IMHO is not the best that could have been done. The section giving me problems is the 'spectrum' element at the end of the 'scan-result' element.
The best I'm hoping for with regard to the data in the 'spectrum' element is to treat it as type="xs:string". I'll programmatically divide up the numeric pairs that constitute the data in the string later. (Even though this step would not be needed had the data been properly structured in the first place.)
Here's a similar piece of XML data to what I have to work with...
<scan-result>
<spectrum-index>0</spectrum-index>
<scan-index>2</scan-index>
<time-stamp>5609</time-stamp>
<tic>55510</tic>
<start-mass>22.0</start-mass>
<stop-mass>71.0</stop-mass>
<spectrum count="5">30,11352;31,360;32,16634;45,1161;46,26003</spectrum>
</scan-result>
The problem is, I can't seem to get a working definition for the 'spectrum' element that has the 'count' attribute and allows me to define the 'spectrum' element type as "xs:string".
What I would like is something like the following:
<xs:complexType name="ctypScanResult">
<xs:sequence>
<xs:element name="spectrum-index" type="xs:integer"/>
<xs:element name="scan-index" type="xs:integer"/>
<xs:element name="time-stamp" type="xs:integer"/>
<xs:element name="tic" type="xs:integer"/>
<xs:element name="start-mass" type="xs:float"/>
<xs:element name="stop-mass" type="xs:float"/>
<xs:element name="spectrum" type="xs:string">
<xs:complexType>
<xs:attribute name="count" type="xs:integer"/>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="count" type="xs:integer"/>
</xs:complexType>
The problem is that I can define the type of the 'spectrum' element as "xs:string" XOR I can define the anonymous 'xs:complexType' in the 'spectrum' element, which allows me to insert the 'count' attribute. But I need to be able to express both.
Given that I'm kind of stuck with the XML as it was handed to me, is there a schema definition that will allow me to describe this data?
Sorry this is long, but thanks to any and all who respond,
AlarmTripper
Followup: I know why the error occurs...
Quoted from W3C:
3.3.3 Constraints on XML Representations of Element Declarations
Schema Representation Constraint: Element Declaration Representation OK
In addition to the conditions imposed on element information items by the schema for schemas: all of the following must be true:
1 default and fixed must not both be present.
2 If the item's parent is not <schema>, then all of the following must be true:
2.1 One of ref or name must be present, but not both.
2.2 If ref is present, then all of <complexType>, <simpleType>, <key>, <keyref>, <unique>, nillable, default, fixed, form, block and type must be absent, i.e. only minOccurs, maxOccurs, id are allowed in addition to ref, along with <annotation>.
3 type and either <simpleType> or <complexType> are mutually exclusive.
4 The corresponding particle and/or element declarations must satisfy the conditions set out in Constraints on Element Declaration Schema Components (§3.3.6) and Constraints on Particle Schema Components (§3.9.6).
But I'm still in the same fix I was before... How can I actually accomplish something that resembles my goal?
Thanks,
AlarmTripper
Let a tool do it for you! Try xsd.exe.
Or, if you must define by hand, at least check your hand-written-definition with an automatically generated one.
Here's what XSD.exe gave me for your input. I trimmed out some MS-NS cruft.
<xs:element name="spectrum">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="count" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
You need to set the attribute mixed="true" on complexType:
<xs:element name="spectrum">
<xs:complexType mixed="true">
<xs:attribute name="count" type="xs:integer" />
</xs:complexType>
</xs:element>
EDIT: Okay, just read your comment, sorry. I believe the following should work instead:
<xs:element name="spectrum">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="count" type="xs:integer" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
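To show how this fits the rest of the type from the question, here is a sketch with the extension pulled out into a named type (ctypSpectrum is a made-up name). The element no longer carries an inline complexType, since type and an inline complexType are mutually exclusive, and the outer count attribute is left out because the sample scan-result has none:

```xml
<!-- String content plus a count attribute (hypothetical type name) -->
<xs:complexType name="ctypSpectrum">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="count" type="xs:integer"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:complexType name="ctypScanResult">
  <xs:sequence>
    <xs:element name="spectrum-index" type="xs:integer"/>
    <xs:element name="scan-index" type="xs:integer"/>
    <xs:element name="time-stamp" type="xs:integer"/>
    <xs:element name="tic" type="xs:integer"/>
    <xs:element name="start-mass" type="xs:float"/>
    <xs:element name="stop-mass" type="xs:float"/>
    <!-- Refer to the named type instead of mixing type= with an inline complexType -->
    <xs:element name="spectrum" type="ctypSpectrum"/>
  </xs:sequence>
</xs:complexType>
```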
<xs:element name="spectrum" type="xs:string">
<xs:complexType>
<!-- ADD THIS NEXT LINE -->
<xs:complexContent mixed="true"/>
<xs:attribute name="count" type="xs:integer"/>
</xs:complexType>
</xs:element>
