XSD: Divide schema using a choice of sequences

A part of my xsd looks as follows:
<xs:element name="my_element" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:choice>
<xs:sequence>
<xs:element name="sequence_1" type="xs:string"/>
<xs:element name="ID1" type="xs:string"/>
<xs:element name="TYPE1" type="xs:string"/>
</xs:sequence>
<xs:sequence>
<xs:element name="sequence_2" type="xs:string"/>
<xs:element name="ID2" type="xs:string"/>
<xs:element name="TYPE2" type="xs:string"/>
</xs:sequence>
</xs:choice>
</xs:complexType>
</xs:element>
The first element of each sequence determines which nodes follow.
With many different sequences, each containing several elements, my XSD becomes hard to read.
Is it possible to define the sequences separately (as I can for a complexType)?

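You can use xs:group. Define each sequence as a named group, then reference the groups (for example from inside your xs:choice) with xs:group ref: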
<xs:group name="seqGroup_x">
<xs:sequence>
<xs:element name="sequence_x" type="xs:string"/>
<xs:element name="ID" type="xs:string"/>
...
</xs:sequence>
</xs:group>
<xs:complexType name="yourType">
<xs:group ref="seqGroup_x"/>
<xs:attribute name="anotherattr" type="xs:string"/>
</xs:complexType>

Related

Using XSD in PySpark

I am building a data warehouse in Azure Synapse where one of the sources is a set of about 20 different types of XML files (each with its own XSD schema) plus 1 base schema.
What I am looking for is a way to get all XML elements and store them in files (1 per type) in my data lake. For that I need unique names per element, for example the whole path as a name. I tried to define dicts per type with all element names, but this is quite some work. To automate this (the XSDs are updated yearly), I tried to code it in Excel and VBA, but the XSDs are quite complex, with nested complex types etc.
Below is a snippet of the baseschema.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://www.website.org/typ/1/baseschema/schema" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:iwmo="http://www.website.org/typ/1/baseschema/schema">
<xs:complexType name="Complex_Address">
...
<xs:sequence>
<xs:element name="Home" type="iwmo:Complex_House" minOccurs="0">
...
</xs:element>
<xs:element name="Postalcode" type="iwmo:Simple_Postalcode" minOccurs="0">
...
</xs:element>
<xs:element name="Streetname" type="iwmo:Simple_Streetname" minOccurs="0">
...
</xs:element>
<xs:element name="Areaname" type="iwmo:Simple_Areaname" minOccurs="0">
...
</xs:element>
<xs:element name="CountryCode" type="iwmo:Simple_CountryCode" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_House">
...
<xs:sequence>
<xs:element name="Housenumber" type="iwmo:Simple_Housenumber">
...
</xs:element>
<xs:element name="Houseletter" type="iwmo:Simple_Houseletter" minOccurs="0">
...
</xs:element>
<xs:element name="HousenumberAddition" type="iwmo:Simple_HousenumberAddition" minOccurs="0">
...
</xs:element>
<xs:element name="IndicationAddress" type="iwmo:Simple_IndicationAddress" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_MessageIdentification">
...
<xs:sequence>
<xs:element name="Identification" type="iwmo:Simple_IdentificationMessage">
...
</xs:element>
<xs:element name="Date" type="iwmo:Simple_Date">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_Product">
...
<xs:sequence>
<xs:element name="Categorie" type="iwmo:Simple_ProductCategory">
...
</xs:element>
<xs:element name="Code" type="iwmo:Simple_ProductCode" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_XsdVersion">
<xs:sequence>
<xs:element name="BaseschemaXsdVersion" type="iwmo:Simple_Version">
</xs:element>
<xs:element name="MessageXsdVersion" type="iwmo:Simple_Version">
</xs:element>
</xs:sequence>
</xs:complexType>
And here is a snippet of the XSD of one of the message types:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:typ="http://www.website.org/typ/1/baseschema/schema" xmlns:type1="http://www.website.org/typ/1/type1/schema" targetNamespace="http://www.website.org/typ/1/type1/schema" elementFormDefault="qualified">
<xs:import namespace="http://www.website.org/typ/1/baseschema/schema" schemaLocation="baseschema.xsd"></xs:import>
<xs:element name="Message" type="type1:Root"></xs:element>
<xs:complexType name="Root">
...
<xs:sequence>
<xs:element name="Header" type="type1:Header"></xs:element>
<xs:element name="Client" type="type1:Client"></xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Header">
<xs:sequence>
<xs:element name="Person" type="typ:Simple_SpecialCode">
...
</xs:element>
<xs:element name="MessageIdentification" type="typ:Complex_MessageIdentification">
...
</xs:element>
<xs:element name="XsdVersion" type="typ:Complex_XsdVersion">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Client">
...
<xs:sequence>
<xs:element name="AssignedProducts" type="type1:AssignedProducts"></xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="AssignedProducts">
<xs:sequence>
<xs:element name="AssignedProduct" type="type1:AssignedProduct" maxOccurs="unbounded"></xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="AssignedProduct">
...
<xs:sequence>
<xs:element name="ToewijzingNummer" type="typ:Simple_Nummer">
...
</xs:element>
<xs:element name="Product" type="typ:Complex_Product" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:schema>
Then this would be the desired output:
Header_Person
Header_MessageIdentification_Identification
Header_MessageIdentification_Date
Header_XsdVersion_BaseschemaXsdVersion
Header_XsdVersion_MessageXsdVersion
Client_AssignedProduct_ToewijzingNummer
Client_AssignedProduct_Product_Category
Client_AssignedProduct_Product_Code
In the baseschema I also added a nested complex type, to show the complexity.
Is there a Python package or something similar that can help me achieve this? A tool that simply writes this list of elements to a text file would also be great; I could then easily copy it into a variable.
I'm not sure whether my requirements are clear or whether this is posted in the right group with the right tags, but I hope someone can point me to a good solution.
Ronald
I found a workaround after all: I put all the fields from the XSDs into variables. It's not ideal, but any other way seemed too complex.
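For schemas shaped like the two snippets above, the list of path names can also be generated with the Python standard library alone. The following is a minimal sketch, not a full XSD processor: it assumes that every element is declared with both a name and a type attribute, that complex types are named and global, and that any type which is not a known complex type counts as a leaf. The function names (load_schema, index_types, resolve, leaf_paths, flatten) and the urn:example:* namespaces are made up for illustration. Unlike the desired output above, it keeps every wrapper level in the path.

```python
import io
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def load_schema(xsd_text):
    """Parse one XSD and collect its prefix -> namespace URI map."""
    prefixes, root = {}, None
    source = io.BytesIO(xsd_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # the first "start" event is the xs:schema element
    return root, prefixes


def index_types(schemas):
    """Map (target namespace, type name) -> (complexType element, prefix map)."""
    types = {}
    for root, prefixes in schemas:
        tns = root.get("targetNamespace")
        for ctype in root.findall(XS + "complexType"):
            types[(tns, ctype.get("name"))] = (ctype, prefixes)
    return types


def resolve(type_ref, prefixes):
    """Turn a QName such as 'typ:Complex_Product' into (namespace, local name)."""
    prefix, _, local = type_ref.rpartition(":")
    return prefixes.get(prefix), local


def leaf_paths(types, type_key, path=(), seen=frozenset()):
    """Yield underscore-joined element paths ending at non-complex types."""
    entry = types.get(type_key)
    if entry is None or type_key in seen:  # leaf, or a recursive type: stop here
        yield "_".join(path)
        return
    ctype, prefixes = entry
    for el in ctype.iter(XS + "element"):
        name, type_ref = el.get("name"), el.get("type")
        if name and type_ref:
            yield from leaf_paths(types, resolve(type_ref, prefixes),
                                  path + (name,), seen | {type_key})


def flatten(message_xsd, *imported_xsds):
    """List leaf paths below the first global element of message_xsd."""
    schemas = [load_schema(text) for text in (message_xsd, *imported_xsds)]
    root, prefixes = schemas[0]
    top = root.find(XS + "element")
    return list(leaf_paths(index_types(schemas), resolve(top.get("type"), prefixes)))


# Shortened stand-ins for the two schemas in the question (hypothetical namespaces).
BASE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:example:base">
  <xs:complexType name="Complex_Product">
    <xs:sequence>
      <xs:element name="Categorie" type="xs:string"/>
      <xs:element name="Code" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

MESSAGE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:typ="urn:example:base" xmlns:type1="urn:example:type1"
    targetNamespace="urn:example:type1">
  <xs:element name="Message" type="type1:Root"/>
  <xs:complexType name="Root">
    <xs:sequence>
      <xs:element name="Header" type="type1:Header"/>
      <xs:element name="Client" type="type1:Client"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Header">
    <xs:sequence>
      <xs:element name="Person" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Client">
    <xs:sequence>
      <xs:element name="Product" type="typ:Complex_Product" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

print("\n".join(flatten(MESSAGE_XSD, BASE_XSD)))
```

The resulting list can be written to a text file or used directly as column names. Elements declared with ref, anonymous inline types, groups, and type derivation are not handled by this sketch and would need extra cases.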

XSD Required Elements with specific child elements (Multiple Definitions with different types)

I have an XML document, which I don't control, for which I need to create an XSD to validate against. The document has multiple transaction types, some of which must occur a specific number of times and some of which need not. The parent element is simply <Transaction>; its child element can be either a <ControlTransaction> or a <RetailTransaction>. The issue is that I need to require one <Transaction> containing a <ControlTransaction> whose <ReasonCode> has the value "Register Open", and another whose <ReasonCode> has the value "Register Close", as follows:
<?xml version="1.0" encoding="UTF-8"?>
<RegisterDay xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:cp="urn:register">
<Transaction>
<SequenceNumber>1</SequenceNumber>
<ControlTransaction>
<ReasonCode>Register Open</ReasonCode>
</ControlTransaction>
</Transaction>
<Transaction>
<SequenceNumber>2</SequenceNumber>
<RetailTransaction>
...stuff..
<Total>9.99</Total>
</RetailTransaction>
</Transaction>
<Transaction>
<SequenceNumber>3</SequenceNumber>
<ControlTransaction>
<ReasonCode>Register Close</ReasonCode>
</ControlTransaction>
</Transaction>
</RegisterDay>
My best attempt uses named types in my schema, but I get the error "Elements with the same name and same scope must have the same type". I don't know how to get around this.
<?xml version="1.0"?>
<xs:schema
xmlns:cp="urn:register"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
attributeFormDefault="unqualified"
elementFormDefault="qualified">
<xs:element name="RegisterDay">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="1" name="Transaction" type="TransactionRegisterOpen_type"/>
<xs:element minOccurs="1" maxOccurs="unbounded" name="Transaction" type="RetailTransaction_type"/>
<xs:element minOccurs="1" maxOccurs="1" name="Transaction" type="TransactionRegisterClose_type"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:simpleType name="RegisterOpen_type">
<xs:restriction base="xs:string">
<xs:pattern value="Register Open"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="RegisterClose_type">
<xs:restriction base="xs:string">
<xs:pattern value="Register Close"/>
</xs:restriction>
</xs:simpleType>
<xs:complexType name="TransactionRegisterOpen_type">
<xs:sequence>
<xs:element name="SequenceNumber" type="xs:unsignedShort"/>
<xs:element name="ControlTransaction">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" name="ReasonCode" type="RegisterOpen_type"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="TransactionRegisterClose_type">
<xs:sequence>
<xs:element name="SequenceNumber" type="xs:unsignedShort"/>
<xs:element name="ControlTransaction">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" name="ReasonCode" type="RegisterClose_type"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="RetailTransaction_type">
<xs:sequence>
<xs:element name="SequenceNumber" type="xs:unsignedShort"/>
<xs:element name="RetailTransaction">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" name="Total" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:schema>
Has anyone run into this and/or have any suggestions? I'm pretty much stumped.
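Perhaps with an enumeration? This restricts ReasonCode to the two allowed values, although it does not by itself require that each value occurs: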
<?xml version="1.0"?>
<xs:schema
xmlns:cp="urn:register"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
attributeFormDefault="unqualified"
elementFormDefault="qualified"
targetNamespace="urn:register">
<xs:element name="RegisterDay">
<xs:complexType>
<xs:sequence>
<xs:element
minOccurs="1"
maxOccurs="unbounded"
name="Transaction"
type="cp:TypeTransaction"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="TypeTransaction">
<xs:sequence>
<xs:element name="SequenceNumber" type="xs:unsignedShort"/>
<xs:choice>
<xs:element name="RetailTransaction"/>
<xs:element name="ControlTransaction">
<xs:complexType>
<xs:sequence>
<xs:element name="ReasonCode">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Register Open"/>
<xs:enumeration value="Register Close"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:sequence>
</xs:complexType>
</xs:schema>
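Because an enumeration only constrains which values are allowed, the requirement that a "Register Open" and a "Register Close" transaction are both present still has to be checked somewhere. One option is a small check that runs after schema validation. The sketch below uses only the Python standard library and assumes the un-namespaced element names of the sample document in the question; the rule "each code exactly once" and the function names reason_codes and has_open_and_close are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

REQUIRED_CODES = ("Register Open", "Register Close")


def reason_codes(xml_text):
    """Return the ReasonCode values of all control transactions, in order."""
    root = ET.fromstring(xml_text)
    found = root.findall("./Transaction/ControlTransaction/ReasonCode")
    return [(el.text or "").strip() for el in found]


def has_open_and_close(xml_text):
    """True if each required reason code occurs exactly once (an assumed rule)."""
    codes = reason_codes(xml_text)
    return all(codes.count(code) == 1 for code in REQUIRED_CODES)


# Shortened version of the sample document from the question.
SAMPLE = """<RegisterDay>
  <Transaction>
    <SequenceNumber>1</SequenceNumber>
    <ControlTransaction><ReasonCode>Register Open</ReasonCode></ControlTransaction>
  </Transaction>
  <Transaction>
    <SequenceNumber>2</SequenceNumber>
    <RetailTransaction><Total>9.99</Total></RetailTransaction>
  </Transaction>
  <Transaction>
    <SequenceNumber>3</SequenceNumber>
    <ControlTransaction><ReasonCode>Register Close</ReasonCode></ControlTransaction>
  </Transaction>
</RegisterDay>"""

print(has_open_and_close(SAMPLE))  # True
```

If the instance document uses a target namespace, the element names in the findall path would need to be qualified accordingly.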

How do you make an XSD element required or not required depending on the context?

We have a definition of a Person element where we want different child elements to be required depending on the operation. For example, adding a Person requires different elements than updating a Person. In the example below, the Person type is currently duplicated, which of course is wrong. Is there a good way of representing this in the XSD so that we can reuse the Person type?
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="Person">
<xs:annotation>
<xs:documentation>This is the definition when changing a person</xs:documentation>
</xs:annotation>
<xs:sequence>
<xs:element name="PartyName" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="GenderCode" type="GenderCode_Type" minOccurs="0" maxOccurs="1"/>
<xs:element name="BirthDate" type="xs:date" minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Person">
<xs:annotation>
<xs:documentation>This is the definition when adding a person</xs:documentation>
</xs:annotation>
<xs:sequence>
<xs:element name="PartyName" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="GenderCode" type="GenderCode_Type" minOccurs="1" maxOccurs="1"/>
<xs:element name="BirthDate" type="xs:date" minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
The simplest way to have two different types for Person elements is to use local declarations of Person in the two different contexts you have in mind. For example, you might say:
<xs:element name="Add">
<xs:complexType>
<xs:sequence>
<xs:element name="Person" type="AddPerson"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Update">
<xs:complexType>
<xs:sequence>
<xs:element name="Person" type="ChangePerson"/>
</xs:sequence>
</xs:complexType>
</xs:element>
This example assumes that you have redefined your two complex types as AddPerson and ChangePerson.
If additionally you want to have the two complex types be explicitly related, you can derive them both by restriction from a generic Person type.
<xs:complexType name="Person">
<xs:annotation>
<xs:documentation>This is the generic
definition for persons</xs:documentation>
</xs:annotation>
<xs:sequence>
<xs:element name="PartyName" type="xs:string"
minOccurs="0" maxOccurs="1"/>
<xs:element name="GenderCode" type="GenderCode_Type"
minOccurs="0" maxOccurs="1"/>
<xs:element name="BirthDate" type="xs:date"
minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="ChangePerson">
<xs:annotation>
<xs:documentation>This is the definition
when changing a person</xs:documentation>
</xs:annotation>
<xs:complexContent>
<xs:restriction base="Person">
<xs:sequence>
<xs:element name="PartyName" type="xs:string"
minOccurs="0" maxOccurs="1"/>
<xs:element name="GenderCode" type="GenderCode_Type"
minOccurs="0" maxOccurs="1"/>
<xs:element name="BirthDate" type="xs:date"
minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="AddPerson">
<xs:annotation>
<xs:documentation>This is the definition
when adding a person</xs:documentation>
</xs:annotation>
<xs:complexContent>
<xs:restriction base="Person">
<xs:sequence>
<xs:element name="PartyName" type="xs:string"
minOccurs="1" maxOccurs="1"/>
<xs:element name="GenderCode" type="GenderCode_Type"
minOccurs="1" maxOccurs="1"/>
<xs:element name="BirthDate" type="xs:date"
minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
Here, the generic type Person is identical to the ChangePerson type; I've defined ChangePerson using a vacuous restriction just for the symmetry of deriving both of the operation-specific types from the generic type.
Whether having such an explicit relation among your types actually helps you achieve your goals will depend, of course, in part on what use your system makes of your schema type definitions.

Difference between group and sequence in XML Schema?

What is the difference between an xs:group and an xs:sequence in XML Schema? When would you use one or the other?
xs:sequence, together with xs:choice and xs:all, is used to define the valid sequences of XML elements in the target XML. For example, the schema for this XML:
<mainElement>
<firstSubElement/>
<subElementA/>
<subElementB/>
</mainElement>
is something like:
<xs:element name='mainElement'>
<xs:complexType>
<xs:sequence>
<xs:element name="firstSubElement"/>
<xs:element name="subElementA"/>
<xs:element name="subElementB"/>
</xs:sequence>
</xs:complexType>
</xs:element>
xs:group is used to define a named group of XML elements, following certain rules, that can then be referenced in different parts of the schema. For example, if the XML is:
<root>
<mainElementA>
<firstSubElement/>
<subElementA/>
<subElementB/>
</mainElementA>
<mainElementB>
<otherSubElement/>
<subElementA/>
<subElementB/>
</mainElementB>
</root>
you can define a group for the common sub-elements:
<xs:group name="subElements">
<xs:sequence>
<xs:element name="subElementA"/>
<xs:element name="subElementB"/>
</xs:sequence>
</xs:group>
and then use it:
<xs:element name="mainElementA">
<xs:complexType>
<xs:sequence>
<xs:element name="firstSubElement"/>
<xs:group ref="subElements"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="mainElementB">
<xs:complexType>
<xs:sequence>
<xs:element name="otherSubElement"/>
<xs:group ref="subElements"/>
</xs:sequence>
</xs:complexType>
</xs:element>

XSD schema - Either one or both

Is it possible to make a choice scenario like (A or B or both)? If yes, how can this be done with the following elements?
<xs:element name="a" type="typeA" />
<xs:element name="b" type="typeB" />
Hope you can help.
Regards,
Nima
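See the related question "XSD 'one or both' choice construct leads to ambiguous content model". A deterministic way to write it is a choice between the sequence (a, optional b) and b alone: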
<xs:schema xmlns:xs="...">
<xs:element name="a" type="typeA" />
<xs:element name="b" type="typeB" />
<xs:element name="...">
<xs:complexType>
<xs:sequence>
<xs:choice>
<xs:sequence>
<xs:element ref="a"/>
<xs:element ref="b" minOccurs="0"/>
</xs:sequence>
<xs:element ref="b"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
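Read as a regular expression over the child element names, the content model above is (a b?) | b, which accepts exactly three cases: a alone, a followed by b, and b alone. A short Python sketch that enumerates the cases (the helper name accepts is made up, and the single-letter element names are taken from the question):

```python
import re

# choice( sequence(a, b?), b ) written as a regular expression over child names
CONTENT_MODEL = re.compile(r"ab?|b")


def accepts(children):
    """True if the list of child element names matches the content model."""
    return CONTENT_MODEL.fullmatch("".join(children)) is not None


for children in ([], ["a"], ["b"], ["a", "b"], ["b", "a"], ["a", "a"]):
    print(children, accepts(children))
```

Order still matters in this model: b followed by a is rejected, because the inner xs:sequence fixes a before b.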
