Problem with xml schema elements hierarchy - xsd

What's wrong with this XML schema? It doesn't parse correctly, and I can't express the hierarchy cluster (element) -> host (element) -> Load (element).
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="cluster">
<xs:complexType>
<xs:sequence>
<xs:element ref="host"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="host">
<xs:complexType>
<xs:element ref="Load"/>
<xs:attribute name="name" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
<xs:element name="Load">
<xs:complexType>
<xs:attribute name="usedPhisicalMemory" type="xs:integer"/>
</xs:complexType>
</xs:element>
</xs:schema>
Thank you, Emilio

To allow something like this (I corrected the typo in "usedPhysicalMemory"):
<cluster>
<host name="foo">
<Load usedPhysicalMemory="500" />
</host>
<host name="bar">
<Load usedPhysicalMemory="500" />
</host>
</cluster>
This schema would do it:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="cluster">
<xs:complexType>
<xs:sequence>
<xs:element ref="host" maxOccurs="unbounded" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="host">
<xs:complexType>
<xs:sequence>
<xs:element ref="Load" />
</xs:sequence>
<xs:attribute name="name" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
<xs:element name="Load">
<xs:complexType>
<xs:attribute name="usedPhysicalMemory" type="xs:integer" />
</xs:complexType>
</xs:element>
</xs:schema>
From the MSDN on <xs:complexType> (because the spec makes my brain hurt):
If group, sequence, choice, or all is specified, the elements must
appear in the following order:
group | sequence | choice | all
attribute | attributeGroup
anyAttribute
Maybe someone else can point out the relevant section in the spec.

In the host element, the Load element cannot be a direct child of xs:complexType; you need a compositor such as xs:sequence, xs:choice, or xs:all in between.
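The rule above can be spotted mechanically. Below is a minimal sketch in Python (standard library only) that lists every xs:element sitting directly under xs:complexType. The function name misplaced_elements and the trimmed BROKEN and FIXED strings are illustrative and not part of the original posts, and this is a single lint check, not schema validation.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# The host declaration from the question: xs:element directly under xs:complexType.
BROKEN = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="host">
    <xs:complexType>
      <xs:element ref="Load"/>
      <xs:attribute name="name" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# The same declaration with the element wrapped in xs:sequence.
FIXED = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="host">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Load"/>
      </xs:sequence>
      <xs:attribute name="name" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def misplaced_elements(xsd_text):
    """Return the ref/name of every xs:element that is a direct child of xs:complexType."""
    root = ET.fromstring(xsd_text)
    found = []
    for complex_type in root.iter(XS + "complexType"):
        for child in complex_type:
            if child.tag == XS + "element":
                found.append(child.get("ref") or child.get("name"))
    return found


print(misplaced_elements(BROKEN))
print(misplaced_elements(FIXED))
```

The first call reports the Load reference; the second reports nothing because the element now lives inside a compositor.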

Using XSD in PySpark

I am building a data warehouse in Azure Synapse where one of the sources is a set of about 20 different types of XML files (each with its own XSD schema) plus 1 base schema.
What I am looking for is to get all XML elements and store them in files (1 per type) in my data lake. For that I need unique names per element, for example the whole path as a name. I tried to define dicts per type with all element names, but this is a lot of work. To automate this (the XSDs are updated yearly), I tried to code this out in Excel and VBA, but the XSDs are quite complex, with nested complex types etc.
Below is a snippet of the baseschema.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://www.website.org/typ/1/baseschema/schema" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:iwmo="http://www.website.org/typ/1/baseschema/schema">
<xs:complexType name="Complex_Address">
...
<xs:sequence>
<xs:element name="Home" type="iwmo:Complex_House" minOccurs="0">
...
</xs:element>
<xs:element name="Postalcode" type="iwmo:Simple_Postalcode" minOccurs="0">
...
</xs:element>
<xs:element name="Streetname" type="iwmo:Simple_Streetname" minOccurs="0">
...
</xs:element>
<xs:element name="Areaname" type="iwmo:Simple_Areaname" minOccurs="0">
...
</xs:element>
<xs:element name="CountryCode" type="iwmo:Simple_CountryCode" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_House">
...
<xs:sequence>
<xs:element name="Housenumber" type="iwmo:Simple_Housenumber">
...
</xs:element>
<xs:element name="Houseletter" type="iwmo:Simple_Houseletter" minOccurs="0">
...
</xs:element>
<xs:element name="HousenumberAddition" type="iwmo:Simple_HousenumberAddition" minOccurs="0">
...
</xs:element>
<xs:element name="IndicationAddress" type="iwmo:Simple_IndicationAddress" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_MessageIdentification">
...
<xs:sequence>
<xs:element name="Identification" type="iwmo:Simple_IdentificationMessage">
...
</xs:element>
<xs:element name="Date" type="iwmo:Simple_Date">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_Product">
...
<xs:sequence>
<xs:element name="Categorie" type="iwmo:Simple_ProductCategory">
...
</xs:element>
<xs:element name="Code" type="iwmo:Simple_ProductCode" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Complex_XsdVersion">
<xs:sequence>
<xs:element name="BaseschemaXsdVersion" type="iwmo:Simple_Version">
</xs:element>
<xs:element name="MessageXsdVersion" type="iwmo:Simple_Version">
</xs:element>
</xs:sequence>
</xs:complexType>
And here is a snippet of the XSD of one of the message types:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:typ="http://www.website.org/typ/1/baseschema/schema" xmlns:type1="http://www.website.org/typ/1/type1/schema" targetNamespace="http://www.website.org/typ/1/type1/schema" elementFormDefault="qualified">
<xs:import namespace="http://www.website.org/typ/1/baseschema/schema" schemaLocation="baseschema.xsd"></xs:import>
<xs:element name="Message" type="type1:Root"></xs:element>
<xs:complexType name="Root">
...
<xs:sequence>
<xs:element name="Header" type="type1:Header"></xs:element>
<xs:element name="Client" type="type1:Client"></xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Header">
<xs:sequence>
<xs:element name="Person" type="typ:Simple_SpecialCode">
...
</xs:element>
<xs:element name="MessageIdentification" type="typ:Complex_MessageIdentification">
...
</xs:element>
<xs:element name="XsdVersion" type="typ:Complex_XsdVersion">
...
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="Client">
...
<xs:sequence>
<xs:element name="AssignedProducts" type="type1:AssignedProducts"></xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="AssignedProducts">
<xs:sequence>
<xs:element name="AssignedProduct" type="type1:AssignedProduct" maxOccurs="unbounded"></xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="AssignedProduct">
...
<xs:sequence>
<xs:element name="ToewijzingNummer" type="typ:Simple_Nummer">
...
</xs:element>
<xs:element name="Product" type="typ:Complex_Product" minOccurs="0">
...
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:schema>
Then this would be the desired output:
Header_Person
Header_MessageIdentification_Identification
Header_MessageIdentification_Date
Header_XsdVersion_BaseschemaXsdVersion
Header_XsdVersion_MessageXsdVersion
Client_AssignedProduct_ToewijzingNummer
Client_AssignedProduct_Product_Category
Client_AssignedProduct_Product_Code
In the baseschema I also added a nested complex type, to show the complexity.
Is there a package in Python that can help me achieve this? A tool that just writes this list of elements to a text file would also be great; I could then easily copy that into a variable.
I'm not sure whether I'm clear about my requirements, or whether this is posted in the correct group with the correct tags, but I hope someone can point me to a good solution.
Ronald
I found a workaround after all where I put all fields from the xsds in variables. It's not ideal, but any other way would be too complex.
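For anyone with the same problem, here is a minimal sketch of one way to generate such names with only the Python standard library. It rests on several assumptions that are not from the original post: every element carries name and type attributes (no ref, no anonymous inline types), type names are unique across the schemas once the prefix is stripped, and no type is recursive. BASE and MESSAGE are trimmed-down stand-ins for the snippets above, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Trimmed-down stand-ins for baseschema.xsd and one message XSD (illustrative only).
BASE = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:iwmo="http://www.website.org/typ/1/baseschema/schema"
           targetNamespace="http://www.website.org/typ/1/baseschema/schema">
  <xs:complexType name="Complex_MessageIdentification">
    <xs:sequence>
      <xs:element name="Identification" type="iwmo:Simple_IdentificationMessage"/>
      <xs:element name="Date" type="iwmo:Simple_Date"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

MESSAGE = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:typ="http://www.website.org/typ/1/baseschema/schema"
           xmlns:type1="http://www.website.org/typ/1/type1/schema"
           targetNamespace="http://www.website.org/typ/1/type1/schema">
  <xs:element name="Message" type="type1:Root"/>
  <xs:complexType name="Root">
    <xs:sequence>
      <xs:element name="Header" type="type1:Header"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Header">
    <xs:sequence>
      <xs:element name="Person" type="typ:Simple_SpecialCode"/>
      <xs:element name="MessageIdentification" type="typ:Complex_MessageIdentification"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def local(qname):
    """Strip a namespace prefix: 'typ:Complex_Product' -> 'Complex_Product'."""
    return qname.split(":")[-1]


def collect_complex_types(*xsd_texts):
    """Map local type name -> xs:complexType element across all given schemas."""
    types = {}
    for text in xsd_texts:
        for complex_type in ET.fromstring(text).findall(XS + "complexType"):
            types[complex_type.get("name")] = complex_type
    return types


def leaf_paths(type_name, types, prefix=(), sep="_"):
    """Yield one joined path per leaf element reachable from a named complex type."""
    for element in types[type_name].iter(XS + "element"):
        path = prefix + (element.get("name"),)
        child_type = local(element.get("type", ""))
        if child_type in types:
            # Element of a named complex type: descend and extend the path.
            yield from leaf_paths(child_type, types, path, sep)
        else:
            # Simple or unknown type: treat the element as a leaf.
            yield sep.join(path)


types = collect_complex_types(BASE, MESSAGE)
for name in leaf_paths("Root", types):
    print(name)
```

Note that this emits every level of the path, so with the full message schema a wrapper such as AssignedProducts would appear in the names, unlike in the desired output listed in the question. The printed list can be redirected to a text file or kept in a variable directly.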

Using xs:anyType instead of xsi:type gives JAXB marshalling errors

I want to avoid xsi:type in an element and add the child elements at runtime based on some condition.
I have an option element of type xs:anyType, defined as follows:
<xs:complexType name="prod">
<xs:sequence>
<xs:element type="xs:anyType" name="option" minOccurs="0" maxOccurs="1"></xs:element>
</xs:sequence>
</xs:complexType>
It is used in the mappings element as follows:
<xs:element name="mappings">
<xs:complexType>
<xs:sequence>
<xs:element type="prod" name="productionSystem" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The option element needs to hold one of the following types:
<xs:complexType name="Text">
<xs:sequence>
<xs:element type="xs:short" name="id" />
<xs:element type="xs:string" name="name" />
</xs:sequence>
</xs:complexType>
<xs:complexType name="Value">
<xs:sequence>
<xs:element type="xs:short" name="id" />
<xs:element type="xs:string" name="name" />
<xs:element name="psValue" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="value" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
So basically the option element can be of type either Text or Value.
I want to avoid xsi:type in the XML, hence it is defined as xs:anyType. However, at run time JAXB marshalling fails with the error
"any of its super class is known to this context". How can I ensure Text and Value are in the JAXB context?
Can someone please advise?
Thanks,
Anjana

xsd error in the following

Here is the XML:
<?xml version="1.0" encoding="utf-8"?>
<Modules xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="XSDQu3.xsd">
<Module code="CSE1246">
<Name shortName="ADSA">Applied Data Structures and Algorithms</Name>
<Level>1</Level>
<ResourcePerson>
<Name>Anwar</Name>
<Surname>Chutoo</Surname>
</ResourcePerson>
</Module>
<Module code="CSE2041">
<Name shortName="Web 2">Web Technologies II</Name>
<Level>2</Level>
<ResourcePerson>
<FullName>Shehzad Jaunbuccus</FullName>
</ResourcePerson>
</Module>
</Modules>
I'm getting an error at Name. A ResourcePerson can contain either FullName, or Name and Surname. Please help. Am I doing this part correctly?
Here is the XSD:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="NameNSurnameType">
<xs:sequence>
<xs:element name="Name" type="xs:string"/>
<xs:element name="Surname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="ResourcePersonType">
<xs:sequence>
<xs:choice>
<xs:element name="NameNSurnameType" type="NameNSurnameType"/>
<xs:element name="FullName" type="xs:string"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
<xs:attribute name="code">
<xs:simpleType>
<xs:restriction base="xs:ID">
<xs:pattern value="CSE(\d{4})"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
<xs:complexType name="nameType">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="shortName" type="xs:string"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Modules">
<xs:complexType>
<xs:sequence>
<xs:element name="Module" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="Name" type="nameType"/>
<xs:element name="Level" type="xs:positiveInteger"/>
<xs:element name="ResourcePerson" type="ResourcePersonType"/>
</xs:sequence>
<xs:attribute ref="code" use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
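I can see your confusion. An option inside an xs:choice can be almost any particle, and a compositor (such as xs:sequence or another xs:choice) fits that description. In your schema the choice offers an element named NameNSurnameType, whereas the document contains Name and Surname directly. The minimum you need to do is change the ResourcePersonType definition to the following: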
<xs:complexType name="ResourcePersonType">
<xs:sequence>
<xs:choice>
<xs:sequence>
<xs:element name="Name" type="xs:string"/>
<xs:element name="Surname" type="xs:string"/>
</xs:sequence>
<xs:element name="FullName" type="xs:string"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
If you want to keep the name/surname pair as a reusable global definition, as you seem to intend with the global complex type, you can create a named group and reference that instead. Below is a modified XSD consistent with the above scenario:
<?xml version="1.0" encoding="utf-8" ?>
<!-- XML Schema generated by QTAssistant/XSD Module (http://www.paschidev.com) -->
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:group name="NameNSurnameType">
<xs:sequence>
<xs:element name="Name" type="xs:string"/>
<xs:element name="Surname" type="xs:string"/>
</xs:sequence>
</xs:group>
<xs:complexType name="ResourcePersonType">
<xs:sequence>
<xs:choice>
<xs:group ref="NameNSurnameType"/>
<xs:element name="FullName" type="xs:string"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
<xs:attribute name="code">
<xs:simpleType>
<xs:restriction base="xs:ID">
<xs:pattern value="CSE(\d{4})"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
<xs:complexType name="nameType">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="shortName" type="xs:string"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Modules">
<xs:complexType>
<xs:sequence>
<xs:element name="Module" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="Name" type="nameType"/>
<xs:element name="Level" type="xs:positiveInteger"/>
<xs:element name="ResourcePerson" type="ResourcePersonType"/>
</xs:sequence>
<xs:attribute ref="code" use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
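To make the intended content model concrete, here is a rough sketch in plain Python (standard library only) that checks the same "Name and Surname, or FullName" rule against a trimmed copy of the sample document. It is a hand-written check that mirrors the xs:choice, not XSD validation, and BRANCHES and resource_person_ok are illustrative names.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the question.
DOC = """\
<Modules>
  <Module code="CSE1246">
    <ResourcePerson>
      <Name>Anwar</Name>
      <Surname>Chutoo</Surname>
    </ResourcePerson>
  </Module>
  <Module code="CSE2041">
    <ResourcePerson>
      <FullName>Shehzad Jaunbuccus</FullName>
    </ResourcePerson>
  </Module>
</Modules>
"""

# The two branches of the xs:choice, as ordered lists of child element names.
BRANCHES = (["Name", "Surname"], ["FullName"])


def resource_person_ok(person):
    """True if the children match exactly one branch of the choice."""
    return [child.tag for child in person] in BRANCHES


people = ET.fromstring(DOC).iter("ResourcePerson")
print([resource_person_ok(p) for p in people])
```

Both ResourcePerson elements in the sample pass; one containing only Name, or a wrapper element around Name and Surname, would not.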

XSD schema - Either one or both

Is it possible to make a choice scenario, like (A or B or both)? If yes, how can this be done with the following elements?
<xs:element name="a" type="typeA" />
<xs:element name="b" type="typeB" />
Hope you can help.
Regards,
Nima
You can see the question XSD "one or both" choice construct leads to ambiguous content model. The schema below accepts a alone, a followed by b, or b alone:
<xs:schema xmlns:xs="...">
<xs:element name="a" type="typeA" />
<xs:element name="b" type="typeB" />
<xs:element name="...">
<xs:complexType>
<xs:sequence>
<xs:choice>
<xs:sequence>
<xs:element ref="a"/>
<xs:element ref="b" minOccurs="0"/>
</xs:sequence>
<xs:element ref="b"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
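To check which child sequences that content model accepts, here is a small sketch of a particle matcher in Python. It is a toy model written for illustration, not an XSD validator: it only knows element, sequence, and choice, and only minOccurs of 0 or 1 with maxOccurs fixed at 1. MODEL, match, and accepts are hypothetical names.

```python
def match(particle, names, pos):
    """Return the set of positions reachable after matching particle from pos."""
    kind = particle[0]
    if kind == "element":
        _, name, min_occurs = particle
        # An optional element may match nothing and leave the position unchanged.
        ends = {pos} if min_occurs == 0 else set()
        if pos < len(names) and names[pos] == name:
            ends.add(pos + 1)
        return ends
    if kind == "sequence":
        positions = {pos}
        for child in particle[1]:
            positions = {end for p in positions for end in match(child, names, p)}
        return positions
    if kind == "choice":
        return {end for child in particle[1] for end in match(child, names, pos)}
    raise ValueError(kind)


# choice( sequence(a, b?), b ), mirroring the schema above.
MODEL = ("choice", [
    ("sequence", [("element", "a", 1), ("element", "b", 0)]),
    ("element", "b", 1),
])


def accepts(names):
    """True if the whole list of child names is consumed by the model."""
    return len(names) in match(MODEL, names, 0)


for candidate in (["a"], ["a", "b"], ["b"], [], ["b", "a"]):
    print(candidate, accepts(candidate))
```

The first three candidates are accepted and the last two are rejected, which is the "A or B or both" behaviour (with a required to come before b when both are present).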

XML schema for elements with same name but different sub-structure depending on context

I am trying to define a schema for XML documents I receive.
The documents look like:
<root>
<items>
<group name="G-1">
<item name="I-1"/>
<item name="I-2"/>
<item name="I-3"/>
<item name="I-4"/>
</group>
</items>
<data>
<group name="G-1" place="here">
<customer name="C-1">
<item name="I-1" count="3"/>
<item name="I-2" count="4"/>
</customer>
<customer name="C-2">
<item name="I-3" count="7"/>
</customer>
</group>
</data>
</root>
I tried XmlSpy and xsd.exe from .NET 2.0. Both generated schema definitions that allow any number of <item> and <customer> elements below <group>. But what I'm looking for should restrict <group> below <items> to <item> elements, and <group> below <data> to <customer> elements.
Is this something XML Schema is not capable of at all?
The key points (see XML Schema Runtime Polymorphism via xsi:type and Abstract Types for complete and correct context/placement/usage) are:
Create a base type (with abstract="true" to prevent it from being used directly)
Note: the ref attribute replaces the name attribute for elements defined elsewhere
<xs:complexType name="CustomerType" abstract="true" >
<xs:sequence>
<xs:element ref="cust:FirstName" />
<xs:element ref="cust:LastName" />
<xs:element ref="cust:PhoneNumber" minOccurs="0"/>
</xs:sequence>
<xs:attribute name="customerID" type="xs:integer" />
</xs:complexType>
Create two or more derived types by extending or restricting the base type
<xs:complexType name="MandatoryPhoneCustomerType" >
<xs:complexContent>
<xs:restriction base="cust:CustomerType">
<xs:sequence>
<xs:element ref="cust:FirstName" />
<xs:element ref="cust:LastName" />
<xs:element ref="cust:PhoneNumber" minOccurs="1" />
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
and
<xs:complexType name="AddressableCustomerType" >
<xs:complexContent>
<xs:extension base="cust:CustomerType">
<xs:sequence>
<xs:element ref="cust:Address" />
<xs:element ref="cust:City" />
<xs:element ref="cust:State" />
<xs:element ref="cust:Zip" />
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Reference the base type in an element
<xs:element name="Customer" type="cust:CustomerType" />
In your instance XML document, specify the specific derived type as an xsi:type attribute
<cust:Customer customerID="12345" xsi:type="cust:MandatoryPhoneCustomerType" >
<cust:FirstName>Dare</cust:FirstName>
<cust:LastName>Obasanjo</cust:LastName>
<cust:PhoneNumber>425-555-1234</cust:PhoneNumber>
</cust:Customer>
or:
<cust:Customer customerID="67890" xsi:type="cust:AddressableCustomerType" >
<cust:FirstName>John</cust:FirstName>
<cust:LastName>Smith</cust:LastName>
<cust:Address>2001</cust:Address>
<cust:City>Redmond</cust:City>
<cust:State>WA</cust:State>
<cust:Zip>98052</cust:Zip>
</cust:Customer>
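Yes, XSD can handle this. I generated this schema from Visual Studio 2008 (much faster than doing it by hand) and it will do what you're looking for. The two <group> elements are declared locally, one inside <items> and one inside <data>, so each gets its own content model: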
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element name="items">
<xs:complexType>
<xs:sequence>
<xs:element name="group">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="item">
<xs:complexType>
<xs:attribute name="name" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="name" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="data">
<xs:complexType>
<xs:sequence>
<xs:element name="group">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="customer">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="item">
<xs:complexType>
<xs:attribute name="name" type="xs:string" use="required" />
<xs:attribute name="count" type="xs:unsignedByte" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="name" type="xs:string" use="optional" />
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="name" type="xs:string" use="required" />
<xs:attribute name="place" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
