XML Schema within a schema - xsd

The question in short - "can I define a schema within a schema which can be validated as a whole?
Explanation:
Is it possible to define a schema for the following XML. I need to define a schema for "customer". The "customertype" child element itself is a schema. Within the customertype I should have an element called "Source" which is mandatory.
<customer>
<customername>acustomer</customername>
<customertype>
<xs:schema>
<xs:element name="profession">
<xs:complexType>
<xs:sequence>
<xs:element name="Source" type="xs:int" />
<xs:element name="ProfessionName" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:schema>
</customertype>
</customer>
Is it possible to Define the schema for this xml so that all the requirements are met?

As Mimo has pointed out, there's no problem defining customertype (or another other element) as containing elements in the XSD namespace, which is what you appear to be asking about.
But if your goal is to be able to validate customer elements (or profession elements, which is what the schema in your example declares), it's hard to imagine a validation architecture in which that's the best way (or even a workable way) to go about it. One reason is that validating a document instance against schema information provided by the instance being validated doesn't produce the same confidence in the data's cleanliness as validating it against a known schema. (Put yourself in the shoes of an adversary seeking to subvert your validation and persuade your system to accept bogus data as valid. If the adversary gets to specify what counts as a valid document instance, how useful is it to know that the document is valid?)
What is it that prevents you from writing a schema and using it in the usual way?
[Addition, 15 October 2012, after OP's comment]
If I've understood your comment of earlier today correctly, your requirement is to allow people other than you to specify the type of the customer element however they like, subject to the proviso that that type must contain a child element named Source, whose type will be xsd:int. You don't specify whether you need access to the type definition they are using or not, so I'll try to consider both the case where you do need it and the case where you don't need it.
Three ways to make this situation work are described below. They have in common that there is
a 'main' schema document that defines a basic version of the schema, and
one or more 'auxiliary' schema documents for use in different situations.
In general, you may find it helpful to find a good textbook on XSD and review what it says about creating a schema from declarations in several schema documents.
(1) One approach uses xsi:type. You define a main schema document in which the customer element has a named type; I'll assume the type is named Customer. The Customer type accepts any element whose first child element is named Source. For example:
<xs:element name="customer" type="Customer"/>
<xs:complexType name="Customer">
<xs:sequence>
<xs:element name="Source" type="xs:int"/>
<xs:any minOccurs="0"
maxOccurs="unbounded"
processContents="lax"/>
</xs:sequence>
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
Those who want a more specific type for the customer element (I'll call them the 'users') provide auxiliary schema documents for your target namespace in which they declare other complex types which restrict Customer. For example, they might want the customer element to contain elements called name, address, and phone number:
<xs:complexType name="Customer-for-us">
<xs:complexContent>
<xs:restriction base="Customer">
<xs:sequence>
<xs:element name="Source" type="xs:int"/>
<xs:element ref="name"/>
<xs:element ref="address"/>
<xs:element ref="phone"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
This is a legal restriction of the Customer type, so the customer element can use it. A document instance might therefore contain an element like:
<customer xsi:type="Customer-for-us">
<Source>83760273</Source>
<name>Willy Wonka</name>
<address> ... </address>
<phone> ... </phone>
</customer>
The document is validated against the schema constructed from their auxiliary schema document, together with the main schema document, so the definition of type Customer-for-us is enforced in the usual way.
By using wildcards and lax validation, the Customer type ensures that users can do anything they like in their version of the type, as long as the first child is named Source and has type int.
(2) A second approach uses a hole in the main schema document.
You write a main schema document as before, including the declaration of the customer element as having type Customer. But the main schema document does not contain a declaration for that type. Instead, you declare the Customer type in an auxiliary schema document, which is combined with the main one at validation time in the usual way (I'd recommend you have a third schema document which serves as a driver and includes the other two, but there are many ways to make it work).
The users who want a more specific Customer type, meanwhile, write their own declaration for the Customer type, subject to the compatibility constraints about the first child named Source and so on. The users use their own driver file, which embeds the main schema document and their version of the auxiliary schema document with their own declaration of the Customer type.
This way, the xsi:type attribute does not need to be used.
(3) A third approach uses the xs:redefine or (in XSD 1.1) the xs:override facility.
You write the main schema document as described in solution (1). The users use xs:redefine or xs:override to redefine Customer as they wish. This answer is already rather long, so I do not propose to include a tutorial on the use of either redefine or override.

It is possible to create a schema importing and using another schema. This defines your customer element with customertype containing a schema:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://xml.netbeans.org/schema/Notes"
xmlns:tns="http://xml.netbeans.org/schema/Notes"
elementFormDefault="qualified">
<xsd:import namespace="http://www.w3.org/2001/XMLSchema"/>
<xsd:element name="customer">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="customername" type="xsd:string"/>
<xsd:element name="customertype">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="xsd:schema"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
The problem is that you have an additional condition on the customertype schema - so you should in theory get the standard XSD schema and modify it, but there are many different ways to satisfy that condition in a schema definition, so it is very tricky (and maybe impossible) to do this 'modification'
Probably a better approach is restrict the possible schemas used inside customertype (e.g. it must be a single element definition with complex type specified directly etc. etc) and write a sub-set of the XSD schema that describe this restricted schema definition.

Related

Making an xsd schema extensible with a typed element

Given a schema that defines an element of a certain type, is it possible to allow that type to be extended, but still have that extension element be strongly typed? In other words, add some kind of extension point that can be used from an external schema to add elements that can only be used in this location?
Let's say the schema looks kinda like:
<xs:schema …>
<xs:element name="Match" type="tns:TNodeConstraint" />
<xs:complexType name="TNodeConstraint">
<xs:group ref="tns:Expression" />
</xs:complexType>
<xs:group name="Expression">
<xs:choice>
<xs:element name="And">
<xs:complexType … />
</xs:element>
<xs:element name="Or">
<xs:complexType … />
</xs:element>
<xs:element name="IsAbstract">
<xs:element name="IsExtern">
<!-- Some kind of extension point? -->
</xs:choice>
</xs:group>
</xs>
Is it possible to extend the Expression group so that a second, external schema could say that I can accept IsMyCustomConstraint here, but not IsMyCustomSortOrder? So this will be valid:
<Match>
<IsAbstract />
<IsExtern />
<IsMyCustomConstraint />
</Match>
But this would be invalid?
<Match>
<IsAbstract />
<IsExtern />
<IsMyCustomSortOrder />
</Match>
I don't want to use xs:any as that would allow putting a "sort order" where a constraint can go.
I can modify the original schema
I'm in control of what the namespaces of IsMyCustomConstraint and IsMyCustomSortOrder would be, and it's not important if they match the original schema or not.
is it possible to allow that type to be extended, but still have that extension element be strongly typed?
Definitely - this is described in detail, with examples, here: https://www.w3.org/TR/2000/WD-xmlschema-0-20000225/#DerivExt. As far as I can tell, you just need to declare one or more complex types that are extensions of your base type 'TNodeConstraint'.
XML schema has a rich set of facilities to support type inheritance including:
abstract base types (base type must be extended or restricted before use)
extension (new type allows more values than the base type)
restriction (new type allows fewer values than the base type)
control of whether further extensions/restrictions are allowed (final/block attributes)
I don't see any need to use a separate XSD for the extensions, although you can if you want to. You may find it useful to know about xsi:type, abstract types and the block/final attributes - all are described the XML Schema Part 0 - Primer mentioned above.

JAXB: Ignore order of elements while using xs:extension

We are using JAXB for Java-xml binding.We initially created domain class and then using schemagen commandline tool the following schema has been generated. But generated schema is not valid, giving following error message.
Error Message:
cos-all-limited.1.2: An all model group must appear in a particle with {min occurs} = {max occurs} = 1, and that particle must be part of a pair which constitutes the {content type} of a complex type definition.
Use Case:
There are two classes Emp(Base class) and Dept(Child class).
1. There is no restriction on the elements sequence(means empId, deptId and deptName can appear in any order). so we used xs:all element
2. In Dept class, deptId field should appear only once(minOccurs =1, maxOccurs=1) and deptName is optional.
As per my usecase i am unable to generate valid schema. I did search on google. But i couldn't find the solution. So i am anticipating experts can answer this query. Could you please look into below classes,schema and guide me in the right direction.
NOTE: please don't suggest me to create some temporary domain classes.
Thanks in anticipation.
Emp.java
#XmlAccessorType(XmlAccessType.FIELD)
#XmlType(name="EmpType", propOrder={})
#XmlRootElement
public class Emp {
#XmlElement(name="empId", required = true)
private String empId;
}
Dept.java
#XmlAccessorType(XmlAccessType.FIELD)
#XmlType(name="DeptType", propOrder={})
public class Dept extends Emp
{
#XmlElement(name="deptId", required = true)
private String deptId;
private String deptName;
}
Schema1.xsd
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="emp" type="EmpType"/>
<xs:complexType name="EmpType">
<xs:sequence>
<xs:element name="empId" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="DeptType">
<xs:complexContent>
<xs:extension base="EmpType">
<xs:all> <!--showing error message, mentioned above -->
<xs:element name="deptId" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="deptName" type="xs:string" minOccurs="0"/>
</xs:all>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:schema>
The document structure that you are trying to allow is actually difficult to represent in an XML schema. You may not be able to generate it from a JAXB (JSR-222) annotated model. You do however have a couple of options:
Option #1 - Generate a Simpler XML Schema
If you are not validating XML content with your XML schema and are simply using it as documentation that people can use as a guide then I would drop the all sections and use sequence instead. This will work better with the inheritance relationship that you have. If you don't specify an instance of Schema on the Unmarshaller the order of elements is not enforced so you will be able to read all the XML documents that meet your rules.
Option #2 - Create Your Own XML Schema
If you want the XML schema to exactly reflect all the possible accepted inputs then you will need to write it yourself. You can reference this existing XML schema by using the package level #XmlSchema annotation.
#XmlSchema(location = "http://www.example.com/package/YourSchema.xsd")
package com.example;
import javax.xml.bind.annotation.XmlSchema;

Unmarshalling based on Concrete Instance

I am a new comer to JaxB World and I am facing one problem w.r.t. unmarshalling of the stored xml content into java class object. Problem description is as follows. Let me know if this is solvable
I have my xsd file which contains following content(this is just a example)
Student info
<xs:complexType name="specialization" abstract="true">
</xs:complexType>
<xs:complexType name="Engineering">
<xs:complexContent>
<xs:extension base="specialization">
<xs:sequence>
<xs:element name="percentage" type="xs:int" minOccurs="0"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="Medical">
<xs:complexContent>
<xs:extension base="specialization">
<xs:sequence>
<xs:element name="grade" type="xs:string" minOccurs="0"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Now all the corresponding java classes are generated by compiling the xsd. Now lets assume in my application i will set the specialization attribute of Student info by constructing Engineering class instance. So after all the operation when i save
the xml file that get saved will have the entry like below
<Student>
<Name>Name1</Name>
<Specialization>
<percentage>78<percentage>
</Specialization>
</Student>
Now when the above content goes for unmarshalling, unmarshalling fails saying unexpected element . I guess this is b'cos Specialization element is of type specialization it calls unmarshalling on itself rather than derived object which is stored.
I hope my explanation is clear. Is there any way that we can unmarshall based on derived class instanse type. The xsd and bindings.xjb file is completely in my control so i can add or modify any entries/info which conveys to unmarshalling rules to unmarshall on derived class.
Thanks for your Suggestion but the it still not working for me.
Here is what I tried
Option #1 - xsi:type
My xsd looks same as what is explained in the example but still the Xsi:type doesn't come in the resulted xml. Do i need to add any other setting while compiling? Which JaxB version should i use for this?
Option#2 - Substitution Groups
When i added the substitution entry part in my xsd, XSD compilation failed saying duplicate names "Engineering" and "Medical". I guess element name and type Name being same compilation cribs(All engineering, Medical,specialization being same both in type definition and element Name)
I can't modify the generated classes as we are using Model driven Architecture. Only thing that is in hand is xsd. Any modification to the xsd is allowed. Ideally First option should have worked. But can't figure out why it is not working. Let me know if you have some suggestion to narrow down the problem.
There are different ways of representing Java inheritance in XML when using JAXB:
Option #1 - xsi:type
In this representation an attribute is used to indicate the subtype being used to populate this element.
<Student>
<Name>Name1</Name>
<Specialization xsi:type="Engineering">
<percentage>78<percentage>
</Specialization>
</Student>
For a detailed example see:
http://blog.bdoughan.com/2010/11/jaxb-and-inheritance-using-xsitype.htmlhtml
Option #2 - Substitution Groups
Here an element name is used to indicate the subtype. This corresponds to the schema concept of substitution groups and leverages JAXB's #XmlElementRef annotation:
<Student>
<Name>Name1</Name>
<Engineering>
<percentage>78<percentage>
</Engineering>
</Student>
For a detailed example see:
http://blog.bdoughan.com/2010/11/jaxb-and-inheritance-using-substitution.html

Creating a valid XSD that is open using <all> and <any> elements

I need to specify a XSD for validating XML documents. The XSD will be used for a JAXB generation of Java bindings.
My problem is specifying optional elements which I do not know the names of and which I in general am not interested in parsing.
The structure of the XML documents is like:
<TRADE>
<TIME>12:12</TIME>
<MJELLO>12345</MJELLO>
<OPTIONAL>12:12</OPTIONAL>
<DATE>25-10-2011</DATE>
<HELLO>hello should be ignored</HELLO>
</TRADE>
The important thing is, that:
I can not assume any order, and the next XML document instance migtht have tags in a different order
I am only interested in parsing some of the tags, some are mandatory and some are optional
The XML documents can be extended with new elements which I am not interested in parsing
The structure of my XSD is like (not a valid xsd):
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- *********************************************** -->
<!-- Trade element definitions for the XML Documents -->
<!-- *********************************************** -->
<xs:complexType name="Trade">
<!-- Using the all construction ensures that the order does not matter -->
<xs:all>
<xs:element name="DATE" type="xs:string" minOccurs="1" maxOccurs="1" />
<xs:element name="TIME" type="xs:string" minOccurs="1" maxOccurs="1" />
<xs:element name="OPTIONAL" type="xs:string" minOccurs="0" maxOccurs="1" />
<xs:any minOccurs="0"/>
</xs:all>
</xs:complexType>
<!-- TRADE is the mandatory top-level tag -->
<xs:element name="TRADE" type="Trade"/>
</xs:schema>
So, in this example: DATE and TIME are mandatory (they must be in the XML exactly once), OPTIONAL might be present once and then I would like to specify, that all other tags are allowed. The order does not matter.
How do I specify a valid XSD for this?
This is a classic parser problem.
Basically, your BNF is:
Trade = whatever whatever*
whatever = "DATE" | "TIME" | anything
anything = a-z a-z*
But this is ambigous. The string "DATE" can both be accepted under the whatever rule as "DATE" and as anything.
So if you have
<TRADE>
<TIME>12:12</TIME>
<DATE>25-10-2011</DATE>
<DATE>25-12-2011</DATE>
</TRADE>
it is unclear whether that should be accepted or not.
It could be interpreted either one of
"TIME", "DATE", anything
anything, anything, "DATE"
anything, anything, anything
"TIME", "DATE", anything
"TIME", "DATE", "DATE"
etc.
It all boils down to: If you have a wildcard combined with random sequence, you cannot meaningfully decide which token matches which rule.
It especially does not make sense to have optional elements together with a wilcard.
You have two options:
use xs:sequence instead of xs:all
do not use wildcard
As I understand it, both options are in conflict with your wishes.
Perhaps you can construct a wildcard that matches everything except DATE, TIME etc.
Is it a hard requirement to have JAXB bindings to your "known" elements?
If not, you can basically have just <any maxoccurs="unbounded" processContents="skip"/> as your xsd, and then pick out the elements you are interested in from the DOM tree.
(See here how to use JAXB without data binding.)

How to specify read only text or label in an XML data schema

I am referring to XML data schema as detailed here: http://www.w3schools.com/schema/default.asp.
When I retrieve data from the database and submit it to the client, there are text fields which I wish to retain as uneditable display/read only fields.
For example, hypothetically in the following sequence,
<xsd:element ....
<xsd:element name="employeeName" xsd:type="xsd:string"/>
<xsd:element name="projID" xsd:type="xsd:string" readOnly='true'>
<xsd:element name="hireDate" type="xsd:date"/>
<xsd:element ....
<xsd:element name="today" type="xsd:date" readOnly='true'/>
<xsd:element ....
Where the client display would interpret the xsd stream and construct the input form.
Of course, the label is faux schema tag to illustrate my need of placing a read-only field in the midst of the form.
In the above example, projID and today should be presented to the user as read-only fields, but there is no such schema syntax as readOnly.
One way I know how to achieve this is to break the stream into two complex type segments and so split it into two input forms and get the client to present an intervening label in between the two forms.
However, that is problematic because
I have quite a few read-only info
fields that need to be presented
thro the ui. There would be too many
interruptions to a smooth single
form. It's best to have just one input form.
Some read-only fields are
presented in the middle of an entity
sequence. That means interrupting
the database-to-jdo(or
jpa)-to-client data flow for that entity.
Therefore, how to specify a read-only field element in an xml schema?
... and (sheepishly) may I ask, how to specify a hidden field?
You may use XML Schema annotations to provide such information for your application. It's awkward, but it could work.
Something along the lines of:
<xs:element name="heading" type="xs:string">
<xs:annotation>
<xs:appinfo>
<readOnly>true</readOnly>
</xs:appinfo>
</xs:annotation>
</xs:element>

Resources