XSD Class generators: keeping track of element order - xsd

I got the following complex type within my XSD schema
<xs:complexType name="structure" mixed="true">
<xs:choice maxOccurs="unbounded">
<xs:element type="b" name="b" />
<xs:element type="a" name="a" />
</xs:choice>
</xs:complexType>
which allows me to state XML definitions like this:
<structure>
Hello <b>World</b>
Hello 2 <b>World 2</b>
<a>Hello3</a> <b>World3</b>
</structure>
Now I tried to generate XSD classes out of my schema, I tried both XSD.exe as well as XSD2Code. They both generate something like
class structure {
List<a> a;
List<b> b;
List<string> text;
}
My problem is, that I need to keep track in which order those elements where defined within the XML content of structure. Refering to the above example, I would like to know that the inner text "Hello" comes right before the first occurance of the b-element.
As this would obviously require a more specialized generator strategy, maybe I'm expecting too much, but: is there any XSD generator that can handle the object order or do I have to write my own classes?
Thank you in advance

I have never seen an XSD to code binding tool which would do what you need here, for sure not on the .NET platform - which you seem to imply as the target. This is one of those cases where roundtrip an XML is not possible, without loss of fidelity (deserialize, serialize then compare, it fails). Just for completeness, the /order option wouldn't work with xsd.exe, simply because in terms of the XSD you defined, there's no order really. It is, also, a limitation of what XSD can describe, which inevitably is reflected in tool implementations.

Related

Making an xsd schema extensible with a typed element

Given a schema that defines an element of a certain type, is it possible to allow that type to be extended, but still have that extension element be strongly typed? In other words, add some kind of extension point that can be used from an external schema to add elements that can only be used in this location?
Let's say the schema looks kinda like:
<xs:schema …>
<xs:element name="Match" type="tns:TNodeConstraint" />
<xs:complexType name="TNodeConstraint">
<xs:group ref="tns:Expression" />
</xs:complexType>
<xs:group name="Expression">
<xs:choice>
<xs:element name="And">
<xs:complexType … />
</xs:element>
<xs:element name="Or">
<xs:complexType … />
</xs:element>
<xs:element name="IsAbstract">
<xs:element name="IsExtern">
<!-- Some kind of extension point? -->
</xs:choice>
</xs:group>
</xs>
Is it possible to extend the Expression group so that a second, external schema could say that I can accept IsMyCustomConstraint here, but not IsMyCustomSortOrder? So this will be valid:
<Match>
<IsAbstract />
<IsExtern />
<IsMyCustomConstraint />
</Match>
But this would be invalid?
<Match>
<IsAbstract />
<IsExtern />
<IsMyCustomSortOrder />
</Match>
I don't want to use xs:any as that would allow putting a "sort order" where a constraint can go.
I can modify the original schema
I'm in control of what the namespaces of IsMyCustomConstraint and IsMyCustomSortOrder would be, and it's not important if they match the original schema or not.
is it possible to allow that type to be extended, but still have that extension element be strongly typed?
Definitely - this is described in detail, with examples, here: https://www.w3.org/TR/2000/WD-xmlschema-0-20000225/#DerivExt. As far as I can tell, you just need to declare one or more complex types that are extensions of your base type 'TNodeConstraint'.
XML schema has a rich set of facilities to support type inheritance including:
abstract base types (base type must be extended or restricted before use)
extension (new type allows more values than the base type)
restriction (new type allows fewer values than the base type)
control of whether further extensions/restrictions are allowed (final/block attributes)
I don't see any need to use a separate XSD for the extensions, although you can if you want to. You may find it useful to know about xsi:type, abstract types and the block/final attributes - all are described the XML Schema Part 0 - Primer mentioned above.

How do I change the case of XSD element name when generated as #XmlElement via xjc

I have a schema where element names are defined in PascalCase eg:
<xsd:element name="EmployeeName" minOccurs="0" maxOccurs="1">
But I would like this to generate as:
#XmlElement(name = "employeeName")
I know this sounds slightly strange but it will then allow me to use Jackson JAXB annotation support to have my JSON generated in camelCase.
Is this possible?
Yes, it is possible to change the XML element name to (nearly) whatever you want by way of the annotation directives.
In this example, "price" is renamed to "itemprice". Java is not case insensitive, so your camel casing will be honored.
//Example: Code fragment
public class USPrice {
#XmlElement(name="itemprice")
public java.math.BigDecimal price;
}
<!-- Example: Local XML Schema element -->
<xs:complexType name="USPrice"/>
<xs:sequence>
<xs:element name="itemprice" type="xs:decimal" minOccurs="0"/>
</sequence>
</xs:complexType>
This example comes from the JAXB javadocs.
Note that JAXB support has a few "how to use it" workflows, "XSD to generate supporting Java classes", and "Java classes to generate an XSD". I prefer the latter, but you may be using the former. If you are, then you need to alter your XSD to have camelCase element names there.
There is no "use my XSD to generate Java classes, but with some overrides" workflow. Perhaps that's where the confusion comes from. Likewise, there is no "Use Java class annotations to generated XSD documents, but with some overrides". What you specify is going to be what you get, name-wise.

Is it possible for XML to have valid schema but no XML document?

I get doubt that are there some schemas which have a valid schema but don't have some XML documents?
If there are, could you please give me some examples?
Yes, it's possible to define schemas for which the set of valid documents is the empty set -- at least, in every schema language I know, and given a reasonable definition of "valid document".
In XSD, RNG, and DTDs, perhaps the simplest such schema is one which declares no elements. In XSD, this could be expressed this way:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
A simple non-vacuous schema can be unsatisfiable by declaring elements with unsatisfiable types:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="unsatisfiable">
<xs:complexType>
<xs:choice/>
</xs:complexType>
</xs:element>
</xs:schema>
Since xs:choice requires that at least one child of the choice be matched by the input, a choice with no children is unsatisfiable. And if the choice is required, as it is here, then the type as a whole is unsatisfiable.
Empty choices can also be used in Relax NG, though not in DTDs.
In Relax NG, it's also possible to declare satisfiable elements in a schema with no valid instances, as long as the root element or at least one required descendant of the root element is unsatisfiable. In XSD, by contrast, once you have any satisfiable element declarations or type definitions you no longer have an empty language: XSD provides no way to say, in the schema, what the outermost element must be at validation time.
In XSD, RNG, and DTDs it is also possible to make an element unsatisfiable by requiring that it contain undeclared elements. In DTD notation:
<!ELEMENT unsatisfiable (undeclared) >
Also, in any of these languages, it's possible to define schemas which are satisfiable only by infinite documents:
<!ELEMENT e (e) >
In XSD (and in Relax NG using XSD datatypes) it's possible to define empty simple types, too:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="unsatisfiable">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minExclusive value="0"/>
<xs:maxExclusive value="1"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:schema>
Some methods of defining empty types are forbidden by XSD: setting the minimum and maximum exclusive values to the same value, for example, will raise an XSD error. (The rationale here is that the majority in the WG thought that an empty type made no sense and also suffered from the illusion that they could effectively prevent the definition of empty types, at least in cases not involving regular-expression patterns. As the example shows, they were wrong.) In XSD 1.1, perhaps the cleanest and most obvious way to define a non-satisfiable simple type is to define an empty union type, but an even simpler way is to use the predefined xs:error type (which itself is defined as an empty union). This is not possible in XSD 1.0, which requires that unions have at least two member types.

XML Schema within a schema

The question in short - "can I define a schema within a schema which can be validated as a whole?
Explanation:
Is it possible to define a schema for the following XML. I need to define a schema for "customer". The "customertype" child element itself is a schema. Within the customertype I should have an element called "Source" which is mandatory.
<customer>
<customername>acustomer</customername>
<customertype>
<xs:schema>
<xs:element name="profession">
<xs:complexType>
<xs:sequence>
<xs:element name="Source" type="xs:int" />
<xs:element name="ProfessionName" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:schema>
</customertype>
</customer>
Is it possible to Define the schema for this xml so that all the requirements are met?
As Mimo has pointed out, there's no problem defining customertype (or another other element) as containing elements in the XSD namespace, which is what you appear to be asking about.
But if your goal is to be able to validate customer elements (or profession elements, which is what the schema in your example declares), it's hard to imagine a validation architecture in which that's the best way (or even a workable way) to go about it. One reason is that validating a document instance against schema information provided by the instance being validated doesn't produce the same confidence in the data's cleanliness as validating it against a known schema. (Put yourself in the shoes of an adversary seeking to subvert your validation and persuade your system to accept bogus data as valid. If the adversary gets to specify what counts as a valid document instance, how useful is it to know that the document is valid?)
What is it that prevents you from writing a schema and using it in the usual way?
[Addition, 15 October 2012, after OP's comment]
If I've understood your comment of earlier today correctly, your requirement is to allow people other than you to specify the type of the customer element however they like, subject to the proviso that that type must contain a child element named Source, whose type will be xsd:int. You don't specify whether you need access to the type definition they are using or not, so I'll try to consider both the case where you do need it and the case where you don't need it.
Three ways to make this situation work are described below. They have in common that there is
a 'main' schema document that defines a basic version of the schema, and
one or more 'auxiliary' schema documents for use in different situations.
In general, you may find it helpful to find a good textbook on XSD and review what it says about creating a schema from declarations in several schema documents.
(1) One approach uses xsi:type. You define a main schema document in which the customer element has a named type; I'll assume the type is named Customer. The Customer type accepts any element whose first child element is named Source. For example:
<xs:element name="customer" type="Customer"/>
<xs:complexType name="Customer">
<xs:sequence>
<xs:element name="Source" type="xs:int"/>
<xs:any minOccurs="0"
maxOccurs="unbounded"
processContents="lax"/>
</xs:sequence>
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
Those who want a more specific type for the customer element (I'll call them the 'users') provide auxiliary schema documents for your target namespace in which they declare other complex types which restrict Customer. For example, they might want the customer element to contain elements called name, address, and phone number:
<xs:complexType name="Customer-for-us">
<xs:complexContent>
<xs:restriction base="Customer">
<xs:sequence>
<xs:element name="Source" type="xs:int"/>
<xs:element ref="name"/>
<xs:element ref="address"/>
<xs:element ref="phone"/>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
This is a legal restriction of the Customer type, so the customer element can use it. A document instance might therefore contain an element like:
<customer xsi:type="Customer-for-us">
<Source>83760273</Source>
<name>Willy Wonka</name>
<address> ... </address>
<phone> ... </phone>
</customer>
The document is validated against the schema constructed from their auxiliary schema document, together with the main schema document, so the definition of type Customer-for-us is enforced in the usual way.
By using wildcards and lax validation, the Customer type ensures that users can do anything they like in their version of the type, as long as the first child is named Source and has type int.
(2) A second approach uses a hole in the main schema document.
You write a main schema document as before, including the declaration of the customer element as having type Customer. But the main schema document does not contain a declaration for that type. Instead, you declare the Customer type in an auxiliary schema document, which is combined with the main one at validation time in the usual way (I'd recommend you have a third schema document which serves as a driver and includes the other two, but there are many ways to make it work).
The users who want a more specific Customer type, meanwhile, write their own declaration for the Customer type, subject to the compatibility constraints about the first child named Source and so on. The users use their own driver file, which embeds the main schema document and their version of the auxiliary schema document with their own declaration of the Customer type.
This way, the xsi:type attribute does not need to be used.
(3) A third approach uses the xs:redefine or (in XSD 1.1) the xs:override facility.
You write the main schema document as described in solution (1). The users use xs:redefine or xs:override to redefine Customer as they wish. This answer is already rather long, so I do not propose to include a tutorial on the use of either redefine or override.
It is possible to create a schema importing and using another schema. This defines your customer element with customertype containing a schema:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://xml.netbeans.org/schema/Notes"
xmlns:tns="http://xml.netbeans.org/schema/Notes"
elementFormDefault="qualified">
<xsd:import namespace="http://www.w3.org/2001/XMLSchema"/>
<xsd:element name="customer">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="customername" type="xsd:string"/>
<xsd:element name="customertype">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="xsd:schema"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
The problem is that you have an additional condition on the customertype schema - so you should in theory get the standard XSD schema and modify it, but there are many different ways to satisfy that condition in a schema definition, so it is very tricky (and maybe impossible) to do this 'modification'
Probably a better approach is restrict the possible schemas used inside customertype (e.g. it must be a single element definition with complex type specified directly etc. etc) and write a sub-set of the XSD schema that describe this restricted schema definition.

Unmarshalling based on Concrete Instance

I am a new comer to JaxB World and I am facing one problem w.r.t. unmarshalling of the stored xml content into java class object. Problem description is as follows. Let me know if this is solvable
I have my xsd file which contains following content(this is just a example)
Student info
<xs:complexType name="specialization" abstract="true">
</xs:complexType>
<xs:complexType name="Engineering">
<xs:complexContent>
<xs:extension base="specialization">
<xs:sequence>
<xs:element name="percentage" type="xs:int" minOccurs="0"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="Medical">
<xs:complexContent>
<xs:extension base="specialization">
<xs:sequence>
<xs:element name="grade" type="xs:string" minOccurs="0"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Now all the corresponding java classes are generated by compiling the xsd. Now lets assume in my application i will set the specialization attribute of Student info by constructing Engineering class instance. So after all the operation when i save
the xml file that get saved will have the entry like below
<Student>
<Name>Name1</Name>
<Specialization>
<percentage>78<percentage>
</Specialization>
</Student>
Now when the above content goes for unmarshalling, unmarshalling fails saying unexpected element . I guess this is b'cos Specialization element is of type specialization it calls unmarshalling on itself rather than derived object which is stored.
I hope my explanation is clear. Is there any way that we can unmarshall based on derived class instanse type. The xsd and bindings.xjb file is completely in my control so i can add or modify any entries/info which conveys to unmarshalling rules to unmarshall on derived class.
Thanks for your Suggestion but the it still not working for me.
Here is what I tried
Option #1 - xsi:type
My xsd looks same as what is explained in the example but still the Xsi:type doesn't come in the resulted xml. Do i need to add any other setting while compiling? Which JaxB version should i use for this?
Option#2 - Substitution Groups
When i added the substitution entry part in my xsd, XSD compilation failed saying duplicate names "Engineering" and "Medical". I guess element name and type Name being same compilation cribs(All engineering, Medical,specialization being same both in type definition and element Name)
I can't modify the generated classes as we are using Model driven Architecture. Only thing that is in hand is xsd. Any modification to the xsd is allowed. Ideally First option should have worked. But can't figure out why it is not working. Let me know if you have some suggestion to narrow down the problem.
There are different ways of representing Java inheritance in XML when using JAXB:
Option #1 - xsi:type
In this representation an attribute is used to indicate the subtype being used to populate this element.
<Student>
<Name>Name1</Name>
<Specialization xsi:type="Engineering">
<percentage>78<percentage>
</Specialization>
</Student>
For a detailed example see:
http://blog.bdoughan.com/2010/11/jaxb-and-inheritance-using-xsitype.htmlhtml
Option #2 - Substitution Groups
Here an element name is used to indicate the subtype. This corresponds to the schema concept of substitution groups and leverages JAXB's #XmlElementRef annotation:
<Student>
<Name>Name1</Name>
<Engineering>
<percentage>78<percentage>
</Engineering>
</Student>
For a detailed example see:
http://blog.bdoughan.com/2010/11/jaxb-and-inheritance-using-substitution.html

Resources