How to allow typed values to be empty with an XML schema? - xsd

I have some XML documents over which I have no control whatsoever. Their structure is well-defined, but it is described in a bunch of PDFs which, despite being very exact, don't make automated validation very tractable. I'm trying to write an XML schema to make (most of) the rules in those PDFs executable.
All the elements are mandatory. But about half of them can be either empty or have simple typed content.
When defining datatypes for these elements, I defined two versions of each: a "normal" one, and another that can be empty. I did this by defining unions with an empty datatype:
<xs:simpleType name="empty">
<xs:restriction base="xs:string">
<xs:length value="0"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="codPostal">
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{4}-[0-9]{3}"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="opt_codPostal">
<xs:union memberTypes="empty codPostal"/>
</xs:simpleType>
Is there a less repetitive way of doing this?

You can make the element nillable by setting the nillable attribute on xs:element.
In the XSD:
<xs:simpleType name="codPostal">
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{4}-[0-9]{3}"/>
</xs:restriction>
</xs:simpleType>
<xs:element name="OptionalString" type="codPostal" nillable="true" />
In the document:
<OptionalString xsi:nil="true"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" />
This is most useful for non-string types (e.g. xs:dateTime), because an element of plain xs:string type can simply be left empty:
<OptionalString />
Unfortunately, this requires the xsi:nil attribute to be present in the instance document. As far as I know, the only non-intrusive way to do what you want is the union type approach that you've already chosen.
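For types that are defined by a pattern, such as codPostal, one way to avoid the second type and the union is to make the whole pattern optional. The following is a minimal sketch under that assumption, not a general solution: it only helps for xs:pattern-based types, and the element name CodPostal is made up for illustration. XSD patterns are implicitly anchored, so no ^ or $ is needed.

```xml
<!-- Sketch: one type that accepts either an empty value or NNNN-NNN. -->
<xs:simpleType name="opt_codPostal">
  <xs:restriction base="xs:string">
    <!-- The trailing ? makes the whole group optional, so "" also matches. -->
    <xs:pattern value="([0-9]{4}-[0-9]{3})?"/>
  </xs:restriction>
</xs:simpleType>

<!-- Hypothetical element declaration using the type. -->
<xs:element name="CodPostal" type="opt_codPostal"/>
```

With this, both an empty CodPostal element and one containing 1234-567 should validate. For types such as dates or numbers, the union with an empty type remains the non-intrusive option.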

Related

xsd restriction allowing any lowercase or uppercase or numbers

I have the following schema, which accepts only COM1 to COM4.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:complexType name="Component">
<xs:attribute name="type" type="ComponentType"/>
</xs:complexType>
<xs:simpleType name="ComponentType">
<xs:restriction base="xs:string">
<xs:enumeration value="COM1"/>
<xs:enumeration value="COM2"/>
<xs:enumeration value="COM3"/>
<xs:enumeration value="COM4"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
Now the requirement has changed: I need to allow any characters (lowercase or uppercase), while staying backward-compatible with the values already populated in the production environment.
I want to preserve the 'ComponentType' name and its enum types, and I would like to minimize the code change, handling it at the schema level if possible. The COM[1-4] names are just placeholders for this post; the actual values are different.
Is there a way to change the schema so that it accepts any characters (lowercase or uppercase) without regenerating the Java artifacts or changing the Java code?
Thanks.
I think this will work, but I don't know what impact it will have on your Java code:
<xs:simpleType name="ComponentType">
<xs:restriction base="xs:string">
<xs:pattern value="[cC][oO][mM][1-4]"/>
</xs:restriction>
</xs:simpleType>
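If the requirement really is any mix of letters and digits rather than only case variants of the old values, a broader pattern could be sketched as below. The value space (one or more ASCII letters or digits) is an assumption, since the real values are not shown in the post; the legacy values still match it. Replacing the enumeration with a pattern probably means a generated Java enum no longer corresponds to the schema, so the Java side may still need attention.

```xml
<xs:simpleType name="ComponentType">
  <xs:restriction base="xs:string">
    <!-- Assumed value space: one or more ASCII letters or digits.
         Legacy values such as COM1 still match. -->
    <xs:pattern value="[a-zA-Z0-9]+"/>
  </xs:restriction>
</xs:simpleType>
```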

SimpleType - Aggregation or Abstraction

I face a weird problem. I'm supposed to use a number of third party XSDs for a webservice. My framework of choice is Apache CXF, and I generate code using its Maven plugin. So far so good, everything works, neither generation nor webservice itself is problematic.
But, since the authors of the XSDs are weird and I cannot change them myself, I face a problem: They use a lot of basically duplicated SimpleType-definitions. They all bear their own name, but do the same thing.
Example:
<xs:simpleType name="VehicleFittedWithEcoInnovInd">
<xs:restriction base="xs:string">
<xs:maxLength value="1"/>
<xs:enumeration value="Y"/>
<xs:enumeration value="N"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="TypeApprTranspDangerGoodsInd">
<xs:restriction base="xs:string">
<xs:maxLength value="1"/>
<xs:enumeration value="Y"/>
<xs:enumeration value="N"/>
</xs:restriction>
</xs:simpleType>
And many more (numbers, string definitions, etc.).
So the question is, is it possible, through a jaxb-plugin or similar, to aggregate these SimpleTypes into one, or at least generate an abstract class structure, so that the amount of irrelevant duplicate code is reduced?
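One possible direction, sketched here as an assumption rather than a tested recipe for these schemas, is an external JAXB bindings file that tells the code generator not to create a separate Java enum for each of these simple types, so that they fall back to their base type (String). The file name thirdparty.xsd is hypothetical, and the namespace and version shown are the JAXB 2.x ones; newer Jakarta-based toolchains use a different namespace and version.

```xml
<jaxb:bindings xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
               xmlns:xs="http://www.w3.org/2001/XMLSchema"
               version="2.1">
  <!-- thirdparty.xsd is a placeholder for the real schema file. -->
  <jaxb:bindings schemaLocation="thirdparty.xsd" node="/xs:schema">
    <!-- Do not generate an enum class for this Y/N type. -->
    <jaxb:bindings node="xs:simpleType[@name='VehicleFittedWithEcoInnovInd']">
      <jaxb:typesafeEnumClass map="false"/>
    </jaxb:bindings>
    <jaxb:bindings node="xs:simpleType[@name='TypeApprTranspDangerGoodsInd']">
      <jaxb:typesafeEnumClass map="false"/>
    </jaxb:bindings>
  </jaxb:bindings>
</jaxb:bindings>
```

This trades type safety for less generated code: the Y/N restriction is then enforced only by schema validation, not by the Java type.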

XSD custom type with attribute and restriction

I am developing an XSD document to validate XML Import files. Nearly all elements of the import file 'can' have an ID attribute (UPDATE). The UPDATE attribute must be limited to 4 possible values, so I have this pre-set type to use for the attribute restriction...
<xs:simpleType name="MyUpDir">
<xs:restriction base="xs:string">
<xs:enumeration value="OVERWRITE"/>
<xs:enumeration value="ADDONLY" />
<xs:enumeration value="NOERASE" />
<xs:enumeration value="IGNORE" />
</xs:restriction>
</xs:simpleType>
In addition to the attribute restrictions, each element's value is limited by one of a variety of pre-set custom types.
Example:
<xs:simpleType name="MyChar50">
<xs:restriction base="xs:string">
<xs:maxLength value="50" />
</xs:restriction>
</xs:simpleType>
To combine the two, I know I can do it in-line for each element as follows:
<xs:element name="FullName">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="MyChar50">
<xs:attribute name="UPDATE" type="MyUpDir" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
The problem is that the import file has over 1000 elements, each with varying length/regex/precision restrictions (roughly 20 custom types) and each potentially carrying the UPDATE attribute. Without the UPDATE attribute, I could declare each element on a single line using the custom types, greatly reducing the 'content' portion of the XSD. But from what I've read, it appears that to accommodate both the custom types and the optional attribute, I'm forced to use the expanded form (the last example) instead of keeping a single line per element. Is there a way to minimize this further by creating a custom type that combines the two?
I would think that you could define 20 more custom types (for a total of 40) and then use the appropriate one, with or without the attribute. In your case:
<xs:complexType name="MyChar50Attr"><!-- This one has attributes -->
<xs:simpleContent>
<xs:extension base="MyChar50">
<xs:attribute name="UPDATE" type="MyUpDir"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="FullName" type="MyChar50Attr"/>
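If the attribute set may grow later, the named complex types can share it through an attribute group, so the UPDATE declaration lives in one place. This is a sketch under the assumption that the schema has no target namespace, as in the snippets above; the names UpdateAttrs and NickName are made up for illustration.

```xml
<!-- Declared once; referenced by every "with attribute" type. -->
<xs:attributeGroup name="UpdateAttrs">
  <xs:attribute name="UPDATE" type="MyUpDir"/>
</xs:attributeGroup>

<xs:complexType name="MyChar50Attr">
  <xs:simpleContent>
    <xs:extension base="MyChar50">
      <xs:attributeGroup ref="UpdateAttrs"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<!-- Each element stays on a single line. -->
<xs:element name="FullName" type="MyChar50Attr"/>
<xs:element name="NickName" type="MyChar50Attr"/>
```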

XSD: restrict attribute to xs:float or ""

I'm trying to define an element type in XSD for which I want an optional attribute that, if present, can either contain a float or be empty (but still present).
i.e:
<xs:element name="MyElement">
<xs:complexType>
<xs:attribute name="optionalFloatAttribute" type="xs:float" use="optional"/>
</xs:complexType>
</xs:element>
This needs "fixing" to allow all of the following XML:
<MyElement/>
or
<MyElement optionalFloatAttribute=""/>
or
<MyElement optionalFloatAttribute="3.14159"/>
The only way I can see of doing this is to change the type to xs:string and use xs:restriction with a regular expression. But this doesn't seem ideal to me. Is there a better way?
I have to be able to support all of these variations of the XML: the program and the existing XML are legacy, and I am trying to back-create a schema to match the myriad variations I see in what we have to regard as valid XML.
You can define a custom type for that by combining float and the empty string:
<xs:element name="MyElement">
<xs:complexType>
<xs:attribute name="optionalFloatAttribute" type="emptyFloat" use="optional"/>
</xs:complexType>
</xs:element>
<xs:simpleType name="emptyFloat">
<xs:union>
<xs:simpleType>
<xs:restriction base='xs:string'>
<xs:length value="0"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType>
<xs:restriction base='xs:float'>
</xs:restriction>
</xs:simpleType>
</xs:union>
</xs:simpleType>
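Or using a regular expression (note that this pattern also accepts strings such as "-" or ".", and rejects valid xs:float forms such as 1e5):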
<xs:simpleType name="emptyFloat">
<xs:restriction base="xs:string">
<xs:pattern value="-?\d*\.?\d*"/>
</xs:restriction>
</xs:simpleType>
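When several attributes need the same "value or empty" treatment, the union can be written more compactly with a named empty type and the memberTypes attribute. The sketch below assumes a schema without a target namespace; the names emptyString and emptyDate are hypothetical and only show how the pattern repeats.

```xml
<!-- Reusable type whose only valid value is the empty string. -->
<xs:simpleType name="emptyString">
  <xs:restriction base="xs:string">
    <xs:length value="0"/>
  </xs:restriction>
</xs:simpleType>

<!-- A float, or nothing at all. -->
<xs:simpleType name="emptyFloat">
  <xs:union memberTypes="xs:float emptyString"/>
</xs:simpleType>

<!-- Hypothetical second use of the same idea. -->
<xs:simpleType name="emptyDate">
  <xs:union memberTypes="xs:date emptyString"/>
</xs:simpleType>
```

Unlike the regular expression, the union keeps the full xs:float lexical space, including exponent notation.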
If you can use an element rather than an attribute, you can make the xs:float element nillable. You can then use xsi:nil="true" in your instance document to indicate that the element has no value:
<!-- definition -->
<xs:element name="quantity" type="xs:float" nillable="true" />
<!-- instance -->
<quantity xsi:nil="true" />
There is no equivalent for attributes, though.
I don't think there's a way to handle this and still use xs:float. Fundamentally, it comes down to the fact that the empty string isn't a valid number. You'd normally expect either a value of 0 or the element to be missing altogether. There's a good explanation in the answer to the following question:
Empty elements for primitve datatypes forbidden in XSD
It seems that the option of using xs:string and a regexp might be your best plan.

Xsd, Validate empty or minimum length string

Currently I have an Xsd validating with this rule
<xs:simpleType name='shipTo'>
<xs:restriction base='xs:string'>
<xs:minLength value='6'/>
</xs:restriction>
</xs:simpleType>
I need to allow blanks as well, but if a value is entered, its minimum length should still be 6.
Can I do this without resorting to this xs:pattern and regex?
<xs:simpleType name='shipTo'>
<xs:restriction base='xs:string'>
<xs:pattern value='(\w{6,})?'/>
</xs:restriction>
</xs:simpleType>
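The regex approach will work (bearing in mind that, per the XSD specification, patterns are implicitly anchored and do not support ^, $ or (?:...) groups), but you should really make the element that you will be assigning shipTo to optional, and not include it in the XML file if it has no value.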
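If the element has to stay present, the pattern can also be avoided with a union of an empty type and the original minLength restriction. This is a sketch that reads "blanks" as the empty string; a value consisting only of spaces would still need at least six characters to pass.

```xml
<xs:simpleType name="shipTo">
  <xs:union>
    <!-- Member 1: the empty string. -->
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:length value="0"/>
      </xs:restriction>
    </xs:simpleType>
    <!-- Member 2: the original rule, at least six characters. -->
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="6"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>
```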
