Xerces-J xsd:base64Binary lexical validation question

I've recently upgraded my project from Xerces-J 2.7.0 to Xerces-J 2.12.1 and I'm seeing a change in schema validation behaviour. I'm not entirely clear if my test is wrong or Xerces is.
Given this schema:
<?xml version='1.0'?>
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
<!-- Schema to test facets for the xsd:base64Binary datatype. -->
<xsd:element name="facetTest" type="FacetTestComplexType"/>
<xsd:complexType name="FacetTestComplexType">
<xsd:sequence>
<xsd:element name='enumeration' type='EnumerationType' minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<!-- ***** Enumeration ***** -->
<xsd:simpleType name='EnumerationType'>
<xsd:restriction base='xsd:base64Binary'>
<xsd:enumeration value='Ab1+'/>
<xsd:enumeration value='7 d Ec'/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>
And this instance document:
<facetTest>
<enumeration>7dEc</enumeration>
</facetTest>
With Xerces-J 2.7.0 that instance document was valid; with Xerces-J 2.12.1 it is now flagged as invalid.
I reviewed the base64Binary section of the XML Schema specification, and it has left me unclear on whether this should be valid (my code is right and Xerces-J is wrong) or vice versa. This is the passage that has thrown me:
Note that this grammar requires the number of non-whitespace characters in the lexical form to be a multiple of four, and for equals signs to appear only at the end of the lexical form; strings which do not meet these constraints are not legal lexical forms of base64Binary because they cannot successfully be decoded by base64 decoders.
Note: The above definition of the lexical space is more restrictive than that given in [RFC 2045] as regards whitespace -- this is not an issue in practice. Any string compatible with the RFC can occur in an element or attribute validated by this type, because the ·whiteSpace· facet of this type is fixed to collapse, which means that all leading and trailing whitespace will be stripped, and all internal whitespace collapsed to single space characters, before the above grammar is enforced.
According to the definition of enumeration, it restricts the value-space, not the lexical-space. In that case it seems the value-space appears to cover the original binary content. If that's the case, then the whitespace should be meaningless.
Any clarification on whether my code or Xerces is incorrect would be greatly appreciated.

I think your code is correct, and Xerces has started to behave incorrectly.
Although the base64 values in your enums look strange, they do conform to the grammar specified here: https://www.w3.org/TR/xmlschema-2/#base64Binary
This is what the XSD spec says about enumeration facets:
Validation Rule: enumeration valid:
A value in a ·value space· is facet-valid with respect to ·enumeration· if the value is one of the values specified in {value}
So I agree with your statement:
According to the definition of enumeration, it restricts the value-space, not the lexical-space. In that case it seems the value-space appears to cover the original binary content.
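To illustrate the value-space argument, here is a minimal Python sketch (not Xerces code). It assumes that mapping a lexical form to its value means dropping whitespace and decoding the remaining base64 characters; the helper name `base64_value` is hypothetical. Under that assumption, the enumeration literal and the instance content denote the same octets.

```python
import base64

def base64_value(lexical: str) -> bytes:
    """Map a base64Binary lexical form to its binary value (assumed: whitespace is ignored)."""
    compact = "".join(lexical.split())  # remove all whitespace between base64 characters
    return base64.b64decode(compact, validate=True)

enum_value = base64_value("7 d Ec")    # literal from the xsd:enumeration facet
instance_value = base64_value("7dEc")  # content of the <enumeration> element

# Compare in the value space (bytes), not in the lexical space (strings).
print(enum_value == instance_value)
```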

Is a positiveInteger restriction with maxInclusive 9999999999 technically valid?

I'm working with a web service from an external company, which has defined the following restriction to an element in their wsdl:
<xs:simpleType>
<xs:restriction base="xs:positiveInteger">
<xs:minInclusive value="1"/>
<xs:maxInclusive value="9999999999"/>
</xs:restriction>
</xs:simpleType>
When converting this restriction into a class, I created a property of type UInt32, but this data type only allows numbers up to 4294967295, far lower than the maxInclusive defined in the restriction.
Is this kind of restriction technically and logically valid for a schema, or is it wrong and the external company should change the base type to a bigger one?
Thanks in advance.
The restriction is fine. Have a look at the W3C standard.
[Definition:] positiveInteger is ·derived· from nonNegativeInteger by setting the value of ·minInclusive· to be 1. This results in the standard mathematical concept of the positive integer numbers. The ·value space· of positiveInteger is the infinite set {1,2,...}. The ·base type· of positiveInteger is nonNegativeInteger.
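They probably intended this value to be an xs:unsignedInt or xs:unsignedLong, but technically it's correct. Note that 9999999999 exceeds the xs:unsignedInt maximum of 4294967295, so of those two only xs:unsignedLong could hold the full range.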
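The arithmetic behind the question can be checked with a short sketch. The constant and function names are illustrative only; the limits are the standard unsigned 32-bit and 64-bit maxima.

```python
UINT32_MAX = 2**32 - 1  # 4294967295, the limit mentioned in the question
UINT64_MAX = 2**64 - 1  # upper bound of an unsigned 64-bit integer
MAX_INCLUSIVE = 9_999_999_999  # maxInclusive facet from the WSDL

def fits(value: int, type_max: int) -> bool:
    """Return True if value lies in the range 0..type_max."""
    return 0 <= value <= type_max

print(fits(MAX_INCLUSIVE, UINT32_MAX))  # does the facet maximum fit in 32 bits?
print(fits(MAX_INCLUSIVE, UINT64_MAX))  # does it fit in 64 bits?
```

A 64-bit property type on the client side would therefore be needed to cover every value the schema allows.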

XML encoding of Attribute in KMIP

I'm analyzing KMIP to implement a prototype in Scala, so I am trying to understand all the concepts needed to build an architecture that supports different encoding profiles (bytes, JSON, XML).
In specification section 5.4.1.6 XML Element Encoding, it stipulates :
[...] structure values are encoded as nested xml elements, and non-structure
values are encoded using the ‘value’ attribute
With this example :
<ActivationDate type="DateTime" value="2001-01-01T10:00:00+10:00"/>
I don't understand this syntax since Activation Date is an attribute. In section 2.1.1 Attribute an attribute is described with a structure containing Attribute Name, Attribute Index, Attribute Value.
The XML representation of an ActivationDate or other attributes should be :
<Attribute>
<AttributeName type="TextString" value="Activation Date"/>
<AttributeValue type="DateTime" value="2001-01-01T10:00:00+10:00"/>
</Attribute>
Moreover, the KMIP test case uses this second representation.
If the first representation is shown as an example, it must be used somewhere. In which cases does it apply?
The KMIP specification is very vague on this point. BOTH forms of Attribute you described are considered valid KMIP and should be handled.
I strongly recommend the KMIP Additional Message Encodings document when implementing HTTP/JSON/XML encoding: https://docs.oasis-open.org/kmip/kmip-addtl-msg-enc/v1.0/os/kmip-addtl-msg-enc-v1.0-os.html
Section 6.1.6 describes yet another format that isn't covered in the main spec: <TTLV tag="0x420001" name="ActivationDate" type="DateTime" value="2001-01-01T10:00:00+10:00"/>
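Since both attribute forms from the question have to be accepted, a parser can normalize them to one internal shape. The following is only a sketch under stated assumptions: `normalize_attribute` is a hypothetical helper, and matching "Activation Date" to the tag name ActivationDate by removing spaces is an assumption of this example, not something quoted from the KMIP specification.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

def normalize_attribute(xml_text: str) -> Tuple[str, str, str]:
    """Return (name, type, value) for either XML form of an attribute."""
    elem = ET.fromstring(xml_text)
    if elem.tag == "Attribute":
        # Structure form: nested AttributeName / AttributeValue elements.
        name_elem = elem.find("AttributeName")
        value_elem = elem.find("AttributeValue")
        # Assumption: the tag-style name is the attribute name without spaces.
        name = name_elem.get("value").replace(" ", "")
        return name, value_elem.get("type"), value_elem.get("value")
    # Short form: the element name itself is the attribute name.
    return elem.tag, elem.get("type"), elem.get("value")

short_form = '<ActivationDate type="DateTime" value="2001-01-01T10:00:00+10:00"/>'
long_form = (
    "<Attribute>"
    '<AttributeName type="TextString" value="Activation Date"/>'
    '<AttributeValue type="DateTime" value="2001-01-01T10:00:00+10:00"/>'
    "</Attribute>"
)

print(normalize_attribute(short_form) == normalize_attribute(long_form))
```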

JAXB messing up encoding in Mule flow

I'm running a flow in Mule CE and have huge problems with encodings. No matter what I do, my files end up with garbled non-English characters.
Before the jaxb-object-to-xml transformer my payload looks fine in the console and in the debugger, but after that it's all garbled.
......
<http:request>
<object-to-byte-array-transformer encoding="UTF-8" doc:name="Object to Byte Array"/>
<object-to-string-transformer doc:name="String" encoding="UTF-8"/>
<json:json-to-object-transformer returnClass="java.util.List" doc:name="JSON2ObjectList" encoding="UTF-8"/>
<collection-splitter doc:name="Collection Splitter"/>
<choice doc:name="Choice">
<when expression="....">
<custom-transformer returnClass="se.system.Order.SalesHeader" class="se.system.Transformer.Map2Order" doc:name="Map2Order" mimeType="application/xml" encoding="UTF-8"/>
<mulexml:jaxb-object-to-xml-transformer name="orderMarshaller" jaxbContext-ref="JAXB_Context" doc:name="orderMarshaller" mimeType="text/xml" encoding="UTF-8"/>
<object-to-string-transformer doc:name="XML2String" encoding="UTF-8"/>
<set-variable variableName="fileName" value="order-#[function:dateStamp].xml" doc:name="fileName" encoding="UTF-8"/>
<file:outbound-endpoint path="${file.ToOrder}" responseTimeout="10000" doc:name="File" outputPattern="#[fileName]" mimeType="text/xml" encoding="UTF-8"/>
After the JAXB transformer, non-English characters look like this:
Deliveryinfo2="å ä ö Å Ä Ö & % è É"/
And the 010 editor claims it's ANSI DOS (with garbled characters; I don't know whether that can be trusted, though).
Have I missed something in the jaxb transformer? or somewhere else?
Is it possible to replace it with a Java component, initiate my very own JAXB context, get a marshaller and handle it myself?
No clues anymore...
Regards
EDIT: this one can handle non-english characters
<mulexml:object-to-xml-transformer doc:name="Object to XML" encoding="UTF-8" />
but it cannot handle GregorianCalendar types or my main object's list of other objects, so it's not an alternative.
This seems to be a bug caused by the JAXB transformer not respecting the given encoding, see source (line 64).
What is odd, however, is that according to the JAXB documentation the default encoding should be UTF-8.
Encoding
By default, the Marshaller will use UTF-8 encoding when generating XML data to a java.io.OutputStream, or a java.io.Writer. Use the setProperty API to change the output encoding used during these marshal operations. Client applications are expected to supply a valid character encoding name as defined in the W3C XML 1.0 Recommendation and supported by your Java Platform.
The transformer should probably do something like this:
final Marshaller m = jaxbContext.createMarshaller();
m.setProperty(Marshaller.JAXB_ENCODING, encoding);
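For readers unfamiliar with this kind of corruption, the sketch below shows one common way it arises: text is written as UTF-8 bytes and later read back with a different single-byte encoding. This is an illustration only, not Mule or JAXB code, and the choice of cp1252 as the "wrong" encoding is an assumption.

```python
original = "å ä ö"

# Writer side: the text is serialized as UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# Reader side (assumed misconfiguration): the same bytes are decoded as cp1252.
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# Reversing the wrong step restores the text, which confirms the cause.
restored = garbled.encode("cp1252").decode("utf-8")
print(restored == original)
```

If a file shows two characters for every accented letter, a mismatch of this kind between the writing and the reading side is a likely cause.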

How to not let gSoap add "USCORE" after each underscore in a field's name?

I'm creating a web service with gSoap, passing an existing WSDL and the necessary schemas as arguments to the wsdl2h command.
I have in my schema the element i_ID declared this way :
<xs:element minOccurs="0" name="i_ID" nillable="true" type="xs:string" />
But gSoap renames the element to i_USCOREID:
/// Element i_ID of type xs:string.
char* i_USCOREID
And I noticed the same thing happens to every field name, after each _.
Does anyone know how to fix this? It reduces readability, and I'm not allowed to change the .XSD file. Maybe I should add an option to the wsdl2h command?
Thank you!
Use the "-_" flag when running wsdl2h:
"-_ don't generate _USCORE (replace with UNICODE _x005f)"

Empty elements for primitive datatypes forbidden in XSD

I encountered a parsing error with Apache CXF while processing a webservice response. What it comes down to is an empty element being returned:
<myValue />
The element definition is as follows:
<xsd:element name="myValue" type="xsd:float" minOccurs="0"/>
Now I've read on the CXF mailing list that an empty value is not allowed by the XSD-spec:
Well, there isn't a workaround for
this as it's not a bug. An empty
element is not valid for any Decimal
or Date type or anything like that.
Thus, it SHOULD throw an exception.
What are you expecting it to do?
Now here comes the question: Where exactly can I find this constraint in the XML Schema specification?
Where exactly can I find this constraint in the XML Schema specification?
http://www.w3.org/TR/xmlschema-2/#float-lexical-representation
float values have a lexical
representation consisting of a
mantissa followed, optionally, by the
character "E" or "e", followed by an
exponent.
...
The representations for exponent and
mantissa must follow the lexical rules
for integer and decimal.
...
The special values positive and
negative infinity and not-a-number
have lexical representations INF, -INF
and NaN, respectively.
So xs:float requires at least a mantissa that is a xs:decimal...
decimal has a lexical representation
consisting of a finite-length sequence
of decimal digits (#x30-#x39)
separated by a period as a decimal
indicator. An optional leading sign is
allowed.
...and an empty string is not a valid xs:decimal.
If you don't have a value for this element, you should try not including this element, if possible. Your schema seems to allow omitting this element because minOccurs has value 0. Other solution would be to insert a suitable replacement value, like 0 or NaN.
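The lexical rules quoted above can be approximated with a regular expression. This is a rough sketch of the xs:float lexical space, not the normative grammar; `FLOAT_LEXICAL` and `is_float_lexical` are names made up for this example. Surrounding whitespace is stripped first, as an approximation of whitespace collapsing.

```python
import re

# Approximation: optional sign, mantissa with optional fraction, optional exponent,
# or one of the special values INF, -INF, NaN.
FLOAT_LEXICAL = re.compile(
    r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[Ee][+-]?[0-9]+)?|-?INF|NaN"
)

def is_float_lexical(text: str) -> bool:
    """Return True if text looks like a valid xs:float lexical form."""
    return FLOAT_LEXICAL.fullmatch(text.strip()) is not None

for candidate in ["", "0", "1.5E3", "NaN", "-INF"]:
    print(repr(candidate), is_float_lexical(candidate))
```

The empty string has no mantissa, so it fails the check, which matches the parsing error reported for <myValue />.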
This is not a definitive constraint.
You should be able to change your xsd to
<xsd:element name="myValue" type="xsd:float" minOccurs="0" default="0" />
And then be able to supply an empty element for your float without causing your xml to be invalid.
The above example means that if the element is empty, its value is 0. Beware: the default attribute does not apply to missing elements. Missing elements are just missing, whether they have a declared default or not.
http://www.w3.org/TR/xmlschema-0/#OccurrenceConstraints
if the element appears without any content, the schema processor provides the element with a value equal to that of the default attribute. However, if the element does not appear in the instance document, the schema processor does not provide the element at all.
I have not used this myself so far, but to guard against misreading the W3C specs, I checked with an online validator that an XML document with an empty xs:float element having a default was accepted (at least by this online validator: http://www.freeformatter.com/xml-validator-xsd.html ).
