xml scheme attribute or complextype conditional - xsd

I would like to allow for an element to either contain an attribute OR define a more complex type.
Something like
<myElement someAttr="..."/>
or
<myElement>
<...>
</myElement>
That is, if someAttr exists then I do not want to allow sub elements and if it doesn't then I want to.
The reason for this is I want to have an "include" feature where I include a file which is essentially inserted into the element. But I don't want both. You can either include additional external xml code into the element or add your own BUT not both. (or also to have it inserted from a separate part of the xml)
This is mainly for simplifying a complex xml so that the structure is easily understood.

I doubt you will be able to express something like that in XML schema at this point.
You can make an attribute optional, e.g. it can be present or not. But you cannot express something like if the attribute is not present, then include other complex content with the current means.
You'll have to either check this programmatically yourself, or maybe investigate if other XML description languages like RelaxNG or Schematron might be able to help.

Perhaps with a static choice and you change the myElement name ?
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:choice>
<xs:element name="myElementWithAttrs">
<xs:complexType>
<xs:attribute name="someAttrs" type="xs:string"/>
</xs:complexType>
</xs:element>
<xs:element name="myElementWithoutAttrs">
<xs:complexType>
<xs:sequence>
<xs:any processContents="skip"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>

Related

Is it possible to define a XSD element with an arbitrary name but with specific attributes

I would like to validate a custom XML document against a schema.
This document would include a structure with any number of elements, each having a specific attribute. Something like this:
<Root xmlns="http://tns">
<Records>
<FirstRecord attr='whatever'>content for first record</FirstRecord>
<SecondRecord attr='whatever'>content for first record</SecondRecord>
...
<LastRecord attr='whatever'>content for first record</LastRecord>
</Records>
</Root>
The author of the XML document can include any number of records, each with an arbitrary name of his or her choosing. How is this possible to validate this against an XML Schema ?
I have tried to specify the appropriate structure type in a schema, but I do not know how to reference it in the appropriate location:
<xs:schema xmlns="http://tns" targetNamespace="http://tns" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="RecordType"> <!-- This is my record type -->
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="attr" type="xs:string" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="1" name="Records">
<!-- This is where records should go -->
<xs:complexType />
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
What you describe is possible in XSD 1.1, and something very similar is possible in XSD 1.0, which is not to say it's an advisable design.
In XML vocabularies, the element type normally conveys relevant information about the type of information, and it is the name of the element type that is used to drive validation in most XML schema languages; the design you describe is (some would say) a little bit like asking if I can define an object class in Java, or a struct in C, which obeys the constraints that the members can have arbitrary names, as long as one of them is an integer with the value 42. That, or something like it, may well be possible, but most experienced designers will feel strongly that this is almost certainly not the right way to go about solving any normal problem.
On the other hand, doing unusual and awkward things with a system can sometimes help in learning how to use the system effectively. (You never know a system well, said a friend of mine once, until you have thoroughly abused it.) So my answer has two parts: how to come as close as possible to the design you specify in XSD, and alternatives you might consider instead.
The simplest way to specify the language you seem to want in XSD 1.1 is to define an assertion on the Records element which says (1) that every child of Records has an 'attr' attribute and (2) that no child of Records has any children. You'll have something like this:
...
<xs:element minOccurs="1" maxOccurs="1" name="Records">
<xs:complexType>
<xs:sequence>
<xs:any/>
</xs:sequence>
<xs:assert
test="every $child in * satisfies $child/#attr"/>
<xs:assert
test="not(*/*)"/>
</xs:complexType>
</xs:element>
...
As you can see, this is very similar to what InfantPro'Aravind' has described; it avoids the problems identified by InfantPro'Aravind' by using assertion, not type assignment, to impose the constraints you impose.
In XSD 1.0, assertion is not available, and the only way I can think of to come close to the design you describe is define an abstract element, which I'll call Record, as the child of Records, and to require that the elements which actually occur as children of Records be declared as being substitutable for this abstract type (which in turn requires that their types be derived from type RecordType). Your schema might say something like this:
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element name="Records">
<xs:complexType>
<xs:sequence>
<xs:element name="Record"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="RecordType">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="attr"
type="xs:string"
use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Record"
type="RecordType"
abstract="true"/>
Elsewhere in the schema (possibly in a separate schema document) you will need to declare FirstRecord, etc., and specify that they are substitutable for Record, thus:
<xs:element name="FirstRecord" substitutionGroup="Record"/>
<xs:element name="SecondRecord" substitutionGroup="Record"/>
<xs:element name="ThirdRecord" substitutionGroup="Record"/>
...
At some level, this matches your description, though I suspect you did not want to have to declare FirstRecord, SecondRecord, etc.
Having described ways in which XSD can do what you describe, I should also say that I wouldn't recommend either of these approaches. Instead, I'd design the XML vocabulary differently to work more naturally with XSD.
In the design as you specify it, every record appears to have the same type, but in addition to the content of the element they are allowed to convey a certain additional quantity of information by having a different name (FirstRecord, SecondRecord, etc.). This additional information could just as easily be conveyed in an attribute, which would allow you to specify Record as a concrete element, rather than an abstract element, giving it an extra "alternate-name" attribute. Your sample data would then take a form like this:
<Root xmlns="http://tns">
<Records>
<Record
alternate-name="FirstRecord"
attr='whatever'>content for first record</Record>
<Record
alternate-name="SecondRecord"
attr='whatever'>content for first record</Record>
...
<Record
alternate-name="LastRecord"
attr='whatever'>content for first record</Record>
</Records>
</Root>
This will be more or less acceptable depending on whether you or your data providers or tools in your tool chain attach some mystic or other significance to having the string "FirstRecord" be an element type name instead of an attribute value.
Alternatively, one could say that the point of the design is to allow Records to contain an arbitrary sequence of elements of arbitrary structure (on this account, the restriction to xs:string is just an artifact of your example and is not really desired in reality) as long as we have, for each record, the information recorded in the 'attr' attribute. Easy enough to specify this: define 'Record' as a concrete element with an 'attr' attribute, accepting one child which can be any XML element:
<xs:element name="Record">
<xs:complexType>
<xs:sequence>
<xs:any processContents="lax"/>
</xs:sequence>
<xs:attribute name="attr"
use="required"
type="xs:string"/>
</xs:complexType>
</xs:element>
The value of the 'processContents' attribute can be changed to 'strict' or 'skip' or kept at 'lax', depending on whether you want FirstRecord, SecondRecord, etc. to be validated (and declared) or not.
I guess this is not possible with XSD alone!
When you say
any number of records, each with an arbitrary name of his or her choosing.
That forces us to use <xs:any/> element but! Having an element declared as any doesn't allow you to validate attributes under it..
So .. the answer is NO!

In an XSD file, Is it legal to use two Order Indicators for a given element?

I am creating an XSD for the following XML structure:
<BaseNode>
<ParentNode1>
<childnode/>
</ParentNode1>
<ParentNode2>
<childnode/>
</ParentNode2>
<ParentNodeA>
<childnode/>
</ParentNodeA>
<ParentNodeB>
<childnode/>
</ParentNodeB>
</BaseNode>
Where: ParentNodes 1 and 2 must appear and in order, and A and B are optional (and will only appear once each, if present), but must appear after 1 and 2 if present.
What I 'think' will work is the following, but is it valid? (specifically, the presence of both, sequence and all Order Indicators)
<xs:element name="BaseNode">
<xs:complexType>
<xs:sequence>
<xs:element name="ParentNode1">
....
</xs:element>
<xs:element name="ParentNode2">
....
</xs:element>
</xs:sequence>
<xs:all>
<xs:element name="ParentNodeA">
....
</xs:element>
<xs:element name="ParentNodeB">
....
</xs:element>
</xs:all>
</xs:comlexType>
</xs:element>
I couldn't find any reference (in w3schools.com or elsewhere) to compound use of order indicators, and don't have a validator readily available.
Thank you in advance.
I found the answer at http://www.w3.org/TR/xmlschema-0/#groups
XML Schema stipulates that an all group must
appear as the sole child at the top of a content model.
example provided at the link.

Is mixed inherited when a complexType is extended?

I have the following in a schema:
<xs:element name="td">
<xs:complexType>
<xs:complexContent>
<xs:extension base="cell.type"/>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:complexType name="cell.type" mixed="true">
<xs:sequence minOccurs="0" maxOccurs="unbounded">
<xs:element ref="p"/>
</xs:sequence>
</xs:complexType>
Some parsers allow PCDATA directly in the element, while others don't. There's something in the XSD recommendation (3.4.2) that says when a complex type has complex content, and neither has a mixed attribute, the effective mixed is false. That means the only way mixed content could be in effect is if the extension of cell.type causes the mixed="true" to be inherited.
Could someone more familiar with schemas comment on the correct interpretation?
(BTW: if I had control of the schema I would move the mixed="true" to the element definition, but that's not my call.)
Anyone reading my question might want to read this thread also (by Damien). It seems my answer isn't entirely right: parsers/validators don't handle mixed attribute declarations on base/derived elements the same way.
Concerning extended complex types, sub-section 1.4.3.2.2.1 of section 3.4.6 in part 1 of W3C's XML Schema specification says that
Both [derived and base] {content type}s must be mixed or both must be element-only.
So yes, it is inherited (or more like you cannot overwrite it—same thing in the end).
Basically, what you've described is the desired (and as far as I'm concerned) the most logical behavior.
I've created a simple schema to run a little test with Eclipse's XML tools.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="c">
<xs:complexType>
<xs:complexContent mixed="false">
<xs:extension base="a"/>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:complexType name="a" mixed="true">
<xs:sequence minOccurs="0" maxOccurs="unbounded">
<xs:element name="b"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
The above schema is valid, in the sense that not Eclipse's nor W3C's "official" XML Schema validator notices any issues with it.
The following XML passes validation against the aforementioned schema.
<?xml version="1.0" encoding="UTF-8"?>
<c xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="test.xsd">
x
<b/>
y
</c>
So basically you cannot overwrite the mixedness of a complex base type. To support this statement further, try and swap the base and dervied types' mixedness. In that case the XML fails to validate, because the derived type won't be mixed as it (yet again) cannot overwrite the base's mixedness.
You've also said that
Some parsers allow PCDATA directly in the element, while others don't
It couldn't hurt to clarify which parsers are you talking about. A good parser shouldn't fail when it encounters mixed content. A validating parser, given the proper schema, will fail if it encounters mixed content when the schema does not allow it.

XSD. Different between xsd:element and xs:element?

I reading articles about XSD on w3schools and here many examples. For example this:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
But after I tried put this .xsd file in xjc - I see error log, dome like this:
The prefix "xs" for element "xs:schema" is not bound...
But all work correct when I change xs on xsd prefix.
So, can somebody, clarify for me what is different between xs and xsd?
Maybe, one prefix - it is old version and other for new version...
xs and xsd are XML prefixes used with qualified names; each prefix must be associated with a namespace. The association is done with an attribute that looks like xmlns:xs="...". xs and xsd are most common for XML Schema documents.
Should you choose s or ns1, it shouldn't make any difference to any tool for your scenario.
The error is not caused by your XML Schema file. I suspect there might be something else in your setup, maybe a custom binding file. Please check that or post additional information.

XSD Formatting <element><complexType> vs <complexType/><element/>

This XSD portion was obtained from: http://www.iana.org/assignments/xml-registry/schema/netconf.xsd
<xs:complexType name="rpcType">
<xs:sequence>
<xs:element ref="rpcOperation"/>
</xs:sequence>
<xs:attribute name="message-id" type="messageIdType" use="required"/>
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
<xs:element name="rpc" type="rpcType"/>
And is the core to function calls in NETCONF being the node of an XML document. I am curious as to why it is not something like:
<xs:element name="rpcType">
<xs:complexType>
<xs:sequence>
<xs:element ref="rpcOperation"/>
</xs:sequence>
<xs:attribute name="message-id" type="messageIdType" use="required"/>
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
</xs:element>
The reasoning is that in #1 when trying to marshall a bean (in jaxb2) I get the exception:
[com.sun.istack.SAXException2: unable to marshal type "netconf.RpcType" as an element because it is missing an #XmlRootElement annotation]
I have been reading this article over and over again, and really cant get a hold of the difference, and why it would be #1 vs #2...
It's not obvious, I'll grant you. It comes down to the type vs element decision.
When you have something like
<xs:element name="rpcType">
<xs:complexType>
This is essentially an "anonymous type", and is a type which can never occur anywhere other than inside the element rpcType. Because of this certainty, XJC knows that that type will always have the name rpcType, and so generates an #XmlRootElement annotation for it, with the rpcType name.
On the other hand, when you have
<xs:complexType name="rpcType">
then this defines a re-usable type which could potentially be referred to by several different elements. The fact that in your schema it is only referred to by one element is irrelevant. Because of this uncertainty, XJC hedges its bets and does not generate an #XmlRootElement.
The JAXB Reference Implementation has a proprietary XJC flag called "simple binding mode" which, among other things, assumes that the schema you're compiling will never be extended or combined with another. This allows it to make certain assumptions, so if it sees a named complexType only being used by one element, then it will often generate #XmlRootElement for it.
The reality is rather more subtle and complex than that, but in 90% of cases, this is a sufficient explanation.
Quite an involved question. There are many reasons to design schemas using types rather than elements (this approach is called the "venetian blind" approach versus "salami slice" for using global elements). One of the reasons is that types can be sub-typed, and another that it may be useful to only have elements global that can be root elements.
See this article for some more details on the schema side.
Now, as for the JAXB question in particular. The problem is that you created a class corresponding to a type and tried to serialise it. That means JAXB knows its content model, but not what the element name should be. You need to attach your RpcType to an element (JAXBElement), for example:
marshaller.marshal(new ObjectFactory().createRpc(myRpcType));
The ObjectFactory was placed into the package created by JAXB for you.
Advantages of Named Types
The advantage of a schema using global/named types is that child/sub types can be created that extend the parent type.
<xs:complexType name="rpcType">
<xs:sequence>
<xs:element ref="rpcOperation"/>
</xs:sequence>
<xs:attribute name="message-id" type="messageIdType" use="required"/>
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
<xs:element name="rpc" type="rpcType"/>
The above fragment would allow the following child type to be created:
<xs:complexType name="myRPCType">
<xs:complexContent>
<xs:extension base="rpcType">
<xs:sequence>
<xs:element name="childProperty" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Impact on JAXB
Another aspect of named types is that they may be used by multiple elements:
<xs:element name="FOO" type="rpcType"/>
<xs:element name="BAR" type="rpcType"/>
This means that the schema to Java compiler cannot simply just pick one of the possible elements to be the #XmlRootElement for the class corresponding to "rpcType".

Resources