Reuse attributes on many elements in XSD - xsd

Is there a way in XML Schema to define/name a set of attributes that can be reused on many element definitions without cutting/pasting the whole attribute definition into each element definition?
For example, if I want three attributes on 20 different elements in my xsd, can I name these as a group somehow once and reference them without having to have the same full definition repeated as part of all 20 elements?
<xs:attribute name="lang" type="xs:string" use="required"/>
<xs:attribute name="page" type="xs:string"/>
<xs:attribute name="alphabet" type="xs:string"/>

Yes there is 'xs:attributeGroup' for this purpose.

Related

How to rename a attribute from a inherited type in XSD?

<xs:complexType name="country">
...
<xs:attribute name="zipcode" type="xs:string"/>
...
</xs:complexType>
<xs:complexType name="city">
<xs:complexContent>
<xs:extension base="country">
<xs:attributeRename from="zipcode" to="areaCode"/><!-- How to rename? -->
</xs:extension>
</xs:complexContent>
</xs:complexType>
My "city" type is derived from "country". I want the attribute be renamed after the derivation. So my 2 XML objects will both be valid. Of course, I know I could just not use inheritance. But there are 10+ other attributes they share, and I don't want any duplication.
<country areacode="123"></country>
<city zipcode="456"></city>
If you look at it from a context point of view, a city shouldn't inherit a country.
They could however both inherit a location type, which has all shared attributes and fields, and city and country could extend location for their own uses.
As far as I know, there is not way to rename or remove an attribute or a field from a type which you extend.

Is it possible to define a XSD element with an arbitrary name but with specific attributes

I would like to validate a custom XML document against a schema.
This document would include a structure with any number of elements, each having a specific attribute. Something like this:
<Root xmlns="http://tns">
<Records>
<FirstRecord attr='whatever'>content for first record</FirstRecord>
<SecondRecord attr='whatever'>content for first record</SecondRecord>
...
<LastRecord attr='whatever'>content for first record</LastRecord>
</Records>
</Root>
The author of the XML document can include any number of records, each with an arbitrary name of his or her choosing. How is this possible to validate this against an XML Schema ?
I have tried to specify the appropriate structure type in a schema, but I do not know how to reference it in the appropriate location:
<xs:schema xmlns="http://tns" targetNamespace="http://tns" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="RecordType"> <!-- This is my record type -->
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="attr" type="xs:string" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="1" name="Records">
<!-- This is where records should go -->
<xs:complexType />
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
What you describe is possible in XSD 1.1, and something very similar is possible in XSD 1.0, which is not to say it's an advisable design.
In XML vocabularies, the element type normally conveys relevant information about the type of information, and it is the name of the element type that is used to drive validation in most XML schema languages; the design you describe is (some would say) a little bit like asking if I can define an object class in Java, or a struct in C, which obeys the constraints that the members can have arbitrary names, as long as one of them is an integer with the value 42. That, or something like it, may well be possible, but most experienced designers will feel strongly that this is almost certainly not the right way to go about solving any normal problem.
On the other hand, doing unusual and awkward things with a system can sometimes help in learning how to use the system effectively. (You never know a system well, said a friend of mine once, until you have thoroughly abused it.) So my answer has two parts: how to come as close as possible to the design you specify in XSD, and alternatives you might consider instead.
The simplest way to specify the language you seem to want in XSD 1.1 is to define an assertion on the Records element which says (1) that every child of Records has an 'attr' attribute and (2) that no child of Records has any children. You'll have something like this:
...
<xs:element minOccurs="1" maxOccurs="1" name="Records">
<xs:complexType>
<xs:sequence>
<xs:any/>
</xs:sequence>
<xs:assert
test="every $child in * satisfies $child/#attr"/>
<xs:assert
test="not(*/*)"/>
</xs:complexType>
</xs:element>
...
As you can see, this is very similar to what InfantPro'Aravind' has described; it avoids the problems identified by InfantPro'Aravind' by using assertion, not type assignment, to impose the constraints you impose.
In XSD 1.0, assertion is not available, and the only way I can think of to come close to the design you describe is define an abstract element, which I'll call Record, as the child of Records, and to require that the elements which actually occur as children of Records be declared as being substitutable for this abstract type (which in turn requires that their types be derived from type RecordType). Your schema might say something like this:
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element name="Records">
<xs:complexType>
<xs:sequence>
<xs:element name="Record"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="RecordType">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="attr"
type="xs:string"
use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Record"
type="RecordType"
abstract="true"/>
Elsewhere in the schema (possibly in a separate schema document) you will need to declare FirstRecord, etc., and specify that they are substitutable for Record, thus:
<xs:element name="FirstRecord" substitutionGroup="Record"/>
<xs:element name="SecondRecord" substitutionGroup="Record"/>
<xs:element name="ThirdRecord" substitutionGroup="Record"/>
...
At some level, this matches your description, though I suspect you did not want to have to declare FirstRecord, SecondRecord, etc.
Having described ways in which XSD can do what you describe, I should also say that I wouldn't recommend either of these approaches. Instead, I'd design the XML vocabulary differently to work more naturally with XSD.
In the design as you specify it, every record appears to have the same type, but in addition to the content of the element they are allowed to convey a certain additional quantity of information by having a different name (FirstRecord, SecondRecord, etc.). This additional information could just as easily be conveyed in an attribute, which would allow you to specify Record as a concrete element, rather than an abstract element, giving it an extra "alternate-name" attribute. Your sample data would then take a form like this:
<Root xmlns="http://tns">
<Records>
<Record
alternate-name="FirstRecord"
attr='whatever'>content for first record</Record>
<Record
alternate-name="SecondRecord"
attr='whatever'>content for first record</Record>
...
<Record
alternate-name="LastRecord"
attr='whatever'>content for first record</Record>
</Records>
</Root>
This will be more or less acceptable depending on whether you or your data providers or tools in your tool chain attach some mystic or other significance to having the string "FirstRecord" be an element type name instead of an attribute value.
Alternatively, one could say that the point of the design is to allow Records to contain an arbitrary sequence of elements of arbitrary structure (on this account, the restriction to xs:string is just an artifact of your example and is not really desired in reality) as long as we have, for each record, the information recorded in the 'attr' attribute. Easy enough to specify this: define 'Record' as a concrete element with an 'attr' attribute, accepting one child which can be any XML element:
<xs:element name="Record">
<xs:complexType>
<xs:sequence>
<xs:any processContents="lax"/>
</xs:sequence>
<xs:attribute name="attr"
use="required"
type="xs:string"/>
</xs:complexType>
</xs:element>
The value of the 'processContents' attribute can be changed to 'strict' or 'skip' or kept at 'lax', depending on whether you want FirstRecord, SecondRecord, etc. to be validated (and declared) or not.
I guess this is not possible with XSD alone!
When you say
any number of records, each with an arbitrary name of his or her choosing.
That forces us to use <xs:any/> element but! Having an element declared as any doesn't allow you to validate attributes under it..
So .. the answer is NO!

validating attribute values of xml elements

I am trying to write a xsd document that validates the following xml snippet.
<parentElement>
<someElement name="somethingRequired"/>
<someElement name="somethingElseRequired"/>
<someElement name="anything"/>
</parentElement>
It shall validate, if parentElement contains at least two occurences of someElement where one has an attribute name containing the value "somethingRequired" and the other has an attribute name containing the value "somethingElseRequired".
Is that possible?
Is that possible?
It depends on how specific your restrictions need to be. If it is good enough that all of the name attributes have unique values, then you can achieve this with <xs:unique>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="parentElement">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="2" maxOccurs="unbounded" name="someElement">
<xs:complexType>
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:unique name="uniqueName">
<xs:selector xpath="someElement" />
<xs:field xpath="#name" />
</xs:unique>
</xs:element>
</xs:schema>
You can also restrict the attribute values for example to some enumerated set of allowed values instead of just using type="xs:string". It is not possible though to restrict only the first two name attributes to be unique because the xpath attribute is not allowed to contain predicates.
If you need the first name attribute to have some specific value, the second one some other specific value and the rest to have any value, then I would say that this either violates the Schema Component Constraint Element Declarations Consistent or would need some sort of co-occurrence constraint that is more specific than <xs:unique> so generally it wouldn't be possible. You might be able to make it possible by using xsi:type attribute in the instance document to explicitly declare the type of the element.
Everything you want to do is possible except for validating based on the attribute values.

Schema Issue: Can define element type OR add element attribute, but not both. I want both!

I've inherited the task of creating a schema for some XML which already exists - and IMHO is not the best that could have been done. The section giving me problems is the element at the end of the 'scan-result' element.
The best I'm hoping for with regard to the data in the 'spectrum' element is to treat it as type="xs:string". I'll programatically divide up the numeric pairs that constitute the data in the string later. (Even though this step would not be needed had the data been properly structured in the first place.)
Here's a similar piece of XML data to what I have to work with...
<scan-result>
<spectrum-index>0</spectrum-index>
<scan-index>2</scan-index>
<time-stamp>5609</time-stamp>
<tic>55510</tic>
<start-mass>22.0</start-mass>
<stop-mass>71.0</stop-mass>
<spectrum count="5">30,11352;31,360;32,16634;45,1161;46,26003</spectrum>
</scan-result>
The problem is, I can't seem to get a working definition for the 'spectrum' element that has the 'count' attribute and allows me to define the 'spectrum' element type as "xs:string".
What I would like is something like the following:
<xs:complexType name="ctypScanResult">
<xs:sequence>
<xs:element name="spectrum-index" type="xs:integer"/>
<xs:element name="scan-index" type="xs:integer"/>
<xs:element name="time-stamp" type="xs:integer"/>
<xs:element name="tic" type="xs:integer"/>
<xs:element name="start-mass" type="xs:float"/>
<xs:element name="stop-mass" type="xs:float"/>
<xs:element name="spectrum" type="xs:string">
<xs:complexType>
<xs:attribute name="count" type="xs:integer"/>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="count" type="xs:integer"/>
</xs:complexType>
The problem is that I can define the type of the 'spectrum' element as "xs:string" XOR I can define the anonymous 'xs:complexType' in the 'spectrum' element, which allows me to insert the 'count' attribute. But I need to be able to express both.
Given that I'm kind of stuck with the XML as it was handed to me, is there a schema definition that will allow me to describe this data?
Sorry this is long, but thanks to any and all who respond,
AlarmTripper
Followup: I know why the error occurs...
Quoted from W3C:
3.3.3 Constraints on XML Representations of Element Declarations
Schema Representation Constraint: Element Declaration Representation OK
In addition to the conditions imposed on element information items by the schema for schemas: all of the following must be true:
1 default and fixed must not both be present.
2 If the item's parent is not , then all of the following must be true:
2.1 One of ref or name must be present, but not both.
2.2 If ref is present, then all of , , , , , nillable, default, fixed, form, block and type must be absent, i.e. only minOccurs, maxOccurs, id are allowed in addition to ref, along with .
3 type and either or are mutually exclusive.
4 The corresponding particle and/or element declarations must satisfy the conditions set out in Constraints on Element Declaration Schema Components (§3.3.6) and Constraints on Particle Schema Components (§3.9.6).
But I'm still in the same fix I was before... How can I actually accomplish something that resembles my goal?
Thanks,
AlarmTripper
Let a tool do it for you! Try xsd.exe.
Or, if you must define by hand, at least check your hand-written-definition with an automatically generated one.
Here's what XSD.exe gave me for your input. I trimmed out some MS-NS cruft.
<xs:element name="spectrum">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="count" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
You need to set the attribute mixed="true" on complexType:
<xs:element name="spectrum">
<xs:complexType mixed="true">
<xs:attribute name="count" type="xs:integer" />
</xs:complexType>
</xs:element>
EDIT: Okay, just read your comment, sorry. I believe the following should work instead:
<xs:element name="spectrum">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="count" type="xs:integer" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="spectrum" type="xs:string">
<xs:complexType>
<!-- ADD THIS NEXT LINE -->
<xs:complexContent mixed="true"/>
<xs:attribute name="count" type="xs:integer"/>
</xs:complexType>
</xs:element>

How to declare a non-string element as having optional content in XML Schema

I have seen XML Schema element with attributes containing only text but I have an element that's an xs:dateTime instead.
The document I'm trying to write a schema for looks like this:
<web-campaigns>
<web-campaign>
<id>1231</id>
<start-at nil="true"/>
</web-campaign>
<web-campaign>
<id>1232</id>
<start-at>2009-08-08T09:00:00Z</start-at>
</web-campaign>
</web-campaigns>
Sometimes the xs:dateTime element has content, sometimes it doesn't.
What I have so far (which doesn't validate yet) is:
<xs:element name="start-at">
<xs:complexType mixed="true">
<xs:simpleContent>
<xs:extension base="xs:dateTime">
<xs:attribute name="nil" default="false" type="xs:boolean" use="optional" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
If I replace xs:dateTime with xs:string, I can validate the document just fine, but I really want an xs:dateTime, to indicate to consumers what's in that element. I tried with/without mixed="true" as well, to no avail.
If it makes a difference, I validate using xmllint (on Mac OS X 10.5) and XML Schema Validator
you can define your own types as union of types.
1/ define the "empty" type as a string that only allows "" ähm nothing :)
<xs:simpleType name="empty">
<xs:restriction base="xs:string">
<xs:enumeration value=""/>
</xs:restriction>
</xs:simpleType>
2/ next define a type that allows date AND empty
<xs:simpleType name="empty-dateTime">
<xs:union memberTypes="xs:dateTime empty"/>
</xs:simpleType>
3/ declare all your nullable datetime elements as type="empty-dateTime"
You need
<xs:element name="start-at" minOccurs="0">
mixed-mode isn't relevant to your situation, you don't need that. By default, minOccurs="1", i.e. the element is mandatory.
With minOccurs="0", you either specify the element with content, or not at all. If you want to be able to permit <start-at/>, then you cannot use xs:dateTime.

Resources