validating attribute values of xml elements - attributes

I am trying to write a xsd document that validates the following xml snippet.
<parentElement>
<someElement name="somethingRequired"/>
<someElement name="somethingElseRequired"/>
<someElement name="anything"/>
</parentElement>
It shall validate, if parentElement contains at least two occurences of someElement where one has an attribute name containing the value "somethingRequired" and the other has an attribute name containing the value "somethingElseRequired".
Is that possible?

Is that possible?
It depends on how specific your restrictions need to be. If it is good enough that all of the name attributes have unique values, then you can achieve this with <xs:unique>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="parentElement">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="2" maxOccurs="unbounded" name="someElement">
<xs:complexType>
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:unique name="uniqueName">
<xs:selector xpath="someElement" />
<xs:field xpath="#name" />
</xs:unique>
</xs:element>
</xs:schema>
You can also restrict the attribute values for example to some enumerated set of allowed values instead of just using type="xs:string". It is not possible though to restrict only the first two name attributes to be unique because the xpath attribute is not allowed to contain predicates.
If you need the first name attribute to have some specific value, the second one some other specific value and the rest to have any value, then I would say that this either violates the Schema Component Constraint Element Declarations Consistent or would need some sort of co-occurrence constraint that is more specific than <xs:unique> so generally it wouldn't be possible. You might be able to make it possible by using xsi:type attribute in the instance document to explicitly declare the type of the element.

Everything you want to do is possible except for validating based on the attribute values.

Related

the attribute 'type' and a xs:complexType local declaration are mutually exclusive [duplicate]

I am very (very) new to XSD and XML in general so I have begun by generating my XSD schema using an online tool from my proposed XML document. Part of the XML is:
<code base="16">2A</code>
The XSD generated by the tool for this is:
<xs:element name="code" type="xs:int">
<xs:complexType>
<xs:attribute name="base" type="xs:int"></xs:attribute>
</xs:complexType>
</xs:element>
Which seems logical enough to me; there is an element named code that has values of type int, the element can have an attribute with a name of base who's value is also of type int.
Yet when I open this in Visual Studio, it complains about xs:element tag because:
The type attribute cannot be present with either simpleType or complexType
Which unfortunately also seems logical, if the type is defined in the element, why is there further definition of the type with the complexType tag?
Thus the crux of my problem; how to properly define an element that has a typed value and also a typed attribute, in XSD?
I have seen this question: The type attribute cannot be present with either simpleType or complexType but it was just different enough that I, with my n00b-level knowledge of XSD, did not understand the answer at all.
The suggested duplicate Error: The element has a type attribute as well as an anonymous child type is about attributes with nested elements... this questing is about an attribute on an element that is not nested but that has type on both attribute and element (if that's the same thing, my level of knowledge is too low to grasp it and so I still need further explanation)
The links you mention address the direct error by which you cannot specify both a xs:element/#type="xs:int" attribute and a xs:element/xs:complexType child element (anonymous type) concurrently.
What remains to explain is how you actually can achieve your goal to write an XSD that would validate your XML,
<code base="16">2A</code>
The following XSD will do so:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="code">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="base" type="xs:int"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
</xs:schema>

Why does the validation of keyref depend on the ordering of the key element?

My document contains A elements with IDs and B Elements which reference the As, like this:
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:\\\refissue.xsd">
<A id="x"/>
<A id="y"/>
<B><Aref idref="x" /></B>
</root>
When I validate against my simple schema (see below) I get the following error:
cvc-identity-constraint.4.3: Key 'ref' with value 'x' not found for identity constraint of element 'root'.
If I change the ordering of the A element to
<A id="y"/>
<A id="x"/>
the document validates without any errors.
Why does the validation result depend on the ordering of the elements?
Is this a bug in the validator or in my schema?
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="A">
<xs:complexType>
<xs:attribute name="id" type="xs:ID" />
</xs:complexType>
<xs:key name="A.KEY">
<xs:selector xpath="." />
<xs:field xpath="#id" />
</xs:key>
</xs:element>
<xs:element maxOccurs="unbounded" name="B">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="0" maxOccurs="1" name="Aref">
<xs:complexType>
<xs:attribute name="idref" type="xs:IDREF" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:keyref name="ref" refer="A.KEY">
<xs:selector xpath="B/Aref" />
<xs:field xpath="#idref" />
</xs:keyref>
</xs:element>
</xs:schema>
I tried the validation with Eclipse (which uses xerces, I think), xerces-c 3.1.1, xmlstarlet 1.5.0 and libxml2 2.7.8 and I get the error only with eclipse and xerces.
You're right, validity against an identity constraint should not depend on the order of elements in the input.
Here I think the problem is that the schema is not quite right, and Xerces is having trouble generating a useful diagnosis of the problem. (The fact that libxml doesn't report an error is just a consequence of its incomplete coverage of XSD.)
Your key constraint should be defined on the scope of the element within which the key values need to be unique -- so on the root element, not on the A element. (As defined, your A.KEY constraint requires that the string value of each A element be unique within that A element, which will always be the case. The fact that the id attribute is declared as being of type xs:ID does require uniqueness, of course. And similarly, the fact that the Aref idref attribute is declared as being of type xs:IDREF means that your key and keyref declarations are not actually doing much work here that's not already being done by ID and IDREF.)
Once you move the declaration of A.KEY to the declaration of the root element, Xerces and Saxon agree that the schema is OK and the document is valid.
I had a similar problem in Eclipse until the xs:key and the xs:keyref were both explicitly set to the same type. In my case I set to both to xs:string(I also was using xs:unique and a keyref reference to the unique but it seems to work the same way for key and keyref pairs).
So for example if the key is based on an element that looks like this:
<xs:complexType name="elementTypeWithKey'>
<xs:attribute name="theKey" type="xs:string"/>
</xs:complexType>
and the theKey attribute is explicitly xs:string, make sure that the attribute used as a keyRef is also explicitly xs:string:
<xs:complexType name="elementTypeWithKeyRef">
<xs:attribute name="theKeyRef" type="xs:string"/>
</xs:complexType>

Is it possible to define a XSD element with an arbitrary name but with specific attributes

I would like to validate a custom XML document against a schema.
This document would include a structure with any number of elements, each having a specific attribute. Something like this:
<Root xmlns="http://tns">
<Records>
<FirstRecord attr='whatever'>content for first record</FirstRecord>
<SecondRecord attr='whatever'>content for first record</SecondRecord>
...
<LastRecord attr='whatever'>content for first record</LastRecord>
</Records>
</Root>
The author of the XML document can include any number of records, each with an arbitrary name of his or her choosing. How is this possible to validate this against an XML Schema ?
I have tried to specify the appropriate structure type in a schema, but I do not know how to reference it in the appropriate location:
<xs:schema xmlns="http://tns" targetNamespace="http://tns" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="RecordType"> <!-- This is my record type -->
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="attr" type="xs:string" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="1" maxOccurs="1" name="Records">
<!-- This is where records should go -->
<xs:complexType />
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
What you describe is possible in XSD 1.1, and something very similar is possible in XSD 1.0, which is not to say it's an advisable design.
In XML vocabularies, the element type normally conveys relevant information about the type of information, and it is the name of the element type that is used to drive validation in most XML schema languages; the design you describe is (some would say) a little bit like asking if I can define an object class in Java, or a struct in C, which obeys the constraints that the members can have arbitrary names, as long as one of them is an integer with the value 42. That, or something like it, may well be possible, but most experienced designers will feel strongly that this is almost certainly not the right way to go about solving any normal problem.
On the other hand, doing unusual and awkward things with a system can sometimes help in learning how to use the system effectively. (You never know a system well, said a friend of mine once, until you have thoroughly abused it.) So my answer has two parts: how to come as close as possible to the design you specify in XSD, and alternatives you might consider instead.
The simplest way to specify the language you seem to want in XSD 1.1 is to define an assertion on the Records element which says (1) that every child of Records has an 'attr' attribute and (2) that no child of Records has any children. You'll have something like this:
...
<xs:element minOccurs="1" maxOccurs="1" name="Records">
<xs:complexType>
<xs:sequence>
<xs:any/>
</xs:sequence>
<xs:assert
test="every $child in * satisfies $child/#attr"/>
<xs:assert
test="not(*/*)"/>
</xs:complexType>
</xs:element>
...
As you can see, this is very similar to what InfantPro'Aravind' has described; it avoids the problems identified by InfantPro'Aravind' by using assertion, not type assignment, to impose the constraints you impose.
In XSD 1.0, assertion is not available, and the only way I can think of to come close to the design you describe is define an abstract element, which I'll call Record, as the child of Records, and to require that the elements which actually occur as children of Records be declared as being substitutable for this abstract type (which in turn requires that their types be derived from type RecordType). Your schema might say something like this:
<xs:element name="Root">
<xs:complexType>
<xs:sequence>
<xs:element name="Records">
<xs:complexType>
<xs:sequence>
<xs:element name="Record"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="RecordType">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="attr"
type="xs:string"
use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:element name="Record"
type="RecordType"
abstract="true"/>
Elsewhere in the schema (possibly in a separate schema document) you will need to declare FirstRecord, etc., and specify that they are substitutable for Record, thus:
<xs:element name="FirstRecord" substitutionGroup="Record"/>
<xs:element name="SecondRecord" substitutionGroup="Record"/>
<xs:element name="ThirdRecord" substitutionGroup="Record"/>
...
At some level, this matches your description, though I suspect you did not want to have to declare FirstRecord, SecondRecord, etc.
Having described ways in which XSD can do what you describe, I should also say that I wouldn't recommend either of these approaches. Instead, I'd design the XML vocabulary differently to work more naturally with XSD.
In the design as you specify it, every record appears to have the same type, but in addition to the content of the element they are allowed to convey a certain additional quantity of information by having a different name (FirstRecord, SecondRecord, etc.). This additional information could just as easily be conveyed in an attribute, which would allow you to specify Record as a concrete element, rather than an abstract element, giving it an extra "alternate-name" attribute. Your sample data would then take a form like this:
<Root xmlns="http://tns">
<Records>
<Record
alternate-name="FirstRecord"
attr='whatever'>content for first record</Record>
<Record
alternate-name="SecondRecord"
attr='whatever'>content for first record</Record>
...
<Record
alternate-name="LastRecord"
attr='whatever'>content for first record</Record>
</Records>
</Root>
This will be more or less acceptable depending on whether you or your data providers or tools in your tool chain attach some mystic or other significance to having the string "FirstRecord" be an element type name instead of an attribute value.
Alternatively, one could say that the point of the design is to allow Records to contain an arbitrary sequence of elements of arbitrary structure (on this account, the restriction to xs:string is just an artifact of your example and is not really desired in reality) as long as we have, for each record, the information recorded in the 'attr' attribute. Easy enough to specify this: define 'Record' as a concrete element with an 'attr' attribute, accepting one child which can be any XML element:
<xs:element name="Record">
<xs:complexType>
<xs:sequence>
<xs:any processContents="lax"/>
</xs:sequence>
<xs:attribute name="attr"
use="required"
type="xs:string"/>
</xs:complexType>
</xs:element>
The value of the 'processContents' attribute can be changed to 'strict' or 'skip' or kept at 'lax', depending on whether you want FirstRecord, SecondRecord, etc. to be validated (and declared) or not.
I guess this is not possible with XSD alone!
When you say
any number of records, each with an arbitrary name of his or her choosing.
That forces us to use <xs:any/> element but! Having an element declared as any doesn't allow you to validate attributes under it..
So .. the answer is NO!

Reuse attributes on many elements in XSD

Is there a way in XML Schema to define/name a set of attributes that can be reused on many element definitions without cutting/pasting the whole attribute definition into each element definition?
For example, if I want three attributes on 20 different elements in my xsd, can I name these as a group somehow once and reference them without having to have the same full definition repeated as part of all 20 elements?
<xs:attribute name="lang" type="xs:string" use="required"/>
<xs:attribute name="page" type="xs:string"/>
<xs:attribute name="alphabet" type="xs:string"/>
Yes there is 'xs:attributeGroup' for this purpose.

xml scheme attribute or complextype conditional

I would like to allow for an element to either contain an attribute OR define a more complex type.
Something like
<myElement someAttr="..."/>
or
<myElement>
<...>
</myElement>
That is, if someAttr exists then I do not want to allow sub elements and if it doesn't then I want to.
The reason for this is I want to have an "include" feature where I include a file which is essentially inserted into the element. But I don't want both. You can either include additional external xml code into the element or add your own BUT not both. (or also to have it inserted from a separate part of the xml)
This is mainly for simplifying a complex xml so that the structure is easily understood.
I doubt you will be able to express something like that in XML schema at this point.
You can make an attribute optional, e.g. it can be present or not. But you cannot express something like if the attribute is not present, then include other complex content with the current means.
You'll have to either check this programmatically yourself, or maybe investigate if other XML description languages like RelaxNG or Schematron might be able to help.
Perhaps with a static choice and you change the myElement name ?
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="root">
<xs:complexType>
<xs:choice>
<xs:element name="myElementWithAttrs">
<xs:complexType>
<xs:attribute name="someAttrs" type="xs:string"/>
</xs:complexType>
</xs:element>
<xs:element name="myElementWithoutAttrs">
<xs:complexType>
<xs:sequence>
<xs:any processContents="skip"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>

Resources