Require that an element has another element as a descendant - xsd

Is there a way in an xsd schema to require that an element have another element somewhere as a descendant?
For example, element parent requires a descendant desc. This is valid:
<parent>
<a>
<b>
<desc></desc>
</b>
</a>
</parent>
As is this:
<parent>
<c>
<desc></desc>
</c>
</parent>
but this isn't:
<a>
<parent>
<b/>
</parent>
</a>
The potential child elements for parent are many and complicated, so it would be difficult to enumerate every possible valid configuration.
Something like the key/selector schema elements seems like it would work, where I could provide an XPath expression defining the valid locations for the desc element, but all of the examples I've found are aimed at matching up the values of attributes.

No, (almost) all XML Schema validation is shallow, called "local" in the spec. Here's one excerpt that emphasizes type validation as "local" validation.
Element Validated by Type: If an element information item is ·valid· with respect to a ·type definition· as per Element Locally Valid (Type) (§3.3.4), [it is marked as] ·validated·.
The only exception is for the identity constraints like uniqueness and key-references which have a broad scope in an XML document but narrow uses.

I don't know if XSD supports what you are trying to do, but there is a work-around.
You could do complex validations with a two-step process:
First simply use your XSD schema for basic validation
Next use an XSLT which does more complex validations, and outputs the result of that validation
This may not plug in well to whatever framework you are working with, but might work well for (partially) custom code. It also has the advantage (over doing the extra validations in code) that you can publish both documents.
From a quick Google search, one effort towards this end is Schematron. It actually forgoes XSD entirely, and just uses XSLT. It appears to be a published standard:
http://www.schematron.com/
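As a rough illustration of the Schematron route, the descendant requirement from the question could be written as a single assertion. This is a minimal sketch assuming ISO Schematron and the un-namespaced element names parent and desc from the question; the message text is made up for illustration.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- Fires once for every parent element, wherever it appears -->
    <rule context="parent">
      <!-- .//desc selects desc elements at any depth below the context node -->
      <assert test=".//desc">
        A parent element must contain a desc element somewhere among its descendants.
      </assert>
    </rule>
  </pattern>
</schema>
```

Run as the second step after ordinary XSD validation, this would accept the first two documents in the question and report the third.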

Related

Difference between XSD Simple element and XSD Complex element

I Googled this question, but I'm still unable to find a clear explanation of the difference between a simple XSD (XML Schema Definition) element and a complex XSD element.
Any guidance would be highly appreciated.
I have no idea why I'm answering this, but...
To summarize:
Simple types can only have text content directly between the element’s opening and closing tags. They cannot have attributes or child elements.
Complex types can have attributes, can contain other elements, can contain a mixture of elements and text, and so on.
One is a single value and the other a compound value.
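A small schema may make the contrast concrete. The element names age and person, and the id attribute, are hypothetical and chosen only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Simple element: text content only, e.g. <age>42</age> -->
  <xs:element name="age" type="xs:positiveInteger"/>

  <!-- Complex element: may carry attributes and child elements,
       e.g. <person id="p1"><name>Ann</name></person> -->
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:string"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```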

Element dependency on sibling element value

I need to define a validation rule under XML Schema 1.0 that allows an element to occur (once) among a set of sibling elements only if another specific sibling has a certain value.
For example, given the instance XML document snippet,
<root>
<parent>
<child1>A</child1>
</parent>
<parent>
<child1>B</child1>
<child2>C</child2>
</parent>
</root>
I'd like the rule to allow the child2 element to occur only if the required child1 element has a value of 'B'; otherwise, the child1 element should occur by itself under a given parent.
This is quite easy to achieve under XML Schema 1.1 using an xs:assert, but the solution under version 1.0 evades me.
Any insights are most appreciated.
The usual approach in XSD 1.0 is to design the XML differently: if we have one particular value B for child1 which makes the occurrence of child2 possible, then we can split child1 into two element types: child1-notB and child1-B. And since in the case of child1-B we know the value, the value doesn't actually need to be present. The XML becomes:
<root>
<parent>
<child1-notB>A</child1-notB>
</parent>
<parent>
<child1-B/>
<child2>C</child2>
</parent>
</root>
It's simple to write a content model in which the parent element contains either a child1-notB or a child1-B followed by an optional child2.
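One possible XSD 1.0 content model for the redesigned document is sketched below. The xs:string types are assumptions, since the question does not say what the values look like, and nothing here stops child1-notB from containing the text B; that would need an additional restriction on its type.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="parent" maxOccurs="unbounded">
          <xs:complexType>
            <xs:choice>
              <!-- Either: child1-notB on its own -->
              <xs:element name="child1-notB" type="xs:string"/>
              <!-- Or: an empty child1-B followed by an optional child2 -->
              <xs:sequence>
                <xs:element name="child1-B">
                  <xs:complexType/>
                </xs:element>
                <xs:element name="child2" type="xs:string" minOccurs="0"/>
              </xs:sequence>
            </xs:choice>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```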
As Dijkgraaf has already observed, the specific design you describe cannot be expressed in XSD 1.0. XSD 1.1 added assertions in part because so many people want designs like the one you describe, in which two quite different elements, which have quite different effects on what is and is not allowed, are nevertheless given the same name so as to mask their difference in meaning, instead of being called by different names to make their difference in semantics explicit.
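For comparison, the XSD 1.1 assertion mentioned in the question might look roughly like this. It is a sketch that keeps the original element names, assumes xs:string content, and needs an XSD 1.1 processor; a 1.0 processor will reject xs:assert.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="parent" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="child1" type="xs:string"/>
              <xs:element name="child2" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <!-- child2 may be present only when child1 has the value B -->
            <xs:assert test="not(child2) or child1 = 'B'"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```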

Making subtags dependent on an attribute of a parent in XML Schema

I have created an XML file like the following
<monitor>
<widget name="Widgets/TestWidget1">
<state code="VIC" />
<state code="TAS" />
</widget>
<widget name="Widgets/TestWidget2">
<client code="someclient" />
</widget>
</monitor>
The name attribute of the <widget> tag tells the parser what widget to load (they are asp.net user controls).
I am trying to create a schema file for the above. The problem is that inside the <widget> tag, the supported subtags depend on the name attribute. So TestWidget1 supports the <state> tag and TestWidget2 supports the <client> tag.
Currently my XML Schema file just displays all possible <widget> subtags regardless of whether they are supported or not.
How can I write an XML schema file that will only allow specific subtags based on the name attribute? If this is not possible, what options do I have?
You have several options. The simplest and most direct is to rethink your problem a bit. If the legal content of element E1 and the legal content of element E2 are different, then the simplest design is to call them different things, because in XSD, as in DTDs, the legal content of an element depends on the element type name. A devil's advocate would ask you: "If you want different kinds of widget to obey different rules, why are you telling the validator that they are the same kind of widget? Tell the validator the truth by giving them different names. So don't give them all the same element name; give each kind of widget its own element name."
In XSD 1.1 you can also use conditional type assignment or assertions to define constraints on the legal combinations of attributes and children, but not every schema-aware editor is going to have the chops necessary to analyse the conditional type assignment rules and attributes and understand what to prompt you with.
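A sketch of conditional type assignment for the widget example follows. The type names widgetBase, stateWidget and clientWidget are hypothetical, the attribute types are assumed to be xs:string, and an XSD 1.1 processor is required.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="monitor">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="widget" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- The type actually used for a widget depends on its name attribute -->
  <xs:element name="widget" type="widgetBase">
    <xs:alternative test="@name = 'Widgets/TestWidget1'" type="stateWidget"/>
    <xs:alternative test="@name = 'Widgets/TestWidget2'" type="clientWidget"/>
    <!-- Any other name is rejected -->
    <xs:alternative type="xs:error"/>
  </xs:element>

  <xs:complexType name="widgetBase">
    <xs:attribute name="name" type="xs:string" use="required"/>
  </xs:complexType>

  <xs:complexType name="stateWidget">
    <xs:complexContent>
      <xs:extension base="widgetBase">
        <xs:sequence>
          <xs:element name="state" maxOccurs="unbounded">
            <xs:complexType>
              <xs:attribute name="code" type="xs:string" use="required"/>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="clientWidget">
    <xs:complexContent>
      <xs:extension base="widgetBase">
        <xs:sequence>
          <xs:element name="client">
            <xs:complexType>
              <xs:attribute name="code" type="xs:string" use="required"/>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Both alternative types extend widgetBase because XSD 1.1 requires each alternative type to be derived from the element's declared type (xs:error being the permitted exception).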

An XSD attribute to capture "source data field"

I have a domain model which is intended to generalise several source systems. As such, in certain cases the decision was made to overload data into a new generic field rather than to create several specific fields.
To account for this, when the source systems' data is mapped onto the new domain model, I was hoping to record the source field name as an attribute, e.g.:
<Event>
<Description sourceField="subject">...</Description>
<Description sourceField="description">...</Description>
<Description sourceField="issue">...</Description>
<...>
</Event>
What would be the appropriate way to add such an attribute into the XSD? Would I need to specifically attach it to every such overloaded field, or is there a general way to allow an attribute across all elements?
Please don't point out that I should just add the extra fields into the domain model if I need to distinguish between the different data - the decision has been made, I just need to work around it!
Thanks in advance.
Not really.
If all your element declarations extend from a common base type definition, then you can add the attribute to the base.
If all your element declarations include an anyAttribute, you can make a global attribute definition for sourceField. Then the validator would at least allow your attribute but not require it. And if the anyAttribute is strict or lax the validator will make sure the attribute's content is valid.
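The first option, a shared base type, could be sketched as follows. The type name SourcedText is hypothetical, and xs:string content is an assumption about the overloaded fields; only the Description element from the question is declared here.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Text content plus an optional sourceField attribute -->
  <xs:complexType name="SourcedText">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="sourceField" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:element name="Event">
    <xs:complexType>
      <xs:sequence>
        <!-- Every overloaded field uses, or extends, the shared type -->
        <xs:element name="Description" type="SourcedText" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Each overloaded field still has to name the type explicitly, so the attribute is declared once but attached per element.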

Can you use key/keyref instead of restriction/enumeration in XML schema?

Suppose we have a stylesheet which pulls in metadata using the key() function. In other words we have instance documents like this:
<items>
<item type="some_type"/>
<item type="another_type"/>
</items>
and a table of additional data we would like to associate with items during processing:
<item-meta>
<item type="some_type" meta="foo"/>
<item type="another_type" meta="bar"/>
<item type="yet_another_type" meta="baz"/>
</item-meta>
Finally, suppose we want to do schema validation on the instance document, restricting the type attributes to the set of types which occur in item-meta. So in the schema we want to use key/keyref instead of restriction/enumeration. This is because using restriction/enumeration will require making a separate list of valid type attributes.
However, it doesn't look like key/keyref will actually work. Having tried it (with MSXML 6.0) it appears the selector of a schema key won't accept the document() function in its xpath argument, so we can't examine the item-meta data, whether it appears in an external file or in the schema file itself. It looks like the only place we can look for keys is the instance document.
So if we really don't want to have a separate list of valid types, we have to do a pre-validation transform, pulling in the item-meta stuff, then do the validation, then do our original transform. That seems overcomplicated for what ought to be a relatively straightforward use of XML schema and stylesheets.
Is there a better way?
Selectors in key/keyref allow only a very restricted XPath syntax. Short, but not completely accurate: The selector must point to a subnode of the element declared.
The full definition of the restricted syntax is given in the identity-constraint section of the XML Schema Structures specification.
So, no I don't see a better way, sorry.
BTW: The W3C states that this restriction was made to make life easier on implementers of XML Schema processors. Keep in mind that one of the design goals of XML Schema was to make it possible to process a document in streaming mode. That explains really a lot of the sometimes seemingly random restrictions of XML Schema.
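To illustrate the restriction: key/keyref does work once the lookup table sits in the same instance document as the items, which is what the pre-validation transform from the question would produce. The wrapper element doc and the constraint names below are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="doc">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item-meta">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="item" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:attribute name="type" type="xs:string" use="required"/>
                  <xs:attribute name="meta" type="xs:string" use="required"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="items">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="item" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:attribute name="type" type="xs:string" use="required"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Both selectors stay inside doc, the element carrying the constraints -->
    <xs:key name="knownType">
      <xs:selector xpath="item-meta/item"/>
      <xs:field xpath="@type"/>
    </xs:key>
    <xs:keyref name="itemTypeRef" refer="knownType">
      <xs:selector xpath="items/item"/>
      <xs:field xpath="@type"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```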
Having thought about it a little more, I came up with the idea of having the stylesheet do that part of the validation. The schema would define the item type as a plain string, and the stylesheet would emit a message and stop processing if it couldn't look up the item type in the item-meta table.
This solution fixes the original problem of having to write down the list of valid types more than once, but it introduces the problem that validation logic is now mixed in with the stylesheet logic. I don't have enough experience with XSD+XSLT to tell whether this new problem is less serious than the old one, but it seems to be more elegant than what I wrote earlier about pulling the item-meta table into each instance document in a pre-validation transform.
You wouldn't need to stop the XSLT with some error. Just let it produce something that the schema won't validate and that points to the original problem like
<error txt="Invalid-Item-Type 'invalid_type'"/>
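A stylesheet along these lines might look like the XSLT 1.0 sketch below. It assumes the lookup table lives in a separate file, here given the hypothetical name item-meta.xml; the key name and the shape of the output are also made up for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index the lookup table by type -->
  <xsl:key name="meta-by-type" match="item-meta/item" use="@type"/>

  <!-- Hypothetical file name for the item-meta table -->
  <xsl:variable name="meta-doc" select="document('item-meta.xml')"/>

  <xsl:template match="items">
    <items>
      <xsl:apply-templates select="item"/>
    </items>
  </xsl:template>

  <xsl:template match="items/item">
    <xsl:variable name="type" select="@type"/>
    <!-- In XSLT 1.0, key() searches the document of the context node,
         so switch context to the lookup document first -->
    <xsl:for-each select="$meta-doc">
      <xsl:choose>
        <xsl:when test="key('meta-by-type', $type)">
          <item type="{$type}" meta="{key('meta-by-type', $type)/@meta}"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- Unknown type: emit a marker that the output schema will not accept -->
          <error txt="Invalid-Item-Type '{$type}'"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Validating the transform output against a schema that has no error element then surfaces the bad type without the stylesheet having to abort.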
Apart from that please keep in mind that there are no discussion threads here. The posts may go up and down, so it's better to edit your question accordingly.
Remember, the philosophy here is "One question, and the best answer wins".
