How to restrict an element to be non-empty through xsd - xsd

I need to restrict the elements in an XML file to be non-empty using XSD. Can I force the elements to contain only CDATA sections?

The only tool you have is a pattern restriction on xs:string, as shown below. CDATA is just an alternative to escaping with entity references, so a schema cannot tell the two apart. Write any markup characters in your pattern as entity references.
<simpleType name="NewType2">
  <restriction base="string">
    <minLength value="5"/>
    <maxLength value="30"/>
    <!-- Markup characters are escaped so the schema itself stays well-formed -->
    <pattern value="(&lt;html&gt;).*(&lt;/html&gt;)"/>
  </restriction>
</simpleType>
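
For the plain "non-empty" requirement in the question, a length or pattern facet is usually enough. The following is a minimal sketch, not a definitive schema: the type name NonEmptyString and the element name note are hypothetical, chosen only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: a string with at least one non-whitespace character -->
  <xs:simpleType name="NonEmptyString">
    <xs:restriction base="xs:string">
      <!-- Rejects the empty string -->
      <xs:minLength value="1"/>
      <!-- Rejects whitespace-only content; [\s\S] also matches line breaks -->
      <xs:pattern value="[\s\S]*\S[\s\S]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element using the type -->
  <xs:element name="note" type="NonEmptyString"/>
</xs:schema>
```

If whitespace-only content is acceptable, the minLength facet alone should do.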

Related

XML Schema: Ignore tags with foreign namespace

Say I have the following xml document:
<root xmlns:p="uri:myNamespace">
  <p:tagA>
    <p:tagB />
  </p:tagA>
</root>
The tagB must only be inside a tagA. I can write an xsd that validates that:
<xsd:schema ... targetNamespace="uri:myNamespace" elementFormDefault="qualified">
  <xsd:element name="tagA">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="tagB" type="..." />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Now here comes the problem: I want to ignore any tags in between of foreign namespace:
<root xmlns:p="uri:myNamespace">
  <p:tagA>
    <whatever />
    <foo>
      <bar>
        <p:tagB />
      </bar>
    </foo>
  </p:tagA>
</root>
As you can see tagB is now nested within other tags without namespace.
Is it possible (and if so, how) to write an XSD that still enforces that the only tag from my namespace within tagA is a tagB, while allowing any tags from other namespaces in between?
The content models used in XSD (and DTDs, and Relax NG) to constrain the content of an element define legal sequences of children; they work like a single production rule in a context-free grammar. It's possible to constrain descendants at deeper levels, but it requires an unbroken chain of declarations: in your example you need declarations for foo and bar when they appear within a p:tagA element, to ensure that between them they contain exactly one p:tagB element. But your starting point is that you don't want to constrain those elements.
So: you cannot use content models to express the constraint you have in mind.
In XSD 1.1, you can use an assertion attached to the p:tagA element to require that it contain exactly one p:tagB element among its descendants (count(.//p:tagB) eq 1). You cannot, however, use an assertion attached to p:tagB to require that it appear only in p:tagA elements: assertions can look down, but not up, in the tree. (If you know the name of a container guaranteed to be present, you can use an assertion on that container that asserts that every p:tagB element is contained by a p:tagA element, using an assertion like count(.//p:tagA//p:tagB) eq count(.//p:tagB).)
XSD 1.1 is currently supported by some but not all XSD validators.
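
The assertion on p:tagA described above might look roughly like the sketch below. It assumes an XSD 1.1 processor. The lax wildcard content model and the untyped tagB declaration are assumptions made for illustration (the question elides the real type), and the no-namespace root element is deliberately not declared here.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:p="uri:myNamespace"
            targetNamespace="uri:myNamespace"
            elementFormDefault="qualified">

  <xsd:element name="tagA">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Assumption: any children are allowed and validated only
             where a declaration can be found -->
        <xsd:any namespace="##any" processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <!-- XSD 1.1 only: exactly one p:tagB at any depth below p:tagA -->
      <xsd:assert test="count(.//p:tagB) eq 1"/>
    </xsd:complexType>
  </xsd:element>

  <!-- Assumption: the real type of tagB is not given in the question -->
  <xsd:element name="tagB"/>
</xsd:schema>
```

The content model stays open, so foo, bar and whatever are not constrained; only the assertion carries the rule about p:tagB.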

writing pattern in xsd for <a attributes><img/></a>

I am trying to apply a pattern to an element in XSD.
The element is of type XHTML.
I want to apply a pattern like this:
<a attributes="some set of attributes"><img attributes="some set of attributes"/></a>
Rules:
An <a> tag with attributes, followed by an <img> tag with attributes.
Sample Valid Data:
<a xlink:href="some link" title="Image" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.w3.org/1999/xhtml">
<img alt="No Image" title="No Image" xlink:href="soem path for image" xlink:title="Image" xmlns="http://www.w3.org/1999/xhtml" xmlns:xlink="http://www.w3.org/1999/xlink" />
</a>
Invalid:
<a>data<img/></a> (text data present, no attributes)
<a><img>abcd</img></a> (text data present, no attributes)
<a><img/></a> (no attributes)
Can anyone suggest how to write a pattern for this?
<xs:restriction base="xs:string">
  <xs:pattern value="Need help"/>
</xs:restriction>
Thank you.
The XSD pattern facet is used to constrain the 'lexical space' of a simple type (that is, the set of literal strings that denote instances of the type) using regular expressions. It won't help you to require that certain elements must have attributes.
If you want specific attributes to be present (e.g. title and xlink:href on the a element, title and alt on the img element), the simplest way to do so in a schema is by declaring those attributes as required. The schema for XHTML 1.0 strict, for example (at http://www.w3.org/TR/xhtml1-schema/#xhtml1-strict) declares both src and alt as being required on img:
<xs:element name="img">
  <xs:complexType>
    <xs:attributeGroup ref="attrs"/>
    <xs:attribute name="src" use="required" type="URI"/>
    <xs:attribute name="alt" use="required" type="Text"/>
    ...
  </xs:complexType>
</xs:element>
If what you want is just to require that some attributes be used, but you don't care which, XSD doesn't make the task easy: in XSD 1.0 there is no convenient way to say "I don't care which attribute appears, but there has to be one". You can enforce such a constraint (even if some observers like me find it a bit odd) by using assertions in XSD 1.1, or by using a Schematron schema in addition to your XSD schema.
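
The XSD 1.1 route mentioned above could be sketched as follows. This is an illustration under stated assumptions, not the XHTML schema itself: the declarations for a and img are simplified stand-ins, the open attribute wildcard is an assumption, and an XSD 1.1 processor is required for xs:assert.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:h="http://www.w3.org/1999/xhtml"
           targetNamespace="http://www.w3.org/1999/xhtml"
           elementFormDefault="qualified">

  <!-- Simplified stand-in for the XHTML a element -->
  <xs:element name="a">
    <xs:complexType>
      <!-- Element-only content: text such as "data" is rejected -->
      <xs:sequence>
        <xs:element ref="h:img"/>
      </xs:sequence>
      <!-- Assumption: any attribute is acceptable -->
      <xs:anyAttribute processContents="lax"/>
      <!-- XSD 1.1 only: at least one attribute must be present -->
      <xs:assert test="exists(@*)"/>
    </xs:complexType>
  </xs:element>

  <!-- Simplified stand-in for the XHTML img element, with empty content -->
  <xs:element name="img">
    <xs:complexType>
      <xs:anyAttribute processContents="lax"/>
      <xs:assert test="exists(@*)"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Namespace declarations such as xmlns:xlink are not attributes in the XPath data model, so they do not satisfy exists(@*).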

using xsd attributes in another xsd as tag/tag-parameters

I wanted to know if it is possible or not:
a.xsd :
<attribute name="aa" type="string"/>
b.xsd
<element name="bb" aa="pan" type="string"/>
or
<aa name="pan" type="string">
Basically, I am trying to find out whether attributes declared in one XSD can be used inside another XSD as tags or tag parameters.
I am new to XSD, so if this is the wrong use case, please say so as well.
I am not clear on what you are trying to accomplish, but the xsd:element element has a set of recognized attributes. It won't do you any good to try to make up attributes to add to it.
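
If the underlying goal is to let instance documents carry an attribute that was declared in a.xsd on an element declared in b.xsd, the usual mechanism is a global attribute declaration plus a reference to it. The sketch below assumes that reading of the question and assumes neither schema has a target namespace. A possible a.xsd:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global attribute declaration, reusable from other schema documents -->
  <xs:attribute name="aa" type="xs:string"/>
</xs:schema>
```

And a b.xsd that pulls it in:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- include works here because both documents have no target namespace -->
  <xs:include schemaLocation="a.xsd"/>

  <xs:element name="bb">
    <xs:complexType>
      <xs:simpleContent>
        <!-- String content plus the attribute declared in a.xsd -->
        <xs:extension base="xs:string">
          <xs:attribute ref="aa"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With these, an instance such as <bb aa="pan">text</bb> should validate. The attribute appears on bb in the document, not on xsd:element in the schema.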

How to force an element to contain only CDATA through XSD

My requirement is that any XML file validated against my schema should conform to the following condition:
The OTHERWISE element can contain only a CDATA section and nothing else.
Example
Valid XML: <OTHERWISE ContentURI=""><![CDATA[<html>Good-bye</html>]]></OTHERWISE>
Invalid XML: <OTHERWISE ContentURI="">ABC</OTHERWISE>
I am trying the following:
<xs:simpleContent>
<xs:restriction base="OtherwiseAtt">
<xs:pattern value="^<\!\[CDATA\[[a-zA-Z0-9]*\]\]>" />
</xs:restriction>
</xs:simpleContent>
Anything can go inside the CDATA section; I have put [a-zA-Z0-9]* there just for testing purposes.
Please help me out.
Thanks
Sabri
The content between <![CDATA[ and ]]> is handled by the parser. Your XML file has been fully parsed by the time that it is validated. CDATA is basically another way to escape special characters. The validator will not have a way to determine if an element contains CDATA or not in the way that you wish.
The purpose of validation is to place controls on the structure of your documents. It does not and cannot enforce a particular method of escaping text.
Why would you need to require that the content is escaped by CDATA? This sounds like an attempt to handle a poor design choice at an earlier stage.
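
To make the point about parsing concrete, here is a small illustration. The wrapper element examples is hypothetical; the OTHERWISE element comes from the question. After parsing, both elements have the same string value, <html>Good-bye</html>, so a validator sees no difference between them.

```xml
<examples>
  <!-- Content written as a CDATA section -->
  <OTHERWISE ContentURI=""><![CDATA[<html>Good-bye</html>]]></OTHERWISE>

  <!-- The same content written with entity references -->
  <OTHERWISE ContentURI="">&lt;html&gt;Good-bye&lt;/html&gt;</OTHERWISE>
</examples>
```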

In an XSD schema, how do I say that an element might have any number of subelements that must inherit from a certain type?

Say I have these types defined in my XSD:
<complexType name="NamedEntity">
  <attribute name="ix" type="positiveInteger"></attribute>
  <attribute name="sName" type="string"></attribute>
  <attribute name="txtDesc" type="string"></attribute>
</complexType>
<complexType name="Node">
  <complexContent>
    <extension base="tns:NamedEntity">
    </extension>
  </complexContent>
</complexType>
<complexType name="Source">
  <complexContent>
    <extension base="tns:NamedEntity">
      <attribute name="dt" type="dateTime"></attribute>
    </extension>
  </complexContent>
</complexType>
Now I want to express that a Node element may have zero or more child elements that may be of the type Node or Source.
It would be OK if I had to somehow enumerate the allowed types for the children, but since I have more types that inherit from NamedEntity, it would be neat if I could specify just the base type.
Edit: I'd rather not use xsi:type in the document but have an unambiguous relationship between element name and type. Quite a lot of XML processing seems to depend on that, and I also find it a lot more readable.
Please don't use xsi:type if you can avoid it. It's evil. Ok, maybe I exaggerate, but it does make it impossible to parse the document without intimate knowledge of the schema, which is bad enough in practice.
What will help you is: substitutionGroup.
In the schema, have the Node element contain zero or more child elements of type NamedEntity. In the actual document, use the xsi:type attribute (xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance") to choose the subtype ("Node" or "Source") for each one.
This may be beyond the capabilities of XSD. Have you considered doing extra validation using Schematron?
I think you want a substitution group.
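
A substitution group for the types in the question might be sketched like this. The namespace URI and the element names entity, node and source are hypothetical; only the type names come from the question.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="urn:example:entities"
        targetNamespace="urn:example:entities"
        elementFormDefault="qualified">

  <complexType name="NamedEntity">
    <attribute name="ix" type="positiveInteger"/>
    <attribute name="sName" type="string"/>
    <attribute name="txtDesc" type="string"/>
  </complexType>

  <!-- Abstract head: never appears in documents itself -->
  <element name="entity" type="tns:NamedEntity" abstract="true"/>

  <!-- Each member maps one element name to one derived type -->
  <element name="node" type="tns:Node" substitutionGroup="tns:entity"/>
  <element name="source" type="tns:Source" substitutionGroup="tns:entity"/>

  <complexType name="Node">
    <complexContent>
      <extension base="tns:NamedEntity">
        <sequence>
          <!-- Zero or more children; any member of the group may appear -->
          <element ref="tns:entity" minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>

  <complexType name="Source">
    <complexContent>
      <extension base="tns:NamedEntity">
        <attribute name="dt" type="dateTime"/>
      </extension>
    </complexContent>
  </complexType>
</schema>
```

A further type derived from NamedEntity only needs its own element declaration with substitutionGroup="tns:entity"; the Node content model stays unchanged, and instance documents need no xsi:type.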

Resources