Schema of schemas - xsd

Is it possible to use XML Schema to express some rules for other schemas?
I've read "XML Schema to validate XML Schemas?" and it's very interesting, but I would like to verify some application-level rules. For instance: does the xs:schema element in a schema carry a version attribute? Does each xs:attribute element carry a use attribute? And so on.
Are there any good practices for that?
Thanks.

This is an almost perfect application for Schematron. I can't give you a full Schematron schema for testing that sort of rule, but you would be looking at something like:
<pattern id="attribute-checks">
<rule context="xs:attribute">
<assert test="@use">All xs:attribute elements must have a use attribute</assert>
</rule>
</pattern>
Schematron allows you to express rules for any XML file (obviously including schemas) that go beyond simple grammar rules. You can use them to extend validation to the level of business rules.
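For comparison, the two rules from the question can also be checked with a few lines of ordinary code instead of Schematron. The following is only a minimal sketch using Python's standard xml.etree.ElementTree; the function name check_schema_rules and the sample schema are made up for illustration and are not part of any library.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"  # the XML Schema namespace

def check_schema_rules(xsd_text):
    """Return a list of human-readable rule violations (hypothetical helper)."""
    root = ET.fromstring(xsd_text)
    problems = []
    # Rule 1: the xs:schema element must carry a version attribute.
    if root.tag == f"{{{XS}}}schema" and "version" not in root.attrib:
        problems.append("xs:schema has no version attribute")
    # Rule 2: every xs:attribute element must carry a use attribute.
    for attr in root.iter(f"{{{XS}}}attribute"):
        if "use" not in attr.attrib:
            name = attr.get("name") or attr.get("ref") or "?"
            problems.append(f"xs:attribute '{name}' has no use attribute")
    return problems

# Made-up sample schema: no version, and one attribute without use.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:attribute name="id" type="xs:string" use="required"/>
      <xs:attribute name="note" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

for problem in check_schema_rules(SAMPLE):
    print(problem)
```

Schematron keeps such rules declarative and separate from application code, which is usually easier to maintain once the rule set grows; a script like this is mainly useful for a quick one-off check.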

Related

How to parse an XSD file with RapidXML

Does RapidXML have the capability to validate/parse an XML file against its associated schema, i.e. its XSD file? I was under the assumption that an XML parser would be able to do both together. If not, why is it deemed unnecessary to validate against the associated schema? I checked RapidXML's documentation and found no mention of schema or XSD.
I am currently parsing XML files like so:
rapidxml::file<> xmlFile("BeerLog.xml");
rapidxml::xml_document<> doc;
doc.parse<0>(xmlFile.data());
The following pseudo-code might give you a better idea of what I am looking for:
rapidxml::file<> xmlFile("BeerLog.xml", "BeerLog.xsd");
or even:
rapidxml::file<> xmlFile("BeerLog.xml");
rapidxml::file<> xsdFile("BeerLog.xsd");
rapidxml::xml_document<> doc;
doc.parse_with_schema<0>(xsdFile.data(), xmlFile.data());
Your assumption is wrong: accessing the content of an XML file and validating it against a schema are quite distinct tasks, even if the former is useful for the latter. Lightweight, fast parsers in particular often don't support validation, and a quick glance at the documentation shows this:
W3C Compliance. RapidXml is not a W3C compliant parser, primarily because it ignores DOCTYPE declarations
Given also that there are several quite different schema languages (XSD, RNG, DTD, ...), support for one of them would not necessarily mean support for the one you want.
You also have to take into account that many XML files are merely well-formed and do not conform to any schema; somebody may want to process them nevertheless.
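To make the distinction concrete, here is a small sketch in Python (standard library only, not RapidXML). A non-validating parser can only tell you whether a document is well-formed; it knows nothing about any schema. The helper name is_well_formed and the sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML; no schema is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Well-formed, even though no schema describes these element names.
print(is_well_formed("<BeerLog><Beer name='Stout'/></BeerLog>"))

# Not well-formed: the Beer element is never closed.
print(is_well_formed("<BeerLog><Beer></BeerLog>"))
```

Checking the first document against BeerLog.xsd would require a separate validating library; the parse step above would succeed either way.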

Tool to refactor .xsd schema?

I have an .xsd schema that has multiple root elements and a few complex and simple types; the complex types refer to those root elements. I can't generate the .xml I want from it because of those root elements. So I think I need to add an element that will serve as the root element and add all the other elements as its children, or am I wrong?
Is there a tool that can help me create a root element and refactor the schema? What I have:
I was thinking that maybe I just need to create another XSD with one element and ref all the elements from the first XSD in it, but I don't know exactly how to do it. Is this a good idea?
The answer to your edit is indeed, as Pangea said, NO. And that is because to ref another element (as in <xsd:element ref="SomeElement" ... />), the referenced element must be declared globally.
Another scenario that requires global elements is the use of substitution groups. What I am trying to suggest is that it may not always be possible to refactor an XSD in a way that leaves global only the elements you want as roots in instance XML.
This suggests that a better way to solve your problem might be to go after the reason why you can't generate the .xml the way you want. If you can describe a bit of that, you might get a better answer here...
Another reason I wanted to add this answer was that I noticed the use of XML as a tag name. While it may seem OK, I can tell you that I've seen some pretty "big-name" applications that would simply choke with that <XML/> tag name. XML is actually "reserved", please take a look at this section of the XML Spec. To quote: "Names beginning with the string "xml", or with any string which would match (('X'|'x') ('M'|'m') ('L'|'l')), are reserved for standardization in this or future versions of this specification."
Always play nice with the specs....
Any global element you define in the schema is a potential root element in the instance document. If you don't want this behavior, make sure you have only one global element definition in the XSD. It has nothing to do with the tool (though XML editors can simplify this).
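To see which elements a schema currently exposes as potential roots, it is enough to list the xs:element declarations that are direct children of xs:schema. A minimal sketch in Python's standard library follows; the function name global_elements and the sample schema are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"  # the XML Schema namespace

def global_elements(xsd_text):
    """Names of xs:element declarations directly under xs:schema."""
    root = ET.fromstring(xsd_text)
    # findall with a plain tag name only matches direct children.
    return [el.get("name") for el in root.findall(f"{{{XS}}}element")]

# Made-up schema: Order and Item are global, Quantity is local.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Item"/>
        <xs:element name="Quantity" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Item" type="xs:string"/>
</xs:schema>"""

print(global_elements(SAMPLE))
```

In this sample Item has to stay global because Order refers to it with ref, which is exactly the constraint described in the other answer.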

All mandatory fields in an XSD file?

Is there a quick way to find all the mandatory fields in an XSD file?
I need to quickly see all the mandatory fields in the schema.
Thanks
Not sure if you're looking to do this through code. If not, Altova XMLSpy, for example, provides an option to "Generate Sample XML File" - with options to generate only mandatory fields.
Otherwise, if you're working with Java, for example, you can use something like the Eclipse XSD project for programmatic access to the XSD. (It even works without Eclipse.) Some additional details are at "Are there any other frameworks that parse XSD other than XSOM?".
Take a look at this post; instead of exporting all fields, there's also an option to get only the mandatory ones. One significant difference compared with the answer you accepted is that you can also generate an Excel or CSV file in addition to the XML file, not to mention that the sample-XML approach is deficient by definition. I would pay attention to the way mandatory choices, abstractly typed elements, or abstract elements with substitution groups play out in your case.
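If a rough first pass is enough, the schema can also be scanned directly: a local element is mandatory unless minOccurs="0", and an attribute is mandatory when use="required". The sketch below uses only Python's standard library; mandatory_fields and the sample schema are hypothetical. It deliberately ignores the harder cases mentioned above (xs:choice, substitution groups, abstract types, and elements nested inside an optional parent).

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"  # the XML Schema namespace

def mandatory_fields(xsd_text):
    """List (kind, name) pairs for required local elements and attributes."""
    root = ET.fromstring(xsd_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    fields = []
    for node in root.iter():
        label = node.get("name") or node.get("ref")
        if node.tag == f"{{{XS}}}element":
            # Global declarations (children of xs:schema) carry no minOccurs.
            is_local = parent_of.get(node) is not root
            if is_local and node.get("minOccurs", "1") != "0":
                fields.append(("element", label))
        elif node.tag == f"{{{XS}}}attribute":
            if node.get("use") == "required":
                fields.append(("attribute", label))
    return fields

# Made-up schema with one required and one optional field of each kind.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Id" type="xs:string"/>
        <xs:element name="Comment" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="currency" type="xs:string" use="required"/>
      <xs:attribute name="note" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

for kind, name in mandatory_fields(SAMPLE):
    print(kind, name)
```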

Good approach to XSD schema versioning?

The company I am working for at the moment encodes the schema or contract version in the name of the root node. For example,
<PurchaseOrder_v1_2 xmlns="http://someNamespace">
...
</PurchaseOrder_v1_2>
I am looking for people's opinions on this design approach, as I am not convinced it is sound. For example, it requires that all services using this schema as a messaging contract are able to publish multiple versions to satisfy client requirements for different versions.
I would probably disagree with @hacktick's suggestion that versioning the namespace is conventional. I've never seen the W3C recommend that the namespace change with the version, and W3C namespaces certainly don't do this: both versions of XSLT have the namespace http://www.w3.org/1999/XSL/Transform, for example.
Encoding the version in either the root element name or the namespace changes the name of the element. In the case of the root, you are effectively stating that it is an entirely different element with no defined relationship to the PurchaseOrder element in the previous version. In the case of the namespace change, you are stating the same thing about all the elements in the language.
Version attributes are more normal. May I suggest you read this thread on the XML-dev mailing list for some very well-informed discussion?
Normally you version the URL for the schema.
So you would have a schema called "schema" and you would then refer to it like this:
"http://www.example.com/2011/01/schema", where 2011 and 01 are versions in the form of year and month.
Example:
<PurchaseOrder xmlns="http://www.example.com/2011/01/schema">
</PurchaseOrder>
Another approach is to specify the version in the root element.
If your root element is called "PurchaseOrder", for example, you would add a required version attribute (""). Your version attribute would contain a simple number that increments with each version of your XSD. You must keep a history of all your public XSDs. This could lead to simpler XSD URLs, but the extraction and validation of these XML files is a little bit harder.
Example:
<PurchaseOrder version="1" xmlns="http://www.example.com/schema">
</PurchaseOrder>
If you version the root element name ("PurchaseOrder_v1_2"), you will have conversion problems in your XML files when you move to another version.
Personally I would go for solution 1 (versioned namespaces). This is also recommended by the W3C; I can't find a link for this statement, though.
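Whichever convention is chosen, a receiving service has to work out which version it was handed before it can pick the right XSD. As a rough sketch (Python standard library; describe_root is a made-up helper name), the root element's local name, namespace, and version attribute can be read like this:

```python
import xml.etree.ElementTree as ET

def describe_root(xml_text):
    """Return the root's local name, namespace, and version attribute."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, local = root.tag[1:].split("}", 1)
    else:
        namespace, local = None, root.tag
    return {"name": local, "namespace": namespace, "version": root.get("version")}

# Version carried in the namespace (solution 1 above).
by_namespace = '<PurchaseOrder xmlns="http://www.example.com/2011/01/schema"></PurchaseOrder>'
# Version carried in an attribute on the root element.
by_attribute = '<PurchaseOrder version="1" xmlns="http://www.example.com/schema"></PurchaseOrder>'

print(describe_root(by_namespace))
print(describe_root(by_attribute))
```

With the attribute approach the element name and namespace stay stable and only version changes; with versioned namespaces every element in the document gets a different qualified name in each version, which is the objection raised in the first answer.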

bnf/ebnf for xml schema

I'm looking for a BNF/EBNF of XML Schema.
I just found the one for XML (http://www.w3.org/TR/REC-xml or extracted at http://www.jelks.nu/XML/xmlebnf.html).
Well, it's a starting point, but I'm surprised that I couldn't find a more specific one for XML Schema.
I guess it's because nobody finds that useful, and it would be too complex. If somebody wants to define an XML language, such as XML Schema, they would probably describe it in terms of XML primitives like elements and attributes (using XML Schema, Relax NG, DTD, etc.), not characters. One of the reasons XML was invented is to have a meta-language for creating other languages.
I think the first step would be to start with the XSD for XML Schema and to use Colibri to generate a BNF grammar.
I will check when I'm back home. Colibri's author says:
This is a first draft that we could refine.
But I definitely think it has potential.

Resources