Can I write a schema that every XML document is valid against? - xsd

I need to write a schema that every XML instance is valid against.
I tried:
<xs:element name="Arguments">
  <xs:complexType>
    <xs:sequence>
      <xs:any namespace="##any" minOccurs="0" maxOccurs="unbounded" processContents="lax" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
But it enforces a root element named Arguments.
Is there a way for the root to be any element?

Good question, although I am not sure it can be done. Your approach of using xs:any is sound, but I believe it can only be applied to a section of a document, not to the entire document (i.e. the root).
To quote from a book I once read (something that speaks to the "why?" people are asking): [...] useful when writing schemas for languages such as XSLT that routinely include markup from multiple vocabularies that are unknown when the schema is written [...] useful when you're just beginning to design the document structure and you don't yet have a clear picture of how everything fits together [...] (XML in a Nutshell)
I'm also curious to see if it can be done or what is the best workaround for this.

The whole purpose of a schema is to constrain the potential space of valid documents. Either you're doing a whole document or you're doing a fragment. If you're doing a whole document, the correct approach is to just omit the schema entirely. Really. You've got no constraints at all (other than well-formedness) and so can't apply any interpretation to the document beyond it being an XML document.
The case where you've got a fragment is much more useful though. The best way of doing that is to have an outer element (whose name you control) that contains the uncontrolled fragment. When you do that, you need to say that the content is a sequence of zero-to-unbounded numbers of arbitrary elements, which is what you've done already. If it's really anything, you might also consider allowing mixed content (don't do that if you don't need it, of course, but if you're willing to handle things like XHTML in-paragraph content then that's what you require) and allowing arbitrary attributes on the containing element (see <xsd:anyAttribute>). It's also often a good idea to specify that there are constraints on what namespace the arbitrary elements can come from (##other being the most useful since it stops uncontrolled recursion in your schema).
So, apart from reviewing whether you've got the details right, you're probably best not trying to handle absolutely anything. Just make sure that your container element is defined right for your actual purpose.
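A minimal sketch of such a container, pulling together the points above (mixed content, arbitrary attributes, ##other). The target namespace urn:example:args is made up for illustration; some target namespace is needed for ##other to exclude anything.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:args"
           xmlns="urn:example:args"
           elementFormDefault="qualified">
  <!-- The one element whose name the schema controls -->
  <xs:element name="Arguments">
    <!-- mixed="true" allows text between the child elements -->
    <xs:complexType mixed="true">
      <xs:sequence>
        <!-- Zero or more elements from any namespace other than the target one;
             lax means they are validated only if a declaration is available -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Arbitrary attributes on the container itself -->
      <xs:anyAttribute namespace="##any" processContents="lax"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With ##other the children have to live in a different namespace from the container, so a nested Arguments element from the same namespace is rejected. If unqualified children must be accepted too, ##any is probably the setting to use instead.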

Related

Jaxb Generates Objects for Unused Elements from Imported Schema

I have several schemas that inherit one or more elements from a collection of 'common' schemas. In this particular instance, I'm importing one of these schemas to make use of a single complex type defined in it.
When I generate the Java objects from the schema, I get my schema types and the element I referenced, as expected. However, I also get objects generated for the 30+ other types from the common schema.
I want to use the common schema, because I want to rely on automated builds for updating my schema when the common schema changes, but I do not want the extra Java classes generated.
Suggestions?

There's no out-of-the-box approach to achieve what you want. The reason I am offering an opinion here is rather to point out (maybe for others) some issues one needs to take into account no matter which route one takes.
The 'extra' label is not always straightforward. Substitution group members are interesting. In Java, think about a class (A) using an interface (I), and a class (B:I) implementing (I). Some may say there's no dependency between A and B, while others would require B in the distribution. If you replace (I) with a concrete class, things become even less clear - consider that the substitution group head doesn't need to be abstract; or if the type of the substitution group head is anyType (Object in Java).
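To make the A / I / B analogy concrete in schema terms, here is a minimal sketch. The element names simply mirror the letters above and are otherwise hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- I: the substitution group head; not abstract, typed as anyType -->
  <xs:element name="I" type="xs:anyType"/>

  <!-- B: a member of I's substitution group; may appear wherever I is referenced -->
  <xs:element name="B" type="xs:string" substitutionGroup="I"/>

  <!-- A: its content model references only I and never mentions B -->
  <xs:element name="A">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="I"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Nothing in the declaration of A names B, yet an instance of A may legitimately contain a B element. That is why a tool cannot safely decide on its own that B is "extra".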
Moreover, if the XML processing was designed to accommodate xsi:type, then it is even harder to tell (by looking at the schema) what is expected to work where.
Tools such as QTAssistant (I am associated with it) have a default setting that will pull in all strict dependencies (A and I above); and either ALL that might work (B above), or nothing else. Anything in between, the user needs to manually define what goes in the release. This is called automatic XSD refactoring and could be used easily in your scenario.

Why does complexType need to be avoided?

I came across this list of W3C XML Schema: DOs and DON'Ts and the part that says DO NOT use complex types kind of surprises me.
I don't find any trouble in using <xs:complexType name="SomeNewType"> and I don't see why using <xs:group name="someNewElement"> is better than using a complexType.
Should complexType really be avoided?
If so, why? What is so problematic about it?
What should be used instead?

For almost any language sufficiently complex and powerful to be interesting, different people will have different views on the best way to use that language.
You ask "should complexType really be avoided?" As you've seen, in the document you point to Kohsuke Kawaguchi (who was a member of the working group that designed XSD and has written several important XSD tools, including one of the first XSD validators) advises users to avoid certain constructs in XSD, including complexType, often on the grounds that the ratio of their advantages to their disadvantages (in particular: their complexity) is poor. You can (and indeed you must) make up your own mind on the persuasiveness of KK's arguments.
Since (as far as I can see) your question amounts to asking "is KK right?", I gather that you don't feel entirely comfortable making up your own mind, or at least that you are curious and want to know what others think.
In general, it's safe to say that many users of XSD, and many creators of XSD tools, disagree with KK. They find the ability to derive complex types from other complex types helpful, and they use it.
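For readers who have not seen the two styles side by side, a minimal sketch follows. The type and group names come from the question; the element names (id, note, item) and ExtendedType are made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Style 1: a named complex type, reused by derivation -->
  <xs:complexType name="SomeNewType">
    <xs:sequence>
      <xs:element name="id" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ExtendedType">
    <xs:complexContent>
      <!-- Extension appends note after the inherited content -->
      <xs:extension base="SomeNewType">
        <xs:sequence>
          <xs:element name="note" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Style 2: a named model group, reused by reference -->
  <xs:group name="someNewElement">
    <xs:sequence>
      <xs:element name="id" type="xs:string"/>
    </xs:sequence>
  </xs:group>
  <xs:element name="item">
    <xs:complexType>
      <xs:sequence>
        <xs:group ref="someNewElement"/>
        <xs:element name="note" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The first style creates a type hierarchy: an ExtendedType can stand in for a SomeNewType (for example via xsi:type). The second only shares a content model and creates no relationship between types, which is roughly the simplicity KK is arguing for.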
It's equally safe to say that many others agree with KK that this or that construct of XSD is too complex and is better avoided. Some who feel this way write documents describing "best practices" which lay out constructs to use and constructs to avoid; sometimes the content of such documents is useful and sometimes not. Many who agree most fervently with KK about XSD's complex types do not use XSD at all, when they can avoid doing so: they prefer Relax NG (or in some cases, DTDs or Schematron).
In other words: the opinions of qualified observers vary, as do those of unqualified observers. I know some very smart people who disagree with KK on this question. And I also know that KK himself, and some of those who agree with him, are also very smart. I have no intention of choosing a side here.
Finally, you ask "If it is problematic, why does it exist in the language construct in the first place?" Any construct in any language exists because those who designed the language thought it was worth including. In the case of XSD and complex types, that means: complex types are in XSD as nameable objects because essentially everyone in the responsible working group wanted them in the language. This does not prevent them from being problematic: like everyone else, people of great technical skill can make mistakes, and groups that must seek consensus can under the right circumstances produce designs that no one really likes very much (but which everyone prefers to the alternative of not having the construct in question at all; sometimes compromise is the only way to get something done).
I hope this helps.

XML Schema element required depending on enumeration value

Is there a way to require an element depending on the enumeration value entered for another element?
Basically, what I'm trying to do is have a user interface type defined by an enumeration. Depending on the interface type, some fields may or may not be required.
I understand this could be achieved either by making the elements optional and handling the conditional logic in code, or by making different complex types for every possible interface type. However, I want the rules to be apparent to anyone reading the schema, so the code-based solution wouldn't be ideal, and adding a ton of complex types, even though they share most of their required fields, would add a lot more to the XML parsing logic.
Is it possible to have all this logic contained in the schema to simplify validation and parsing?

This kind of conditional mandatory/optional element inclusion is not possible with XSD 1.0. (XSD 1.1 adds assertions and conditional type assignment, but it needs a validator that supports 1.1.)
You can do that using RelaxNG.
You will have a good technical data interface with this kind of schema (describing exactly the structure you want, with a tool like Jing for validation).
And then, if you want to use object mapping, you can use Trang to convert your RelaxNG schema to an XML Schema (the XML Schema will be a little looser, the true data interface being described in RelaxNG).
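A minimal sketch of what the RelaxNG side might look like, in the XML syntax. The element names (interface, type, baudRate, host, port) and the enumeration values (serial, network) are invented for illustration, since the question does not show its actual vocabulary.

```xml
<element name="interface" xmlns="http://relaxng.org/ns/structure/1.0">
  <choice>
    <!-- When type is "serial", baudRate is required -->
    <group>
      <element name="type"><value>serial</value></element>
      <element name="baudRate"><text/></element>
    </group>
    <!-- When type is "network", host is required and port is optional -->
    <group>
      <element name="type"><value>network</value></element>
      <element name="host"><text/></element>
      <optional>
        <element name="port"><text/></element>
      </optional>
    </group>
  </choice>
</element>
```

Each branch of the choice pairs one enumeration value with the fields it requires, so the rule is visible directly in the schema. An equivalent XML Schema generally has to approximate such a choice, which is presumably where the looseness of the converted schema comes from.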

Writing annotation schemas for Callisto

Does anybody know where I can find documentation on how to write annotation schemas for Callisto? I'm looking to write something a little more complicated than I can generate from a DTD -- that only gives me the ability to tag different kinds of text mentions. I'm looking to create a schema that represents a single type of relationship between five or six different kinds of textual mentions (and some of these types of mentions have attributes that I need to assign values to), and possibly having a second type of relationship between the first two instances of the first type of relationship.
(Alternatively, does anybody know of any software that would be better for this kind of schema? I've been looking at WordFreak, but it's a little clumsy, and it doesn't support attributes on its textual mentions.)

I created an XML DTD with one XML tag for each type of textual mention (and attributes on the tags to indicate the attributes), and used an "id" attribute on every tag that the annotator has to fill in manually (a monotonically increasing integer). Then I used the DTD schema generator to create a Callisto schema.

Conventions for annotating appinfo in xml-schema?

I believe the three below are syntactically correct; but which are permitted according to conventions (especially in the enterprise)?
The first one below is used in most examples I've seen (e.g. JAXB), but it's verbose:
<xs:annotation>
  <xs:appinfo>
    <myinfo>don't panic</myinfo>
  </xs:appinfo>
</xs:annotation>
This second one below is allowed because any attributes are permitted on <appinfo> (provided they aren't in XML Schema's own namespace). It's shorter, and seems reasonable - but is it conventional?
<xs:annotation>
  <xs:appinfo myinfo="don't panic"/>
</xs:annotation>
This last one below is my favourite, because it's so short and doesn't clutter up the schema. I'm sure it's legal because, as with <appinfo>, any attributes are permitted on an <annotation> (again, provided they are not in the XML Schema namespace). But it encodes application info without using an <appinfo>, so I'm afraid it would be frowned upon. Would it be?
<xs:annotation myinfo="don't panic"/>
Many thanks for your comments!

Interesting examples of a somewhat "standard" usage of <xs:appinfo> are at http://docstore.mik.ua/orelly/xml/schema/ch14_02.htm and at http://docstore.mik.ua/orelly/xml/schema/ch15_01.htm#ch15-77057
Note that <xs:appinfo> should contain information to be processed by an application, not human readable notes (use <xs:documentation> for this).
I don't think that you are hearing anything because I don't think that these annotations are that commonly used, and the whole point of the spec is to allow anyone to put anything that helps them in there.
The only convention I've seen is XHTML documentation, which of course doesn't look like any of your options.
You might look into whether processing libraries like Apache XmlSchema treat all of your options equivalently.
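One detail worth checking for the two attribute-based variants: as far as I can tell, the schema for schemas opens <annotation> and <appinfo> to extra attributes with an ##other wildcard, and in XSD 1.0 that wildcard does not match unqualified attributes. A strict processor may therefore expect the attribute to be namespace-qualified. A minimal sketch, assuming a made-up namespace urn:example:myinfo and a placeholder element named answer:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:my="urn:example:myinfo">
  <xs:element name="answer" type="xs:int">
    <xs:annotation>
      <!-- The attribute carries the my: prefix, so it is in a non-schema namespace -->
      <xs:appinfo my:myinfo="don't panic"/>
    </xs:annotation>
  </xs:element>
</xs:schema>
```

The same prefixing would apply to the shortest variant, with the attribute placed directly on <xs:annotation>.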

Resources