I have an XML Schema referencing two other schemas where the same element is declared twice resulting in an invalid schema.
Is there any built-in XSD construct that lets me handle this by ignoring one occurrence, or is this situation just fundamentally wrong?
That's fundamentally wrong. A schema is specifically intended to make things clear and unique. You need to resolve the duplicate declaration somehow: XSD has no way of ignoring something that's in the schema, and everything in the schema file must be valid.
I am using a JSON Schema to validate a file. This is somewhat similar to an XSD for XML.
I have a few questions concerning the id field.
Does the schema still work without a network connection?
Should the URL in the id be accessible from a web browser? That is, if 'id' = "https://example.com/question", does this mean that we should be able to access the schema from a browser by going to https://example.com/question?
I am a bit lost on this subject. I know that it is best practice to have an id property as a unique identifier for every schema, and that this gets most useful when creating a complex schema with different schemas that reference each other.
But I am not sure if we need to assign a URL to the id field or not. And I'm also lost concerning the implication of having this URL for the schema.
Thank you very much for your help
The main purpose of id ($id since draft-06) is to set the base URI that $ref values are resolved against.
$id does not have to be an existing HTTP resource. An identified schema can even be embedded inside another schema (example in spec test suite).
The JSON Schema spec expects a validator to be able to resolve references based on the $ids defined in the current schema. Remote references should also be resolved, but the spec places no restrictions on how exactly that happens.
In many cases network access during validation is unwanted because of the latency it adds. Most implementations let you preload or register schema resources by $id explicitly before validation.
According to the spec, the root schema SHOULD have an $id that is an absolute URI, but whether it is reachable by an HTTP client is up to you and your validator.
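To illustrate the preloading idea, here is a minimal sketch in plain Python (standard library only). It is not the API of any particular validator: the two schemas, the example.com URIs, and the helper names build_registry and resolve_ref are all hypothetical. It only shows that a $ref can be resolved against the enclosing $id and looked up in memory, without any network access.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical schemas. The example.com URIs are identifiers only; nothing is fetched.
ADDRESS_SCHEMA = {
    "$id": "https://example.com/schemas/address.json",
    "type": "object",
    "required": ["city"],
}
PERSON_SCHEMA = {
    "$id": "https://example.com/schemas/person.json",
    "type": "object",
    "properties": {"home": {"$ref": "address.json"}},
}


def build_registry(*schemas):
    """Index preloaded schemas by their absolute $id."""
    return {schema["$id"]: schema for schema in schemas}


def resolve_ref(ref, base_id, registry):
    """Resolve a $ref against the enclosing schema's $id, entirely offline."""
    absolute, _fragment = urldefrag(urljoin(base_id, ref))
    return registry[absolute]  # a KeyError means the schema was not preloaded


registry = build_registry(ADDRESS_SCHEMA, PERSON_SCHEMA)
target = resolve_ref(
    PERSON_SCHEMA["properties"]["home"]["$ref"],
    PERSON_SCHEMA["$id"],
    registry,
)
```

Here the relative reference "address.json" is resolved against the $id of the person schema, so target is the preloaded address schema. A real validator does more (JSON Pointer fragments, nested $id scopes), but the lookup-by-identifier principle is the same.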
$id is only defined to be a URI.
http://json-schema.org/draft-07/json-schema-core.html#rfc.section.8.2
See RFC-3986 Uniform Resource Identifier (URI): Generic Syntax
https://www.rfc-editor.org/rfc/rfc3986
"A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource."
A nice write-up by Daniel Miessler gives a clear explanation of the nature of a URI, which can be just a URN but may also be a valid URL:
https://danielmiessler.com/study/url-uri/
I Googled this question but I'm still unable to find a clear explanation of the difference between a simple XSD (XML Schema Definition) element and a complex XSD element.
Any guidance would be highly appreciated.
I have no idea why I'm answering this, but...
To summarize,
Simple types can only have content directly contained between the element’s opening and closing tags. They cannot have attributes or child elements.
Complex types can have attributes, can contain other elements, can contain a mixture of elements and text, and so on.
One is a single value and the other a compound value.
My XML files have restrictions on the child elements, but it really doesn't matter what the name of the root element is. How can I incorporate this into my XSD? I've tried using <xs:any> but I get:
"s4s-elt-invalid-content.1: The content of 'schema' is invalid. Element 'any' is invalid, misplaced, or occurs too often."
So I tried leaving the name off the element tag, like this: <xs:element>, but then I get:
"s4s-att-must-appear: Attribute 'name' must appear in element 'element'."
Use a named type, and tell your validator to start validation at the root element using that type.
(There is one possible hitch with this: XSD 1.0 suggests that as one possible invocation option, but does not require validators to provide it, so there's no guarantee the validator interface you use will support it. Depends on your validator. Worth trying, at least.)
Another way to put this: you already have what you are asking for, because your XSD schema never cares what the root element of your document instance is called. An XSD schema provides a set of element and type declarations (among other things). A validator can be requested to start the validation at any point in the document, not just the root, and with either an element declaration or a type declaration, or in 'lax wildcard mode' (the most common default). If your validator doesn't offer the invocation options you want, it's a flaw in your choice of validator, not a gap in XSD.
I think I might just make the requirement stricter and insist on using a particular tag as the root element. The fact that the application doesn't care is not really a problem.
It seems (to me) strange that this limitation exists, but I am new to XSDs.
Can I express this in an XSD?
For example:
One element is a required boolean element named EmployedMoreThanThirteenWeeks. If its value is false, I want the schema to require another element named EmploymentDate. Conversely, if the value is true, the EmploymentDate element should ideally be disallowed, but I can accept it being optional.
No, not in XSD 1.0. An XSD just defines structure and data types, not relations between values. It is possible to add a key reference between elements, but that won't prevent invalid nodes, just invalid values. (XSD 1.1 adds xs:assert, which can express this kind of co-occurrence rule, but it requires a processor that supports XSD 1.1.)
You can create an XSLT file (an XML Stylesheet) which will validate the XML file for you and thus generate a report of errors.
I think that XSD can't do that, because a schema verifies only the structure (the tree), not the values (though you can check the format of a value).
You should consider other ways of validating this rule.
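As one example of such an alternative, here is a minimal sketch of a check that runs after ordinary schema validation. It is written in Python with the standard library rather than XSLT, and it assumes that both elements are direct, un-namespaced children of the root element; the function name check_employment and the Employee root in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_employment(xml_text):
    """Return a list of rule violations for one document (empty if none)."""
    root = ET.fromstring(xml_text)
    # Assumption: both elements are direct children of the root, without namespaces.
    flag = root.findtext("EmployedMoreThanThirteenWeeks", default="").strip()
    has_date = root.find("EmploymentDate") is not None

    errors = []
    if flag not in ("true", "false", "1", "0"):
        errors.append("EmployedMoreThanThirteenWeeks must be an xs:boolean value")
    elif flag in ("false", "0") and not has_date:
        errors.append("EmploymentDate is required when EmployedMoreThanThirteenWeeks is false")
    elif flag in ("true", "1") and has_date:
        errors.append("EmploymentDate is not allowed when EmployedMoreThanThirteenWeeks is true")
    return errors
```

For instance, calling check_employment on "<Employee><EmployedMoreThanThirteenWeeks>false</EmployedMoreThanThirteenWeeks></Employee>" reports the missing EmploymentDate. If you would rather keep EmploymentDate optional in the true case, drop the last branch.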
Suppose we have a stylesheet which pulls in metadata using the key() function. In other words we have instance documents like this:
<items>
  <item type="some_type"/>
  <item type="another_type"/>
</items>
and a table of additional data we would like to associate with items during processing:
<item-meta>
  <item type="some_type" meta="foo"/>
  <item type="another_type" meta="bar"/>
  <item type="yet_another_type" meta="baz"/>
</item-meta>
Finally, suppose we want to do schema validation on the instance document, restricting the type attributes to the set of types which occur in item-meta. So in the schema we want to use key/keyref instead of restriction/enumeration. This is because using restriction/enumeration will require making a separate list of valid type attributes.
However, it doesn't look like key/keyref will actually work. Having tried it (with MSXML 6.0), it appears the selector of a schema key won't accept the document() function in its XPath argument, so we can't examine the item-meta data, whether it appears in an external file or in the schema file itself. It looks like the only place we can look for keys is the instance document.
So if we really don't want to have a separate list of valid types, we have to do a pre-validation transform, pulling in the item-meta stuff, then do the validation, then do our original transform. That seems overcomplicated for what ought to be a relatively straightforward use of XML schema and stylesheets.
Is there a better way?
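Selectors in key/keyref allow only a very restricted XPath syntax. In short, though not completely accurately: the selector must point to a subnode of the element being declared.
The full definition of the restricted syntax is given in the XML Schema Part 1: Structures specification, in the section on identity-constraint definitions.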
So, no I don't see a better way, sorry.
BTW: The W3C states that this restriction was made to make life easier for implementers of XML Schema processors. Keep in mind that one of the design goals of XML Schema was to make it possible to process a document in streaming mode. That explains a lot of the sometimes seemingly arbitrary restrictions of XML Schema.
Having thought about it a little more, I came up with the idea of having the stylesheet do that part of the validation. The schema would define the item type as a plain string, and the stylesheet would emit a message and stop processing if it couldn't look up the item type in the item-meta table.
This solution fixes the original problem of having to write down the list of valid types more than once, but it introduces the problem that validation logic is now mixed in with the stylesheet logic. I don't have enough experience with XSD+XSLT to tell whether this new problem is less serious than the old one, but it seems to be more elegant than what I wrote earlier about pulling the item-meta table into each instance document in a pre-validation transform.
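The lookup itself is simple. The sketch below shows the same check in Python with the standard library instead of XSLT, purely to make the logic concrete; it is not the stylesheet described above. The function name unknown_item_types is hypothetical, and the item-meta table is inlined as a string here, whereas in practice it would be read from wherever the stylesheet reads it.

```python
import xml.etree.ElementTree as ET

# The item-meta table from the question, inlined for the sake of the example.
ITEM_META = """<item-meta>
  <item type="some_type" meta="foo"/>
  <item type="another_type" meta="bar"/>
  <item type="yet_another_type" meta="baz"/>
</item-meta>"""


def unknown_item_types(instance_xml, meta_xml=ITEM_META):
    """Return the type attributes in the instance that item-meta does not list."""
    known = {item.get("type") for item in ET.fromstring(meta_xml).iter("item")}
    return [
        item.get("type")
        for item in ET.fromstring(instance_xml).iter("item")
        if item.get("type") not in known
    ]
```

An empty result means every item type was found in the table; a non-empty result is the point at which the stylesheet would emit its message. The list of valid types still lives in one place only, the item-meta table.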
You wouldn't need to stop the XSLT with some error. Just let it produce something that the schema won't validate and that points to the original problem like
<error txt="Invalid-Item-Type 'invalid_type'"/>
Apart from that please keep in mind that there are no discussion threads here. The posts may go up and down, so it's better to edit your question accordingly.
Remember, the philosophy here is "One question, and the best answer wins".