Referential integrity in XML files without globally unique IDs

Maybe I'm not seeing the forest for the trees, but here it goes:
I'm "designing" an XML document and have so far come up with something like the following:
<element key="root">
  <data>...</data>
  <elements>
    <element key="foo">
      <data>...</data>
    </element>
    <element key="bar">
      <data>...</data>
    </element>
  </elements>
</element>
So it's a simple hierarchical structure. What I want to do now is have references from one element to any other element anywhere in the hierarchy. That would be trivial if each element had a unique ID, but they don't. So far I only plan on guaranteeing that each element's key is unique within its level (much like file names in a directory structure).
In other words, if I had fully qualified keys such as root.foo, guaranteeing referential integrity would be simple. But then I'd be storing redundant information (I already know that foo is a sub element of root, why store that information twice?).
I realize that this is essentially a cosmetic problem. One of the simplest solutions is probably to just auto-assign IDs and be done with it. But this is fairly inelegant (and error-prone unless you have a nice front end for editing the file), so I was hoping for a nicer way to do it.
Is there a way to implement this in XML Schema?

Use <xs:key> and <xs:keyref>.
Keys are unique within a specified context, so they don't need to be globally unique like IDs. <xs:key> contains an <xs:selector> element that specifies the scope or context of the key (key values must be unique across this set) and one or more <xs:field> elements that define the key nodes. A key can have multiple fields, in which case their combination must be unique. <xs:key> and <xs:keyref> are declared inside an <xs:element> declaration.
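As a minimal sketch of how that might look for the document in the question: the schema layout, the constraint names siblingKey and siblingRef, and the ref element with its to attribute are all assumptions made for illustration, not part of the original question.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global, recursive declaration: every <element> is validated against it -->
  <xs:element name="element">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="data" type="xs:string"/>
        <!-- Hypothetical reference element, e.g. <ref to="bar"/> -->
        <xs:element name="ref" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="to" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="elements" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element ref="element" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="key" type="xs:string" use="required"/>
    </xs:complexType>
    <!-- The key attribute must be unique among the direct children only -->
    <xs:key name="siblingKey">
      <xs:selector xpath="elements/element"/>
      <xs:field xpath="@key"/>
    </xs:key>
    <!-- A ref inside a child must name the key of one of its siblings -->
    <xs:keyref name="siblingRef" refer="siblingKey">
      <xs:selector xpath="elements/element/ref"/>
      <xs:field xpath="@to"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

Because the constraints sit on the recursive element declaration, they are evaluated separately at every level, so foo may appear under several different parents. This sketch only resolves references between siblings; references to arbitrary levels of the hierarchy would need a different selector, which it does not attempt.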

Related

OWL: only one property among many properties exists

I want to express the xs:choice element from XSD in OWL:
XML Schema choice element allows only one of the elements contained in the declaration to be present within the containing element.
I think maybe I should first define a property group in OWL, and then specify that only one of the properties in the group is allowed to exist. Any help?

There's no notion of "property group" in OWL, but you could get a similar effect using subproperties and disjoint properties. For instance, you could have a property hierarchy like this:
hasVehicleChoice
hasCar
hasTruck
Then, you can declare that hasCar and hasTruck are disjoint. That means that an individual can't have the same value for both properties, so you can't say:
x hasCar vehicle72
x hasTruck vehicle72
That's not enough to say that they can't have different values, though. You could still have
x hasCar vehicle72
x hasTruck vehicle75
To avoid that, you could make hasVehicleChoice be a functional property (meaning each individual has 0 or 1 values for it, but no more), or use a subclass axiom with a restriction, like
Person subClassOf (hasVehicleChoice exactly 1)
Then, each person would have exactly one vehicle choice, and since hasCar and hasTruck are disjoint, the person can't have both.
All that said, this isn't a common pattern in OWL ontologies, and there's not a particularly convenient way of encoding it. If you don't need it all that often, you might be better off just using the subclass axioms and property restrictions directly. E.g.,
Person subClassOf ((hasCar exactly 1) and (hasTruck exactly 0)) or ((hasCar exactly 0) and (hasTruck exactly 1))

Different XSD for Create vs Update operations

Consider the following object:
<Person>
  <Name/>
  <FirstName/>
  <Street/>
  <City/>
  <SocialSecurityNr/>
  <Gender/>
  <Hobby/>
</Person>
Assume I use this object for a Create web service operation. When calling the Create operation, all fields of the Person object must be provided except Hobby.
Now assume that I also have an Update operation. When updating, only SocialSecurityNr is mandatory: I do not need to update each field.
How do you handle this in an XSD? Should you define separate XSDs for the Create and Update operations?
The reason I want to make this distinction is that I do not want to send unnecessary fields from sender to recipient when they are not needed. Hence, I want to use minOccurs = 0 as much as possible.
It feels like this is a common problem, but I can't find any references about it.

It's really up to you. I've seen both approaches.
If I want a comprehensive description of the service contract, I would go with separate operations and just use common data types, defined in the XSD wherever needed.
The other option is to share one type and use minOccurs=0 on every field that is optional in either operation. This will make the XSD shorter and somewhat more flexible, but also more open to interpretation (which, when describing a service, does more harm than good). If you need to give the WSDL to someone so they can consume your service in the future, this will require more description of the different use cases (create/update) in the documentation. The first approach is more straightforward and intuitive for developers.
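A rough sketch of the separate-contract approach, using the Person fields from the question. The type names, the request element names, and the choice of xs:string for every field are assumptions for illustration only.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Create: everything except Hobby is mandatory -->
  <xs:complexType name="PersonCreateType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
      <xs:element name="FirstName" type="xs:string"/>
      <xs:element name="Street" type="xs:string"/>
      <xs:element name="City" type="xs:string"/>
      <xs:element name="SocialSecurityNr" type="xs:string"/>
      <xs:element name="Gender" type="xs:string"/>
      <xs:element name="Hobby" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Update: only SocialSecurityNr is mandatory -->
  <xs:complexType name="PersonUpdateType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" minOccurs="0"/>
      <xs:element name="FirstName" type="xs:string" minOccurs="0"/>
      <xs:element name="Street" type="xs:string" minOccurs="0"/>
      <xs:element name="City" type="xs:string" minOccurs="0"/>
      <xs:element name="SocialSecurityNr" type="xs:string"/>
      <xs:element name="Gender" type="xs:string" minOccurs="0"/>
      <xs:element name="Hobby" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical request elements, one per operation -->
  <xs:element name="CreatePersonRequest" type="PersonCreateType"/>
  <xs:element name="UpdatePersonRequest" type="PersonUpdateType"/>
</xs:schema>
```

The duplication is the price of this design: each operation's requirements can be read directly from its own type, with no extra prose in the documentation.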

I always use a common structure for create and update.
When a data-carrying element is declared in an XSD, it should have the attribute nillable set to true, e.g.:
<xsd:element name="Result" type="xsd:string" nillable="true"/>
This allows for the xsi:nil attribute to be applied to the element.
[1]
<tag>Data</tag>
<tag xsi:nil="false">Data</tag>
<tag xsi:nil="">Data</tag>
Tag is present and includes data. There may be an empty xsi:nil attribute or the attribute may not be present.
The target application needs to update the field with the specified data.
[2]
<tag></tag>
<tag xsi:nil=""></tag>
<tag xsi:nil="false"></tag>
<tag/>
<tag xsi:nil=""/>
Tag is present and is self closing or does not include data. There may be an empty xsi:nil attribute or the attribute may not be present.
The target application needs to update the field to zero length data. E.g. an empty string.
Some applications may update the data to null.
[3]
<tag xsi:nil="true">Data</tag>
<tag xsi:nil="true"></tag>
<tag xsi:nil="true"/>
Tag has the xsi:nil attribute set to true. It may or may not contain data and it may or may not be self closing.
The corresponding field should be updated to null.
[4]
Tag is missing from the XML.
No update should occur on the corresponding field.
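Put together, an update message following this convention might look like the sketch below. The field values are made up, and it assumes each field is declared with nillable="true" and minOccurs="0". It sticks to forms that a schema validator accepts: xsi:nil is an xs:boolean, and an element with xsi:nil="true" must be empty.

```xml
<Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- [2] present but empty: set Street to an empty string -->
  <Street></Street>
  <!-- [1] present with data: set City to the given value (hypothetical value) -->
  <City>Springfield</City>
  <!-- identifies the record to update (hypothetical value) -->
  <SocialSecurityNr>000-00-0000</SocialSecurityNr>
  <!-- [3] nilled: set Hobby to null -->
  <Hobby xsi:nil="true"/>
  <!-- [4] Name, FirstName and Gender are absent: leave them unchanged -->
</Person>
```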

Require that an element has another element as a descendant

Is there a way in an xsd schema to require that an element have another element somewhere as a descendant?
For example, element parent requires a descendant desc. This is valid:
<parent>
  <a>
    <b>
      <desc></desc>
    </b>
  </a>
</parent>
As is this:
<parent>
  <c>
    <desc></desc>
  </c>
</parent>
but this isn't:
<a>
  <parent>
    <b/>
  </parent>
</a>
The potential child elements for parent are many and complicated, so it would be difficult to enumerate every possible valid configuration.
Something like the key/selector schema elements seems like it would work, where I could provide an XPath expression defining the valid locations for the desc element, but all of the examples I've found are aimed at matching up the values of attributes.

No, (almost) all XML Schema validation is shallow, called "local" in the spec. Here's one excerpt that emphasizes type validation as "local" validation.
"Element Validated by Type: If an element information item is ·valid· with respect to a ·type definition· as per Element Locally Valid (Type) (§3.3.4), [it is marked as] ·validated·."
The only exception is for the identity constraints like uniqueness and key-references which have a broad scope in an XML document but narrow uses.

I don't know if XSD supports what you are trying to do, but there is a work-around.
You could do complex validations with a two-step process:
First, simply use your XSD schema for basic validation.
Next, use an XSLT stylesheet that performs the more complex validations and outputs the result of that validation.
This may not plug in well to whatever framework you are working with, but might work well for (partially) custom code. It also has the advantage (over doing the extra validations in code) that you can publish both documents.
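For the descendant requirement in the question, the second step could be as small as the following XSLT 1.0 sketch. The error message text and the plain-text output format are assumptions. An empty result means the document passed.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Report every parent element that has no desc descendant -->
    <xsl:for-each select="//parent[not(.//desc)]">
      <xsl:text>ERROR: parent element without a desc descendant&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```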
From a quick Google search, one effort towards this end is Schematron. It actually forgoes XSD entirely and is typically implemented on top of XSLT. It appears to be a published standard:
http://www.schematron.com/
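A Schematron rule for the same requirement might look roughly like this. It is a sketch that assumes the ISO Schematron namespace and elements without a namespace, as in the question.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- The rule fires for every parent element, wherever it occurs -->
    <sch:rule context="parent">
      <sch:assert test=".//desc">
        A parent element must contain a desc element somewhere below it.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```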

An XSD attribute to capture "source data field"

I have a domain model which is intended to generalise several source systems. As such, in certain cases the decision was made to overload data into a new generic field rather than to create several specific fields.
To account for this, when the source systems' data is mapped onto the new domain model, I was hoping to record the source field name as an attribute, e.g.:
<Event>
  <Description sourceField="subject">...</Description>
  <Description sourceField="description">...</Description>
  <Description sourceField="issue">...</Description>
  <...>
</Event>
What would be the appropriate way to add such an attribute into the XSD? Would I need to specifically attach it to every such overloaded field, or is there a general way to allow an attribute across all elements?
Please don't point out that I should just add the extra fields into the domain model if I need to distinguish between the different data - the decision has been made, I just need to work around it!
Thanks in advance.

Not really.
If all your element declarations extend from a common base type definition, then you can add the attribute to the base.
If all your element declarations include an anyAttribute, you can make a global attribute declaration for sourceField. The validator would then at least allow your attribute, though not require it. And if the anyAttribute's processContents is strict or lax, the validator will also check that the attribute's value is valid.
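A minimal sketch of the base-type option, assuming the overloaded fields hold plain strings. The type name SourcedString is made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Shared base type: string content plus an optional sourceField attribute -->
  <xs:complexType name="SourcedString">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="sourceField" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:element name="Event">
    <xs:complexType>
      <xs:sequence>
        <!-- Every overloaded field uses (or extends) the shared type -->
        <xs:element name="Description" type="SourcedString"
                    maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The attribute still has to reach each overloaded field through its type, but it is only defined once.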

XML Schema: How to validate an attribute with multiple keys concatenated?

Let's say I can get XML like this:
<Property Name="Title"/>
<Property Name="Content"/>
<Property Name="Address"/>
<Source properties="Title,Content,Address"/>
How could I validate the "properties" attribute of "Source", so that any combination of the "Property" items listed above is accepted? (For example, "Title" and "Title,Content" are correct, while "Title, URL" is not.)

You can't do that within XML Schema. You can do it with your own higher level of validation based on XSLT, XQuery or Schematron, for example.
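With Schematron, for instance, the check could be sketched as follows. It assumes an implementation that supports the XPath 2.0 query binding (tokenize and quantified expressions are not available in XPath 1.0), and that the Property and Source elements share a parent.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="Source">
      <!-- Each comma-separated token must equal the Name of a sibling Property -->
      <sch:assert test="every $p in tokenize(@properties, ',')
                        satisfies $p = ../Property/@Name">
        Every name listed in properties must match a declared Property.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Tokens are compared as-is, so a value with a space after the comma, such as "Title, URL", is rejected.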

xan is right: validating always means matching an XML file against a given schema. But no schema is involved here. Your problem is instead to read a data file and validate later entries against earlier ones (if the box above is supposed to represent one file), or one data file against another (if the gap is supposed to be a file separator). Beyond that, a schema defines the structure of elements and attributes and, optionally, data types (specific values only if there is a strict enumeration of valid values). That doesn't match either: you want to verify data against data. Sorry, a schema is the wrong tool for this problem.
