Splitting huge XSDs? - xsd

I have this big huge XSD which contains many complex elements.
I want to split this huge XSD into smaller XSDs, containing the complex elements as standalone XSDs.
Are there any tools for this?

Altova's XmlSpy doesn't provide an automated splitter but its graphical interface does allow you to copy/paste your types and elements easily from one XSD to another.

Related

Visual Paradigm: Underline togetherness in class diagram

I have to create a class diagram with different design patterns inside. I would like to underline the togetherness of the classes belonging to the same design pattern.
What is the "official" or best way in UML to achieve this? I have to use Visual Paradigm and found a solution using packages.
But these packages seems to be very unhandy. I need something more freeform like. I think about just using different background colors for the different patterns.
The coloring of element is not the concern in UML as it won't affect the model structure and their relationships. Specifying different color is a good way for visualize categorize your elements in diagram level (same element on different diagram can have different presentation setting). If you want to categorize the elements in model level then you may consider to use stereotypes plus stereotype based presentations so that elements applying the stereotype can automatically inherit the presentation settings (see https://www.visual-paradigm.com/tutorials/stereotypeappearance.jsp)

Why is Dita's schema split into topic's and map's?

I'm new to Dita, so I apologize for any ignorance.
I'm using XJC to compile the base (and only the base) Dita 1.3 schema into Java classes. When I attempted to compile all the XSD files, I received errors with elements and groups being redefined. None of the XJC bindings I attempted to write would fix it.
After digging through the schema, I found that mapGrp.xsd/mapMod.xsd and topicGrp.xsd/topicMod.xsd contained the same group and element definitions. This explains why XJC would fail when including all of the XSD files. The XSD parser itself cannot handle these duplicate entries.
So I generated basemap.xsd and basetopic.xsd separately and cleaned up the generated code so I could run a diff against the two directories.
I found that the two schema's have some elements specific to maps and topics. For example, the map schema has DitavalmetaClass and DvrKeyscopePrefixClass while the topic schema doesn't. And the topic schema contains AbstractClass and BodyClass while the map schema doesn't. But the majority of classes are shared between the two schema's.
As for the classes that are shared, there are only three that have some differences between the two schema's (LinktextClass, MetadataClass, and SearchtitleClass). Even then, they aren't big changes, just some differences in what they can contain.
My question is, why couldn't the shared classes go under one common Grp/Mod schema that's shared between topic's and map's and redefine those three classes? Can I change the two schema's so they share the same elements and groups without breaking any of the other schema's that extend the base schema?
Maps and topics are two distinct document types. They share some element types in common but are otherwise completely independent document types.
Note also that DITA does not have "a single grammar" in the way that other XML applications do (or appear to do).
DITA is explicitly architected to allow controlled extension from the base grammars so that you can do any of the following:
Configure a given map or topic type to include or exclude specific elements, either other topic types (in the case of topics) or specific element "domains" (sets of "mix-in" elements). Thus there can be two different working grammars for "topic" documents that allow different sets of elements.
Add "constraint" module that restrict existing content models or attribute lists in some way (for example, disallow a base element type in a specific context).
Define your own new element types and attributes via "specialization"
Thus any attempt to generate things like Java classes or database schemas for DITA in any sort of static way are doomed to fail in the general case.
If you are implementing code that needs to operate on any conforming DITA document then it needs to be more flexible and operate on elements in terms of their #class values, not their tag names.
If it is sensible for you to generate Java classes from the XSDs you must treat each top-level map and topic type (map, bookmap, subjectScheme, learningMap, topic, concept, task, general-task, reference, glossentry, etc.) as a distinct class hierarchy--you cannot combine them in a single hierarchy because they will have different content model rules for the same element types.
You should definitely read the DITA Architecture specification:
http://docs.oasis-open.org/dita/v1.2/os/spec/architectural_specification.html#architectural_specification
Cheers,
Eliot
The DITA 1.3 XML Schemas are not manually put together anymore. They are being generated automatically from Relax NG schemas which are the officially supported schemas by the specification. Maybe you should also take a look at the DITA 1.2 XML Schemas, those were manually written and they might be better modeled to reuse more element definitions.

Castor Generated Classes (XML Marshalling) - XSD Unavailable

I recently moved to a project where I noticed there have a specific requirement to store some data as XML.
The prior team used Castor generated classes to Marshall and Unmarshall the data.
I have a new requirement now that requires me to add some additional (yet optional) fields to this XML. However I realized the prior team supposedly never checked in the XSD at all and I have no way to reach out to them.
The XSD for sure was large and complex since it is responsible for generating around 50 classes. So writing the XSD again is going to be error prone and also a risk that I might end up creating XMLs now that are in compatible with the old XML.
The other alternative I thought of was using a tool like XML Spy and try to reverse engineer the XSD from the XML, however that sounds a bit difficult too since I will need to reverse engineer 20 odd XMLs to generate XSDs and then merge all these XSDs into one, since the XML had several optional sections. There is still an element of error possible in this approach.
The best option I can think of is reverse engineering the classes to an XSD - however Castor supposedly does not support this feature. So I don't have the means to convert these Castor generated classes back to an XSD! While the classes generated by Castor do have some Castor specific methods, in essence they are Pojos if the Castor specific methods are ignored!
Do we have any suggestions here for getting or generating the XSD from java classes? Do we have any other suggestions to solve the issues I discussed?
Thank you.
Just an update, while I have not achieved 100% of what I was looking for, I was able to successfully reverse engineer the XSD using JAXB's schemagen tool.
Just note that castor generates an XXXDescriptor with each class since that does NOT map to the actual XSD do not pass the XXXDescriptor classes as input to the schemagen tool.
The schemagen tool works with the getter methods and ignores methods like Castor's validate, marshall and unmarshall.
So things look quite hopeful at this point, compared to the situation I was in when I first posted the question.
Thanks.

What's the convention with types and separate XML schema files?

I'm developing a rather large XML Schema to integrate multiple disparate systems. Each system has messages that consist of multiple fields. I've written the schemas so that each field is represented as a type (either simple or complex), while the entire message is represented as an element with a sequence of multiple types.
When defining types using XSD, what is the convention in terms of how you break up your files?
I've been going down the path of having a separate XSD file for each type that I am defining, then a single XSD file that defines the message and imports the required XSD files.
Here's a simplified example showing the structure (can't post images, no reputation).
Type schemas (fields) have a namespace ending in "types", while message schemas have a namespace ending in "messages".
My reasoning was that it would be easier to compose new types using existing types if they were defined in their own file, and that versioning types might be easier.
Is there a better approach? I'm relatively new to XML and Schemas. Thanks in advance.
It makes no sense to have one schema per type. One schema per domain area, yes, but not one per type.
For instance, you might have one schema to define types related to customers, and another schema to define types related to products. You might have a separate schema to define common types like people and addresses, which might be used by the customer schema and also by the product schema (product sales rep, vendor address).
You might combine these in a schema related to order processing, since it involves orders made by customers for products.
Are the separate xsd files you created completely unrelated to eachother? For example if one xsd file handles the sales domain and the other one handles the payments domain, you should give each schema a unique domain specific targetNamespace. This approach is called "Heterogeneous namespace design" and you are using it now. I believe it is best suited for your requirement. Since you are having multiple disparate systems it is better to maintain them in separate xsd files with different targetNamespaces. This enhances re-usability and reduces redundancy of types/fields. It also prevents namespace collisions. However if you have a large number of imports it becomes a challenge to manage and there will be dependency on the granularity of the types defined in the imported schemas.
If they are related to each other you should go for "Homogeneous namespace design".
There are two other approaches for XML Schema design:Homogenous Namespace Design and Chameleon Namespace Design which may not best suit your specific requirement. You can further get details on the benefits/drawbacks of all these three different schema design approaches at http://technologyandleadership.com/three-schema-design-approaches/

Formatting XSD scheme for peer review

I designed a data model which is represented by an XSD scheme.
The data model also provides the types that are being used as web service parameters in a WSDL descriptor.
I would like to send the XSD scheme around and ask the people involved to peer review the data model.
What tool or presentation method would you suggest to be used as a basis for peer reviews? The data model should be readable for non-skilled people, at least when it comes to the semantic meanings of the parameters
Edit:
To be more specific: Of course, syntactically, the scheme validates. Actually I'm already working on code which is based on JAXB generated classes. My goal is
to freeze the data model and thus
the input parameters
to make sure
nothing got lost or forgotten from a
semantic (in the meaning of
business-relevant) point of view.
Edit 2
I've been thinking about how it probably would be best to spread a datamodel around. I'm thinking of something like a JavaDoc for XSD schemas. Anyone knows if something like that exists? Basically it would be done with a set of XSLTs, right?
I know the following tools that generate documentation from XML Schema files (XSD):
xs3p
XSLT stylesheet that generates single XHTML from XSD
xsddoc
free / LGPL
mainly XSLT based
JavaDoc like output
see xsddoc examples
xnsdoc
improved commercial version of xsddoc
free for personal/educational use
JavaDoc like output
XSDdoc 2.0
commercial
JavaDoc like output
For small a XML schema, I would probably suggest using the xs3p XSLT stylesheet. For more a complex schema, I suggest using xsddoc.
I recommend using the XSD for something. Specifically, show some actual applications, with examples as real code.
Actual applications are what make a schema interesting. The examples don't have to be big, sophisticated or completely realistic. They just have to compile. Other people will want to copy and paste the code samples.
These examples are the "hello world" of the schema. And they act as a kind of unit test for the schema.
The closest thing to Javadoc for an XML schema that I've seen is running the Javadoc tool on source generated from the schema. This requires two things: 1) That your schema has internal annotation elements documenting it, and that 2) your source generator uses those annotations as Javadoc elements.
The very useful Oxygen XML developer also supports generating documentation, see
http://www.oxygenxml.com/xml_schema_documentation.html
(commercial, but there's a fully functional 30 day trial available)
I'll try it out now, need a simple way to generate a document with all types and available xsd:documentation description as a simple interface description...
** Disclosure : I work for Innovasys, the producer of the documentation tool mentioned below *
You could take a look at Innovasys Document! X. As well as automatically generating a structured and linked page for every element, simple type, complex type, group and attribute group it will also generate linked XSD diagrams (including sequences/choice etc.) and structure tables that include the annotations from your XSDs and make sense of the relationships between the elements in your schemas. The output is template based so you can adapt it to your preferred style and structure. It will build output to web ready html or compiled help files.
Uniquely it also includes a WYSIWYG editor that allows you author additional content to supplement the stuff that's automatically generated and the annotations from the XSD source - so you can provide additional contextual information for your peer review. There is also a Community Extensions feature that allows people viewing the generated output to record comments and feedback and that can be viewed and actioned directly from within Document! X.

Resources