How to get an ordered list parsed by an XML parser? - xsd

I am using an XSD schema file in which I specified an ordered list.
When parsing an XML node of the kind...
<myOrderedList> "element_1" "element_2" "element_3" </myOrderedList>
(which is valid XML syntax)
...all XML parsers I know parse this as a single element node.
Is there a way to get the XML parser to parse this list for me (return it as a list, an array, or whatever), or do I always have to parse it myself?

Why not make use of XML's ability to structure your data, and put each item in its own XML element? E.g.
<myOrderedList>
  <element>1</element>
  <element>2</element>
  <element>3</element>
</myOrderedList>
Otherwise you have to implement your own parsing (albeit a simple one) on top of the parsing the XML parser already does for you.
If you do the above, the parser will return you the ordered list without any further work, and/or you can process it more easily using standard XML tooling like XSLT/XQuery etc.
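As a rough sketch of what that looks like in practice (Python's standard xml.etree.ElementTree is my choice here, not something from the question), the child elements come back in document order with no extra parsing:

```python
import xml.etree.ElementTree as ET

doc = """
<myOrderedList>
  <element>1</element>
  <element>2</element>
  <element>3</element>
</myOrderedList>
"""

root = ET.fromstring(doc.strip())

# findall() returns the matching direct children in document order,
# so the list order is exactly the order in the XML.
values = [child.text for child in root.findall("element")]
print(values)  # ['1', '2', '3']
```

The values are still strings; converting them to numbers or other types is left to the caller (or to a schema-aware tool).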

Values in a list type are always separated by whitespace, so you can easily pass the element's text through a tokenizer to get the list of values.
Some technologies, such as schema-aware XSLT 2.0, will see such an element's content as a list of values.
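A minimal sketch of the tokenizer approach, assuming Python and an element whose items are written without the quotes shown in the question (with quotes, each token would simply keep them):

```python
import xml.etree.ElementTree as ET

node = ET.fromstring(
    "<myOrderedList> element_1 element_2 element_3 </myOrderedList>"
)

# str.split() with no arguments splits on any run of whitespace and
# ignores leading/trailing whitespace, which matches how list items
# are separated in the element's text.
items = (node.text or "").split()
print(items)  # ['element_1', 'element_2', 'element_3']
```

The `or ""` guard covers an empty element, where `node.text` is None and the result is an empty list.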
Using elements as proposed in the other answer is also a solution and may ease your processing. In XML child nodes are ordered so you should not worry about that. A possible representation would be:
<myOrderedList>
  <value>element_1</value>
  <value>element_2</value>
  <value>element_3</value>
</myOrderedList>

Related

MarkLogic Text Search: Return Results Based on Matches Inside Attributes

I am using MarkLogic's search:search() function to handle searching within my application, and I have a use case where users need to be able to perform a text search that returns matches from an attribute on my document.
For example, using this document:
<document attr="foo attribute value">Some child content</document>
I want users to be able to perform a text search (not using constraints) for "foo", and to get my document back based on the match within the attribute @attr. Is there some way to configure the query options to allow this?
Typing in attr:"foo" is not a workable solution, so using attribute range constraints won't help, and users still need to be able to search for other child content not in the attribute node. I'm thinking perhaps there is a way to add a cts:query OR'd into the search via the options, that allows this attribute to be searched?
Open to any and all other solutions.
Thanks!
Edit:
Some additional information, to help clarify:
I need to be able to find matches within the attribute, and elsewhere within the content. Using the example above, searches for "foo", "child content", or "foo child content" should all return my document as a result. This means that any query options that are AND'd with the search (like <additional-query>, which is intended to help constrain your search and not expand it) won't work. What I'm looking for is (likely) an additional query option that will be OR'd with the original search, so as to allow searching by child node content, attribute content, or a mix of the two.
In other words, I'd like MarkLogic to treat any attribute node content exactly the same as element text nodes, as far as searching is concerned.
Thanks!!
You could accomplish this search with a serialized element-attribute-word cts query in the additional-query options for the search API. The element attribute word query will use the universal index to match individual tokens within attributes.
In MarkLogic 9 you may be able to use the following to perform your search:
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";
search:search("",
  <options xmlns="http://marklogic.com/appservices/search">
    <additional-query>
      <cts:element-attribute-word-query xmlns:cts="http://marklogic.com/cts">
        <cts:element>document</cts:element>
        <cts:attribute>attr</cts:attribute>
        <cts:text>foo</cts:text>
      </cts:element-attribute-word-query>
    </additional-query>
  </options>
)
MarkLogic has ways to parse query text and map a value to an attribute word or value query.
First, you can use cts:parse():
http://docs.marklogic.com/guide/search-dev/cts_query#id_71878
http://docs.marklogic.com/cts.parse
Second, you can use search:search() with constraints defined in an XML payload:
http://docs.marklogic.com/guide/search-dev/query-options#id_39116
http://docs.marklogic.com/guide/search-dev/appendixa#id_36346
I'd look into using the <default> option of <term>. For details see http://docs.marklogic.com/guide/search-dev/appendixa#id_31590
Alternatively, consider doing query expansion. The idea is that an end user sends a search string. You parse it using search:parse or cts:parse (as suggested by Erik), and instead of submitting that query as-is to MarkLogic, you process the cts:query tree to look for terms you want to adjust or expand. This is typically used to automatically blend in synonyms, related terms, or translations, but it could also be used to copy individual terms and automatically add queries on attributes for them.
HTH!

Difference between XSD Simple element and XSD Complex element

I Googled this question, but I'm still unable to find a clear explanation of the difference between a simple XSD (XML Schema Definition) element and a complex XSD element.
Any guidance would be highly appreciated.
I have no idea why I'm answering this. But...
To summarize,
Simple types can only have content directly contained between the element’s opening and closing tags. They cannot have attributes or child elements.
Complex types can have attributes, can contain other elements, can contain a mixture of elements and text, and so on.
One is a single value and the other a compound value.

XML Schema for choice between element and #PCDATA

I have a preexisting XML document type that has an element that can have two content types: some elements, or just text. Modeling this as mixed content is overkill, and JAXB's XJC creates a very ugly binding as a result.
<bars><bar .../><bar .../></bars>
versus
<bars>Just a bunch of #PCDATA</bars>
xs:choice seems structured only for complex types (not simple types like xs:string). Is there a way to express this choice, between elements or text, using XML schema? In DTD this would be something like
<!ELEMENT bars (#PCDATA | bar*)>
The language you want to define (either a sequence of characters or a sequence of bar elements, but not a mixture) cannot be defined in XSD 1.0 (or in XML DTDs, either; your DTD notation would make sense but is not legal in XML DTDs).
In XSD 1.1, you can use an assertion to ensure that if any bar elements are present as children, no text nodes occur (or only text nodes that contain only whitespace).
A simple way to achieve roughly the same effect is to say that the bars element contains either a sequence of bar elements or a single stringvalue element (call it whatever you like), where the stringvalue element contains -- as its name suggests -- just a string of characters.

XSD for Element Or OtherElement Or Text

I need an XSD for the following construct:
Expression := <FunctionCall> | <OperatorConstruct> | <Variable> | Constant_Text
In other words, the Expression type consists of a choice between three other types and plain text.
I know there is an xs:choice element, but I can't figure out how to write the 'Or Text' part. Simply using mixed="true" on the Expression element allows entering text AND other elements, but I would like to limit it to only one of those four.
So the question is, what xsd can I define that allows one of three elements or a text?
If you want your structure validated by an XML Schema, you'll have to make all four choices into elements. MathML expressions work that way, with elements for every term.
Or you could go with mixed and validate the structure outside of XSD (with XSLT or Schematron or your own parsing code).
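If you go the "own parsing code" route, the check is short. Here is a sketch in Python (my choice of language; the function name is_valid_expression is made up for illustration), assuming the schema declares Expression as mixed and this runs as a second pass after schema validation:

```python
import xml.etree.ElementTree as ET

# The three element alternatives from the question's grammar.
ALLOWED = {"FunctionCall", "OperatorConstruct", "Variable"}

def is_valid_expression(expr):
    """True if expr holds exactly one allowed child element OR only text."""
    children = list(expr)
    text = (expr.text or "").strip()
    if not children:
        # Constant_Text case: no child elements, non-empty text.
        return bool(text)
    if len(children) != 1 or children[0].tag not in ALLOWED:
        return False
    # Element case: no non-whitespace text before or after the child.
    tail = (children[0].tail or "").strip()
    return not text and not tail

print(is_valid_expression(ET.fromstring("<Expression>42</Expression>")))
print(is_valid_expression(ET.fromstring("<Expression>x<Variable/></Expression>")))
```

Whitespace around a single child element is tolerated so that pretty-printed documents still pass; tighten that if your format requires it.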

Given an XSD is it possible to list a hierarchy of elements and their attributes?

Let's pretend I have an XSD document and I want to produce a list of all elements along with their attributes and child elements. Put another way: if you were implementing code completion based on an XSD document and wanted to list an element's children and attributes, how would you approach the problem?
Since an XSD is a valid XML document, it is just a matter of choosing an XML parsing library. For example, LINQ to XML (XLinq, .NET Framework 3.5+) will do the job.
You can just walk through complexType, sequence and other elements to find out a list of possible values.
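The same walk can be sketched with any XML library. Below is a rough version using Python's standard xml.etree.ElementTree instead of XLinq; the sample schema and the describe helper are invented for illustration. It only handles top-level elements with an inline complexType and a single sequence, choice, or all group, so named types, ref=, groups, and extensions would need extra handling:

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
COMPOSITORS = (XS + "sequence", XS + "choice", XS + "all")

# Hypothetical sample schema, used only to exercise the walk.
xsd = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string"/>
        <xs:element name="note" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:int"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def describe(element):
    """Return (name, attribute names, child element names) for one xs:element."""
    ctype = element.find(XS + "complexType")
    if ctype is None:
        # Simple-typed element: no attributes, no children.
        return element.get("name"), [], []
    attrs = [a.get("name") for a in ctype.findall(XS + "attribute")]
    children = [
        e.get("name")
        for group in ctype
        if group.tag in COMPOSITORS
        for e in group.findall(XS + "element")
    ]
    return element.get("name"), attrs, children

schema = ET.fromstring(xsd.strip())
summary = [describe(e) for e in schema.findall(XS + "element")]
print(summary)  # [('order', ['id'], ['item', 'note'])]
```

For code completion you would apply describe recursively to the child declarations and resolve type references along the way.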
