Search and replace an element using XSLT 3.0 when the replacement phrase is always the same

I have an XML file like this as input:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
<book id="bk103">
<author>Corets, Eva</author>
<title>Maeve Ascendant</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.</description>
</book>
</catalog>
and I am trying to find the best way to keep the following information in a file, or in the XSL itself:
value to search for:
An in-depth look at creating applications with XML.
add location:
on the self
value to search for:
A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.
add location:
on the self
So if I made a comma-separated input file, it would look like:
"An in-depth look at creating applications with XML.","on the self"
"A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.","on the self"
I have tried with XSLT 2, but I keep getting errors like "a sequence of more than one item is not allowed as the value of variable $search_phrase"...
Desired output:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>to be checked</description>
<location>on the self</location>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>to be checked</description>
<location>on the self</location>
</book>
<book id="bk103">
<author>Corets, Eva</author>
<title>Maeve Ascendant</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.</description>
</book>
</catalog>
Could someone give me an example with XSLT 3.0 where I could replace the above phrases and also add the needed elements wherever there is a match?
What I need to do:
In the full XML file, there are many records that can have the same description. I also need an exact match on the description. The phrase
"An in-depth look at creating applications with XML, authored by ..."
should not be matched. In my case, I also have a description where the only difference is the case; for instance, "an in-depth look at creating applications with XML." should not be matched either. Since my code uses lowercase, this may also be the problem, but I am not sure... Whenever there is a match, the location specified along with the search term must be added in a location element, which currently does not exist in any record in the XML.

Here is a suggestion on how to compare the description elements to a sequence of strings passed in as a parameter (but you could as well read it in from a file):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
expand-text="yes"
version="3.0">
<xsl:param name="new" as="xs:string" select='"on the self"'/>
<xsl:param name="replace" as="xs:string" select="'to be checked'"/>
<xsl:param name="search" as="xs:string*"
select='"An in-depth look at creating applications with XML.",
"A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world."'/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="description[. = $search]">
<xsl:copy>{$replace}</xsl:copy>
<location>{$new}</location>
</xsl:template>
</xsl:stylesheet>
Works fine at http://xsltfiddle.liberty-development.net/eiQZDbk, but only after editing the sample to have all description data on one line.
If that is not the case then changing the template to
<xsl:template match="description[normalize-space() = $search]">
<xsl:copy>{$replace}</xsl:copy>
<location>{$new}</location>
</xsl:template>
should help: http://xsltfiddle.liberty-development.net/eiQZDbk/1
If you have several terms to relate to each other, then some XML format seems to be more appropriate to structure the data, as in
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
expand-text="yes"
version="3.0">
<xsl:param name="data-url" as="xs:string" select="'data.xml'"/>
<!-- if you want to load from a file use xsl:param name="replacement-doc" select="doc($data-url)" -->
<xsl:param name="replacement-doc">
<root>
<search>
<term>An in-depth look at creating applications with XML.</term>
<replacement>to be checked</replacement>
<new>on the self</new>
</search>
<search>
<term>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</term>
<replacement>whatelse</replacement>
<new>something</new>
</search>
</root>
</xsl:param>
<xsl:key name="search" match="search" use="term"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="description[key('search', normalize-space(), $replacement-doc)]">
<xsl:variable name="search" select="key('search', normalize-space(), $replacement-doc)"/>
<xsl:copy>{$search/replacement}</xsl:copy>
<location>{$search/new}</location>
</xsl:template>
</xsl:stylesheet>
I have suggested one way to structure that data and adapted the template accordingly. An online sample is at http://xsltfiddle.liberty-development.net/eiQZDbk/2. As indicated in a comment there, you can adapt that approach to load the data from a separate file instead of keeping it inline in the XSLT.
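Since the question proposed a comma-separated file, here is a minimal sketch of reading such a file directly with XSLT 3.0. The file name terms.csv, the parameter csv-url and the variables line-regex and locations are names made up for this illustration. It also assumes every line consists of exactly two double-quoted fields, with no escaped quotes and no `","` sequence inside a field; a real CSV file may need a more careful parser.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:map="http://www.w3.org/2005/xpath-functions/map"
  exclude-result-prefixes="#all"
  expand-text="yes"
  version="3.0">
  <!-- hypothetical file name; each line is "search term","location" -->
  <xsl:param name="csv-url" as="xs:string" select="'terms.csv'"/>
  <xsl:param name="replace" as="xs:string" select="'to be checked'"/>
  <!-- assumed line layout: two double-quoted fields separated by a comma -->
  <xsl:variable name="line-regex" as="xs:string"
    select="'^&quot;(.*)&quot;,&quot;(.*)&quot;$'"/>
  <!-- map from search term to location; lines not matching the layout are skipped -->
  <xsl:variable name="locations" as="map(xs:string, xs:string)"
    select="map:merge(
              for $line in unparsed-text-lines($csv-url)[matches(., $line-regex)]
              return map:entry(replace($line, $line-regex, '$1'),
                               replace($line, $line-regex, '$2')))"/>
  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- exact match on the whitespace-normalized description -->
  <xsl:template match="description[map:contains($locations, normalize-space())]">
    <xsl:copy>{$replace}</xsl:copy>
    <location>{$locations(normalize-space())}</location>
  </xsl:template>
</xsl:stylesheet>
```

Map keys that are strings are compared codepoint by codepoint, so the lookup is case-sensitive: a description differing only in case, or one that merely starts with a search term, is left untouched.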

Related

How to map repeating xml elements in Excel

I'm trying to map repeating XML elements. When I import the XML below as an XML map in Excel, I see only 1 of the 4 "<club_LIST>" records I need, so the output XML does not contain the 4 entries.
Any idea how this can be solved in Excel?
From Microsoft support:
Additionally, the contents of an XML mapping cannot be exported if the contents contain one of the following XML schema constructs:
List of lists: one list of items contains a second list of items.
Can you suggest any way around this to produce the XML?
Below is a sample of my data:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<PartnersProfile xmlns:xsi="http://www.w3.org/2001/xmlschema-instance">
<ID>10</ID>
<NAME>Table 10</NAME>
<Record>
<PARTNER_ID>1</PARTNER_ID>
<DESCRIPTION>customer</DESCRIPTION>
<subscription>2004</subscription>
<club_LIST>1</club_LIST>
<club_LIST>4</club_LIST>
<club_LIST>6</club_LIST>
<club_LIST>9</club_LIST>
</Record>
<Record>
<PARTNER_ID>1</PARTNER_ID>
<DESCRIPTION>customer</DESCRIPTION>
<subscription>2004</subscription>
<club_LIST>1</club_LIST>
<club_LIST>4</club_LIST>
<club_LIST>6</club_LIST>
<club_LIST>9</club_LIST>
</Record>
</PartnersProfile>
Consider transforming the XML to an itemized version with one Record per club_LIST, repeating the values of the ancestor elements. You can run the XSLT 1.0 below with many tools and programming languages, including Perl (from your profile) or Excel VBA.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/PartnersProfile">
<xsl:copy>
<xsl:apply-templates select="descendant::club_LIST"/>
</xsl:copy>
</xsl:template>
<xsl:template match="club_LIST">
<Record>
<xsl:copy-of select="ancestor::PartnersProfile/*[name()!='Record']"/>
<xsl:copy-of select="ancestor::Record/*[name()!='club_LIST']"/>
<club_LIST><xsl:apply-templates select="node()"/></club_LIST>
</Record>
</xsl:template>
</xsl:stylesheet>
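For orientation, applying the stylesheet above to the sample data should give a result shaped roughly like the following sketch; only the first two of the eight generated Record elements are shown, and the exact indentation depends on the processor.

```xml
<PartnersProfile xmlns:xsi="http://www.w3.org/2001/xmlschema-instance">
  <Record>
    <ID>10</ID>
    <NAME>Table 10</NAME>
    <PARTNER_ID>1</PARTNER_ID>
    <DESCRIPTION>customer</DESCRIPTION>
    <subscription>2004</subscription>
    <club_LIST>1</club_LIST>
  </Record>
  <Record>
    <ID>10</ID>
    <NAME>Table 10</NAME>
    <PARTNER_ID>1</PARTNER_ID>
    <DESCRIPTION>customer</DESCRIPTION>
    <subscription>2004</subscription>
    <club_LIST>4</club_LIST>
  </Record>
  <!-- six more Record elements follow, one per remaining club_LIST -->
</PartnersProfile>
```

Each Record now holds a single club_LIST, so the structure no longer contains a list inside a list.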

hello world xml -> json template

I'm trying to transform an XML input to a JSON output. I'm pretty proficient in XSLT 1.0, but not so much in XSLT 2.0/3.0.
I thought I'd start with a hello world style template and build from there.
My belief is that you can simply create the output as a map/array data structure and then some magic will map that into the desired output, so this is my first attempt (I've not defined an input because any old XML will do in this example; it is ignored):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="3.0">
<xsl:output method="json" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="foo">
<map xmlns="http://www.w3.org/2005/xpath-functions">
<string key='desc'>Distances between several cities, in kilometers.</string>
<string key='updated'>2014-02-04T18:50:45</string>
<boolean key="uptodate">true</boolean>
<null key="author"/>
<map key='cities'>
<array key="Brussels">
<map>
<string key="to">London</string>
<number key="distance">322</number>
</map>
<map>
<string key="to">Paris</string>
<number key="distance">265</number>
</map>
<map>
<string key="to">Amsterdam</string>
<number key="distance">173</number>
</map>
</array>
</map>
</map>
</xsl:variable>
<xsl:value-of select="xml-to-json($foo)"/>
</xsl:template>
</xsl:stylesheet>
This almost works, but I get a string output (the '"' characters exist in the output file, including all the escaping, so it is not valid JSON):
"{\"desc\":\"Distances between several cities, in kilometers.\",\"updated\":\"2014-02-04T18:50:45\",\"uptodate\":true,\"author\":null,\"cities\":{\"Brussels\":[{\"to\":\"London\",\"distance\":322},{\"to\":\"Paris\",\"distance\":265},{\"to\":\"Amsterdam\",\"distance\":173}]}}"
If there are any basic guides on how to do this, please let me know. The web is awash with odd examples, out-of-date instructions based on XSLT 1.0/2.0, or hard-to-understand PDFs discussing more in-depth scenarios.
The function you use already gives you a string with the JSON (see https://www.w3.org/TR/xpath-functions-31/#func-xml-to-json) so if you want to write that to a file just use <xsl:output method="text"/>.
The json output method mainly makes sense if you construct XDM/XPath 3.1 maps/arrays and want to serialize them as JSON.
For your sample I would also use <xsl:template name="xsl:initial-template"> instead of <xsl:template match="/">, then you don't need to provide any dummy input XML at all but can just start with that default named template using e.g. -it from the command line or callTemplate(null, ..) from the API.
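To illustrate the second point, here is a minimal sketch of a stylesheet that builds an XPath 3.1 map directly and lets the json output method serialize it. The data is a shortened version of the sample from the question, and no input document is needed because the transformation starts at the initial template.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <!-- the json method serializes XDM maps and arrays, not strings holding JSON -->
  <xsl:output method="json" indent="yes"/>
  <xsl:template name="xsl:initial-template">
    <xsl:sequence select="map {
        'desc': 'Distances between several cities, in kilometers.',
        'uptodate': true(),
        'cities': map {
          'Brussels': array {
            map { 'to': 'London', 'distance': 322 },
            map { 'to': 'Paris', 'distance': 265 }
          }
        }
      }"/>
  </xsl:template>
</xsl:stylesheet>
```

The order of the keys in the serialized JSON is not guaranteed, since maps are unordered. If you prefer to keep the XML representation and xml-to-json() from the question, the only change needed is to switch the output declaration to method="text".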

Azure Logic App ignores indent="yes" when transforming XML

I'm using Azure Logic App to transform a CSV file to XML. Everything was initially set up in BizTalk to generate the relevant XSDs and XSL, which worked perfectly fine. But when I use Azure Logic App, the output XML file is all on one line, even though I made sure the XSL file has indent="yes".
I know I can use notepad++ to pretty print the result and save the file, but surely there's a way to automatically do that in Logic App?
For those interested, I've found a setting within the Logic App, simply select Apply XSLT output attributes and that's it, no validation needed either!
I manage to get indentation when using XSLT 3.0, e.g. with the following stylesheet/map:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
expand-text="yes">
<xsl:output method="xml" indent="yes"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="/" name="xsl:initial-template">
<xsl:next-match/>
<xsl:comment xmlns:saxon="http://saxon.sf.net/">Run with {system-property('xsl:product-name')} {system-property('xsl:product-version')} {system-property('Q{http://saxon.sf.net/}platform')}</xsl:comment>
</xsl:template>
</xsl:stylesheet>
then a request of e.g.
<root><item>a</item><item>b</item></root>
is transformed to the output
<?xml version="1.0" encoding="UTF-8"?>
<root>
<item>a</item>
<item>b</item>
</root>
<!--Run with SAXON HE 9.8.0.8 -->
I don't know how they run the XSLT 1.0 processor so that it ignores the xsl:output settings; it seems to be a flaw or quirk in the pipeline.

How to exclude XSD elements from documentation [closed]

I have an XSD that I'm editing, and, using a tool such as XMLSpy or oXygen, I'd like to generate user documentation for the XSD. However, I'd like to exclude certain elements from the documentation (based on user requirements). What would be the best approach to do this?
There's no tool-independent way to exclude elements from documentation.
There's xsd:annotation/xsd:appinfo for application-oriented data.
There's xsd:annotation/xsd:documentation for human-oriented data.
XSDs are XML, so writing an XSLT transformation is certainly one way to generate documentation. Note, however, that the semantics of XSD are complex enough that the exclusion of any given element will be trivial relative to the rest of the task.
Removing elements from documentation is a three-part process:
1. Add an attribute to the elements to indicate whether or not they should be documented.
Following is a code sample showing an XSD with three elements, and a new attribute, generateDocumentation, that indicates whether the element should be documented.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="documentation.xslt"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:mc="http://www.mycompany.com"
xsi:schemaLocation="http://www.mycompany.com ./doc.xsd"
elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="DocumentedElement" mc:generateDocumentation="true"/>
<xs:element name="UndocumentedElement" mc:generateDocumentation="false"/>
<xs:element name="DefaultElement"/>
</xs:schema>
Specifics on how to extend XSD with custom attributes can be found here.
Note that, in this example, elements without the generateDocumentation attribute defined will be documented by default.
2. Apply a transformation to remove elements with attribute values indicating that they should not be documented.
The following XSLT will remove elements that have mc:generateDocumentation="false" or mc:generateDocumentation="0", and will strip the resulting white space:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:mc="http://www.mycompany.com">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<!-- Identity transform -->
<xsl:template match="/ | @* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<!-- Undocumented elements -->
<xsl:template match="*[@mc:generateDocumentation='false'] | *[@mc:generateDocumentation='0']"/>
<!-- Strip white space -->
<xsl:template match="*/text()[normalize-space()]">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
<xsl:template match="*/text()[not(normalize-space())]"/>
</xsl:stylesheet>
This transformation produces the XSD with the specified elements removed:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="documentation.xslt"?>
<xs:schema xmlns:mc="http://www.mycompany.com"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.mycompany.com ./doc.xsd"
elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="DocumentedElement" mc:generateDocumentation="true"/>
<xs:element name="DefaultElement"/>
</xs:schema>
3. Use the tool of your choice to generate the user documentation.

Use XML schema to specify default namespace in XML instance

I'm not sure if it matters, but I'm using BizTalk 2009 to generate the XML.
Is there a way to specify in my XML schema that the generated XML instance should use the target namespace as the default namespace?
If I have an xsd file like this:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://example.com/">
<xs:element name="example">
<xs:complexType>
<xs:attribute name="value" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:schema>
It creates an XML file like this:
<ns0:example value="something" xmlns:ns0="http://example.com/" />
But I want it to create an XML file like this:
<example value="something" xmlns="http://example.com/" />
I know that they're technically equivalent, but the consumers (vendor APIs) are poorly implemented and I'd like to give them what they expect.
I would expect that it depends on the software generating the instance, not on the schema. XML Schema (XSD) was developed for validating XML instances against a schema, not for generating instances from it, so it is unlikely that such a setting exists explicitly in XML Schema. The generating tools might, however, use the fact that elements were (un)qualified
elementFormDefault="(un)qualified"
to trigger the prefixing.
Not completely in scope, but the following is worth reading for schema design: http://www.xfront.com/HideVersusExpose.html
One way would be to define a schema without the namespace. Map the BizTalk schema to the newly defined schema without namespace. From a BizTalk viewpoint, you would have a schema which represents the actual contract with the consumers. (i.e. without namespaces) Also, BizTalk uses namespace#rootnodename to define messageTypes. In this example, you would have two schemas
somenamespace#somerootnodename
#somerootnodename
A possible drawback of this approach is that it would limit the usage of this schema (#rootnodename) to one instance within the BizTalk group.
This is the default behaviour of BizTalk when working with XML schemas and, as far as I know, there is no built-in way to change it.
What you really want, however, is that outbound messages conform to a cleaner and more liberal format than what is used by BizTalk. You can do this by using a custom pipeline component (and a custom send pipeline) to process the outgoing message before it leaves BizTalk.
The idea is to change the namespace prefix as part of sending the message outside BizTalk. The transformation happens during the processing of the send pipeline.
Nic Barden has blogged and provided some source code about this here. You can use his sample as the basis for performing replacement of namespace prefixes, rather than replacing namespaces themselves.
I strongly encourage you to check out the whole series of posts he's done about Developing Streaming Pipeline Components. Nic has done an extensive and thorough job of describing all that's needed to author robust and enterprise-class pipeline components.
Part 1
Part 2
Part 3
Part 4
Part 5
The ns0 prefix is added whenever a BizTalk btm maps a message. It shouldn't matter as this is still valid xml, however this could be a problem when sending messages to partners with legacy or incomplete xml parsers.
You can remove the ns0 prefix and instead make ns0 your default namespace on the output message by changing your btm from a visual map to an .xslt map.
e.g. Once you have converted your map to xslt, change the xslt from:
<?xml version="1.0" encoding="utf-16"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="msxsl s0"
version="1.0"
xmlns:ns0="http://targetns"
xmlns:s0="http://sourcens"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output omit-xml-declaration="yes" method="xml" version="1.0" />
<xsl:template match="/">
<xsl:apply-templates select="s0:FromRoot" />
</xsl:template>
<xsl:template match="s0:FromRoot">
<ns0:ToRoot>
<xsl:for-each select="s0:FromElement">
<ns0:ToElement>
<xsl:value-of select="text()"/>
</ns0:ToElement>
</xsl:for-each>
</ns0:ToRoot>
</xsl:template>
</xsl:stylesheet>
To:
<?xml version="1.0" encoding="utf-16"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:var="http://schemas.microsoft.com/BizTalk/2003/var"
exclude-result-prefixes="msxsl s0"
version="1.0"
xmlns="http://targetns"
xmlns:s0="http://sourcens"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output omit-xml-declaration="yes" method="xml" version="1.0" />
<xsl:template match="/">
<xsl:apply-templates select="s0:FromRoot" />
</xsl:template>
<xsl:template match="s0:FromRoot">
<ToRoot>
<xsl:for-each select="s0:FromElement">
<ToElement>
<xsl:value-of select="text()"/>
</ToElement>
</xsl:for-each>
</ToRoot>
</xsl:template>
</xsl:stylesheet>
i.e. change the default xmlns and then remove the ns0 prefixes automatically.
A more generic solution is also possible (e.g. similar to Firras' answer here), which could be useful e.g. as a send port map to strip all prefixes from elements. However, one needs to be wary if there is more than one namespace (xmlns) on the output message!
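As a rough illustration of such a generic map (a sketch of the usual technique, not the code from the linked answer), the following XSLT 1.0 stylesheet recreates every element under its local name in the same namespace, so the serializer normally emits a default namespace declaration instead of a prefix:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <!-- rebuild each element without its prefix but in its original namespace -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}" namespace="{namespace-uri()}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>
  <!-- copy attributes and all other nodes unchanged -->
  <xsl:template match="@* | text() | comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>
</xsl:stylesheet>
```

With a single target namespace, `<ns0:ToRoot xmlns:ns0="http://targetns">` should come out as `<ToRoot xmlns="http://targetns">`. If the message mixes several namespaces, each change of namespace produces a new default xmlns declaration on the element where it starts, which is the situation the warning above refers to. Namespaced attributes such as xsi:type keep their prefix, because an unprefixed attribute is in no namespace.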
