Cross-Links Creation using xslt - xslt-3.0

I would like to give cross-links in the below XML using XSLT 3.0.
My Input XML is:
<?xml version="1.0"?>
<book id="bk1">
<p>The heterogeneity of patients, various clinical manifestations and the dynamics of CS development cause problems with identifying its unified definition. However, CS can be usually diagnosed on the basis of clinical criteria which are easy to assess without the need for advanced hemodynamic monitoring Thiele et al., 2015. Increasing knowledge about (Perkins-Porras et al., 2009) patient characteristics and better understanding of the CS pathophysiology encourages researchers and clinicians to revise the classic definition. (Thiele et al., 2015; Werdan et al., 2012; Nadziakiewicz et al., 2007; Sobanski et al., 2010; Goldberg et al., 2009; Harjola et al., 2015; Holmes et al., 1995).</p>
</book>
Expected Output XML is:
<?xml version="1.0"?>
<book id="bk1">
<p>The heterogeneity of patients, various clinical manifestations and the dynamics of CS development cause problems with identifying its unified definition. However, CS can be usually diagnosed on the basis of clinical criteria which are easy to assess without the need for advanced hemodynamic monitoring Thiele et al., 2015. Increasing knowledge about (Perkins-Porras et al., 2009) patient Perkins-Porras, 2019 characteristics and better understanding of the CS pathophysiology encourages researchers and clinicians to revise the classic definition. (Thiele et al., 2015; Werdan et al., 2012; Nadziakiewicz et al., 2007; Sobanski et al., 2010; Goldberg et al., 2009; Harjola et al., 2015).</p>
</book>
Citations both with and without "et al." after the author name need to be detected and tagged. How can I achieve this?

Matching on "names" is always difficult so the following might work for some examples but not for others:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
exclude-result-prefixes="#all"
version="3.0">
<xsl:param name="author-pattern" as="xs:string">(\p{Lu}[\p{L}-]+)( et al\.)?, ([0-9]{4})</xsl:param>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="p//text()">
<xsl:apply-templates select="analyze-string(., $author-pattern)" mode="wrap-authors"/>
</xsl:template>
<xsl:template match="fn:match" mode="wrap-authors">
<a href="#bib{fn:group[@nr = 1]}{fn:group[@nr = 3]}">
<xsl:apply-templates mode="#current"/>
</a>
</xsl:template>
</xsl:stylesheet>
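The same wrapping can also be sketched with the xsl:analyze-string instruction instead of the analyze-string() function. This is only a minimal variant under the same assumptions as above: the "bib" prefix in the href is taken from the stylesheet above and says nothing about how the reference list is actually keyed, and the pattern will still miss or over-match some names.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="#all"
                version="3.0">

  <!-- Surname, optional " et al.", comma, four-digit year -->
  <xsl:param name="author-pattern" as="xs:string">(\p{Lu}[\p{L}-]+)( et al\.)?, ([0-9]{4})</xsl:param>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="p//text()">
    <!-- regex is an attribute value template, so the parameter goes in braces -->
    <xsl:analyze-string select="." regex="{$author-pattern}">
      <xsl:matching-substring>
        <!-- regex-group(1) is the surname, regex-group(3) the year -->
        <a href="#bib{regex-group(1)}{regex-group(3)}">
          <xsl:value-of select="."/>
        </a>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With this pattern, "Thiele et al., 2015" would be wrapped in a link to #bibThiele2015, with or without the "et al." part.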


how to get 'excel' new lines in spreadsheetML (MSXSLT)

I'm using the MS XSLT 1.0 engine.
I want to generate raw XML output like this:
<Cell ss:StyleID="s27"><Data ss:Type="String">Catchup (Yes),&#10;FVOD(No),&#10;SVOD (No)</Data></Cell>
Note the &#10; embedded in the output.
How do I get this in XSLT?
If I do this:
<Cell ss:StyleID="s27">
<Data ss:Type="String">
<xsl:text>Catchup (Yes), </xsl:text>
<xsl:text disable-output-escaping="yes"><![CDATA[&#10;]]></xsl:text>
<xsl:text>SVOD (No)</xsl:text>
</Data>
</Cell>
I get this:
<ss:Cell ss:StyleID="s27">
<ss:Data ss:Type="String">Catchup (Yes), &amp;#10;SVOD (No)</ss:Data>
</ss:Cell>
which is wrong! (Well, not what I want.)
If I try
<xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
I get the same output.
If I try the obvious
<xsl:text>Catchup (Yes),&#10;SVOD (No)</xsl:text>
I get
<ss:Data ss:Type="String">Catchup (Yes),
SVOD (No)</ss:Data>
i.e. it's a literal newline.
For others looking at this question: I'm not 100% sure what the question is, let alone the answer, so I'll try to clarify.
It seems that michael.hor257k's answer does work in some contexts
(so in fact what I'm trying does work in some contexts):
<xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
This works if I hard-code the output and run it in the XSLT engine used by VS2022.
It doesn't work in my pretty 'vanilla' C# XSLT implementation using XslTransform and XslCompiledTransform.
I'm also not clear whether it works with the VS2022 setup in my production code (which doesn't just hard-code some output, but does apply-templates and node-set gymnastics).
If you are really willing to resort to using disable-output-escaping to force the required result, consider the following example:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<Cell>
<xsl:text>Alpha</xsl:text>
<xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
<xsl:text>Bravo</xsl:text>
<xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
<xsl:text>Charlie</xsl:text>
</Cell>
</xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0"?>
<Cell>Alpha&#10;Bravo&#10;Charlie</Cell>
However, this depends on the processor's support for disable-output-escaping. Judging from the examples in your question, the processor that you use does not. You say it's a Microsoft engine - but I do get the wanted result with both MS processors here: https://xsltfiddle.liberty-development.net/6qLZFRw
Note also that disable-output-escaping is relevant only at the output stage. If you're not writing to the output, then it has no effect.
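Where an XSLT 2.0 or later processor is available (an assumption, since the question targets an XSLT 1.0 engine), a character map is another way to get a literal character reference into the serialized output without disable-output-escaping. A minimal sketch, in which the private-use character U+E000 is an arbitrary placeholder chosen for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- U+E000 is an arbitrary private-use placeholder (an assumption);
       the serializer writes the mapped string, unescaped, wherever it occurs -->
  <xsl:character-map name="line-breaks">
    <xsl:output-character character="&#xE000;" string="&amp;#10;"/>
  </xsl:character-map>

  <xsl:output method="xml" encoding="UTF-8" use-character-maps="line-breaks"/>

  <xsl:template match="/">
    <Cell>
      <xsl:text>Alpha&#xE000;Bravo&#xE000;Charlie</xsl:text>
    </Cell>
  </xsl:template>

</xsl:stylesheet>
```

Like disable-output-escaping, a character map is a serialization feature, so it only takes effect when the processor itself writes the output; it has no effect on a result tree that is passed on as nodes.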
This is not an answer, but a clarification. I believe this question has nothing to do with XSLT, so I added the excel tag.
As michael.hor257k said, the two serializations (&#10; and a line-feed) should be equivalent. But when I open the following XML file with Excel
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<Styles>
<Style ss:ID="wrap">
<Alignment ss:Horizontal="Left" ss:Vertical="Top"/>
</Style>
</Styles>
<Worksheet ss:Name="Summary">
<Table>
<Row>
<Cell ss:StyleID="wrap">
<Data ss:Type="String">Catchup (Yes),
FVOD (No),
SVOD (No)</Data>
</Cell>
<Cell ss:StyleID="wrap"><Data ss:Type="String">Catchup (Yes),
FVOD (No),
SVOD (No)</Data>
</Cell>
</Row>
</Table>
</Worksheet>
</Workbook>
the two cells behave differently, although only in the Formula Bar.
There was a follow-up question with a marked answer:
how to get 'excel' new lines in spreadsheetML and the behaviour of nodeset() on disable-output-escaping (Saxon xslt 1.0)
Best read that; the summary is:
The issue seems to have been well known when XSLT was specified.
The behaviour of disable-output-escaping="yes" changes between XSLT specs.
The behaviour may well change between XSLT engines implementing the same spec (it certainly does for me between different MSXSLT engines/configs).
"The issue is solvable in the XSLT language; however, you must apply the disable-output-escaping at the output stage and not before, and you must use a processor that supports disable-output-escaping" (michael.hor257k comment). For me, the constraint on the 'output stage' makes this correct solution unsuitable for my specific usage, but it is correct.
With an Excel hat on, I've changed the implementation to export multiple rows rather than trying to put multiple lines in a single cell.

how to convert data to CoNLL09?

I have a biology data set, but each example in it only carries annotation for its predicate,
e.g.:
<example src="PERMUTATE" no="3">
<text>Both RAP1 and 2 are important vaccine candidates because it has been shown that Alanine can block the action of a postulated repressor (Schofield et al., 1986; Harnyuttanakorn et al., 1992; Howard et al., 1998a).</text>
<arg n="0">Alanine</arg>
<arg n="1">the action of a postulated repressor</arg>
</example>
with
<roles>
<role n="0" descr="causer agent
" />
<role n="1" descr="theme (process or entity being stopped)
" />
</roles>
As far as I know, the CoNLL-2009 training data set annotates many roles. The semantic role labeling models available online only support the CoNLL format and..., which carries more information per training sentence. How can I convert my data to it?
Thank you so much.

search and replace an element, using xslt 3, the replacement phrase is the same

I have an XML file like this as input:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
<book id="bk103">
<author>Corets, Eva</author>
<title>Maeve Ascendant</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.</description>
</book>
</catalog>
and I am trying to find the best way to keep the following information in a file, or in the XSL itself:
value to search for:
An in-depth look at creating applications with XML.
add location:
on the self
value to search for:
A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.
add location:
on the self
So if I made a comma-separated input file, it would look like:
"An in-depth look at creating applications with XML.","on the self"
"A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.","on the self"
I have tried with XSLT 2, but I keep getting errors like "a sequence of more than one item is not allowed as the value of variable $search_phrase"...
Desired output:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>to be checked</description>
<location>on the self</location>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>to be checked</description>
<location>on the self</location>
</book>
<book id="bk103">
<author>Corets, Eva</author>
<title>Maeve Ascendant</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.</description>
</book>
</catalog>
Could someone give me an example in XSLT 3.0 that replaces the above phrases and also adds the needed elements wherever there is a match?
What I need to do:
In the full XML file, many records can have the same description. I also need an exact match on the description. The phrase
"An in-depth look at creating applications with XML, authored by ..."
should not be matched. In my case there is also a description that differs only in case; for instance, "an in-depth look at creating applications with XML." should not be matched either. Since my code lower-cases the text, this may also be the problem, but I'm not sure... Whenever there is a match, the location specified along with the search term must be added in a location element, which currently does not exist in any record in the XML.
Here is a suggestion on how to compare the description elements to a sequence of strings passed in as a parameter (but you could just as well read them in from a file):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
expand-text="yes"
version="3.0">
<xsl:param name="new" as="xs:string" select='"on the self"'/>
<xsl:param name="replace" as="xs:string" select="'to be checked'"/>
<xsl:param name="search" as="xs:string*"
select='"An in-depth look at creating applications with XML.",
"A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world."'/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="description[. = $search]">
<xsl:copy>{$replace}</xsl:copy>
<location>{$new}</location>
</xsl:template>
</xsl:stylesheet>
Works fine at http://xsltfiddle.liberty-development.net/eiQZDbk, but only after editing the sample to have all description data on one line.
If that is not the case then changing the template to
<xsl:template match="description[normalize-space() = $search]">
<xsl:copy>{$replace}</xsl:copy>
<location>{$new}</location>
</xsl:template>
should help: http://xsltfiddle.liberty-development.net/eiQZDbk/1
If you have several terms to relate to each other, then an XML format seems more appropriate for structuring the data, as in the following stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
expand-text="yes"
version="3.0">
<xsl:param name="data-url" as="xs:string" select="'data.xml'"/>
<!-- if you want to load from a file use xsl:param name="replacement-doc" select="doc($data-url)" -->
<xsl:param name="replacement-doc">
<root>
<search>
<term>An in-depth look at creating applications with XML.</term>
<replacement>to be checked</replacement>
<new>on the self</new>
</search>
<search>
<term>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</term>
<replacement>whatelse</replacement>
<new>something</new>
</search>
</root>
</xsl:param>
<xsl:key name="search" match="search" use="term"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="description[key('search', normalize-space(), $replacement-doc)]">
<xsl:variable name="search" select="key('search', normalize-space(), $replacement-doc)"/>
<xsl:copy>{$search/replacement}</xsl:copy>
<location>{$search/new}</location>
</xsl:template>
</xsl:stylesheet>
The stylesheet above is one suggestion for doing that, with the template adapted accordingly. An online sample is at http://xsltfiddle.liberty-development.net/eiQZDbk/2. As indicated there in a comment, you can adapt that approach to load the data from a separate file instead of keeping it inline in the XSLT.
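If the terms stay in the comma-separated file from the question, they could also be read directly with unparsed-text-lines() and kept in a map. This is only a sketch under some assumptions: the file name terms.csv is hypothetical, every line has exactly the form "term","location", and neither field contains a double quote. Map lookup compares the strings exactly, so a description that differs only in case is not matched.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:map="http://www.w3.org/2005/xpath-functions/map"
                exclude-result-prefixes="#all"
                expand-text="yes"
                version="3.0">

  <!-- hypothetical file name, resolved relative to the stylesheet -->
  <xsl:param name="csv-url" as="xs:string" select="'terms.csv'"/>
  <xsl:param name="replace" as="xs:string" select="'to be checked'"/>

  <!-- map from search term to location, one entry per non-empty CSV line -->
  <xsl:variable name="locations" as="map(xs:string, xs:string)"
    select="map:merge(
              for $line in unparsed-text-lines($csv-url)[normalize-space()]
              return map:entry(
                replace($line, '^&quot;(.*)&quot;,&quot;(.*)&quot;$', '$1'),
                replace($line, '^&quot;(.*)&quot;,&quot;(.*)&quot;$', '$2')))"/>

  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- exact, case-sensitive match on the whitespace-normalized description -->
  <xsl:template match="description[map:contains($locations, normalize-space())]">
    <xsl:copy>{$replace}</xsl:copy>
    <location>{$locations(normalize-space())}</location>
  </xsl:template>

</xsl:stylesheet>
```

A real CSV parser would be needed as soon as fields can contain quotes or line breaks; for that case the XML lookup document above is probably the more robust choice.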

Azure Cognitive Service Text API TranslateArray Category Usage

Is the "Category" attribute in a request to TranslateArray a pre-defined list or open to specify during the request?
<TranslateArrayRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<AppId xsi:nil="true" />
<From>en</From>
<Options>
<Category xmlns="http://schemas.datacontract.org/2004/07/Microsoft.MT.Web.Service.V2">pets</Category>
<State>0</State>
</Options>
<Texts>
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">dog</string>
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">cat</string>
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">fish</string>
</Texts>
<To>fr</To>
</TranslateArrayRequest>
Yields the following response:
<html>
<body>
<h1>Argument Exception</h1>
<p>Method: TranslateArray()</p>
<p>Parameter: category</p>
<p>Message: Invalid category
Parameter name: category</p>
<code></code>
<p>message id=0243.V2_Rest.TranslateArray.148495FA</p>
</body>
</html>
Definition in API documentation:
Category: A string containing the category (domain) of the translation. Defaults to general
but it is unclear what other categories there are, if this is not a custom field.
This is a duplicate, but I'm reposting the response from one of the comments in case this question shows up in someone's search:
Agriculture
Animals
Arts & Entertainment
Automotive
Beauty
Business
Chemicals
Clothing
Custom
Education
Electronics
Energy, Water and Utilities
Financials
Fine Arts
Food
Geography, Anthropology
Government
Healthcare
History
Home & Garden
Internet
Language
Law
Literature
Medicine
Military Science
Music
Philosophy
Political Science
Reference
Religion
Science
Shopping
Social Sciences
Society & Culture
Sports
Technology
Telecommunications

Can EXSLT date-and-time functions be used in XSLT 1.0 and processed using browser engine?

My goal: I need to transform a "date of birth" element in XML document to "age" value using XSL stylesheet and generate XHTML page. I am using the web browser (e.g. IE/FF) directly to open the XML document.
I know XSLT 2.0 has built-in date and time functions, but I think no browser currently supports it. So I've been trying to use EXSLT functions instead, without success.
Here are my sample test files:
test.xml
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<test>
</test>
test.xsl
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:date="http://exslt.org/dates-and-times">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:value-of select="date:date-time()"/>
</xsl:template>
</xsl:stylesheet>
Error on IE8:
Namespace 'http://exslt.org/dates-and-times' does not contain any functions.
Error on FF4:
Error during XSLT transformation: An unknown XPath extension function was called.
Does that mean EXSLT is not supported by major web browsers? Do I have to use an XSLT processor like SAXON/Xalan? Am I doing something wrong? Is there an alternative way?
Use the EXSLT support matrix as a reference:
The following XSLT processors support date:date-time:
SAXON from Michael Kay (version 6.4.2)
Xalan-J from Apache (version 2.4.D1)
4XSLT, from 4Suite. (version 0.12.0a3)
libxslt from Daniel Veillard et al. (version 1.0.19)
libxslt is used by Chrome, Opera and Safari, but date-time() does not work since EXSLT is disabled:
I don't think it makes sense to add functions piecemeal; after nearly 5 years is there still anything preventing libexslt being included in the build and exsltRegisterAll() being called from registerXSLTExtensions() in XSLTExtensions.cpp?
IE uses MSXML, which has the following support:
MSXML4 provided two great extension functions, ms:format-date() and ms:format-time() to aim at the latter problem, but they are not supported in .NET or MSXML3.
There is no ms:date-time() function, but there is an MSXSL extension.
<?xml version='1.0'?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:ecma ="about:ecma">
<msxsl:script implements-prefix="ecma">
<![CDATA[
function GetCurrentDateTime()
{
var currentTime = new Date();
var month = currentTime.getMonth() + 1;
var day = currentTime.getDate();
var year = currentTime.getFullYear();
return(month + "/" + day + "/" + year);
}
]]>
</msxsl:script>
<xsl:template match="/">
<xsl:value-of select="ecma:GetCurrentDateTime()"/>
</xsl:template>
</xsl:stylesheet>
Firefox uses Transformiix, which has support for EXSLT date-time() since FF6.
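Since support differs this much between engines, one option is to guard the call with the standard XSLT 1.0 function-available() test, so that a processor without date:date-time() takes a fallback branch instead of failing. A minimal sketch based on the test.xsl from the question; the fallback text is only a placeholder:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:date="http://exslt.org/dates-and-times"
    exclude-result-prefixes="date">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- only call the EXSLT function where the processor implements it -->
      <xsl:when test="function-available('date:date-time')">
        <xsl:value-of select="date:date-time()"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- placeholder fallback; a processor-specific extension could go here -->
        <xsl:text>date:date-time() is not available in this processor</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

This does not give the current date to a processor that lacks the function; it only makes the stylesheet degrade gracefully there.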
References
MDN: EXSLT
EXSLT - date:date-time
XSL Transformations (XSLT) in Mozilla
Test Cases for XSLT support in browsers
Mozilla Bug 603159 - implement exslt-date:date-time()
Webkit Bug 4079 Support EXSLT with libexslt
Mozilla Bug 265254 - support exlst:date
Transformiix: Elements and Functions Available
Building Practical Solutions with EXSLT.NET
Microsoft XPath Extension Functions
