I need to produce a report in Excel format from an Excel template and an XML file that contains the data (report_data.xml, created from an SQL request). But I can't get an XPath expression to navigate through an Excel sheet, and I can't select a specific row in the template to duplicate with the data from report_data.xml.
To achieve this, I first "unzipped" the Excel template to get access to the individual sheets in .xml format. At the same time I set up "source" files (e.g. source-sheet1.xml, source-sharedstring.xml, ...) that are used as the defaults from which the new, populated files are created.
I've tested this using Xalan 2.7.2 and Saxon 9.7.0.15, with XSLT 1.0.
source-sheet1.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
mc:Ignorable="x14ac"
xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<dimension ref="A1:G1"/>
<sheetViews>
<sheetView tabSelected="1" workbookViewId="0">
<selection activeCell="B5" sqref="B5"/>
</sheetView>
</sheetViews>
<sheetFormatPr baseColWidth="10" defaultColWidth="9.140625" defaultRowHeight="15" x14ac:dyDescent="0.25"/>
<sheetData>
<row r="1" spans="1:7" x14ac:dyDescent="0.25">
<c r="A1" t="s">
<v>0</v>
</c><c r="B1" t="s">
<v>1</v>
</c><c r="C1" t="s">
<v>2</v>
</c><c r="D1" t="s">
<v>3</v>
</c><c r="E1" t="s">
<v>4</v>
</c><c r="F1" t="s">
<v>5</v>
</c><c r="G1" t="s">
<v>6</v>
</c>
</row>
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:math="http://exslt.org/math"
xmlns:set="http://exslt.org/sets"
xmlns:exslt="http://exslt.org/common"
xmlns:redirect="org.apache.xalan.xslt.extensions.Redirect"
xmlns:office="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
extension-element-prefixes="redirect"
exclude-result-prefixes="fo fn math set redirect office r ss">
<xsl:variable name="SrcSheet1" select="document('XLSX/source-sheet1.xml')"/>
<xsl:template match="Report">
<xsl:call-template name="Test1"/>
</xsl:template>
<xsl:template name="Test1">
<xsl:for-each select="$SrcSheet1/Worksheet/sheetData/row">
<xsl:message>
row = <xsl:value-of select="position()"/>
colcount= <xsl:value-of select="count(./c)"/>
</xsl:message>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
report_data.xml:
<?xml version="1.0" encoding="utf-8"?>
<Report attrib1="foo" ...>
<Data attrib1="foo" ...>
<SubData attrib1="foo" ... />
...
</Data>
...
</Report>
I would like to be able to "read" source-sheet1.xml, copy the rows in it, and then change the values in each column where needed with the data from report_data.xml (styles and fonts depend on whether it's the 1st, 2nd, ... row for the same data).
If this is not the right way to generate an Excel report from XML data, with different styles depending on the "position" of the data in the Excel file, what would be a better approach?
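One likely reason the for-each in Test1 selects nothing: the path steps Worksheet/sheetData/row are unprefixed, and in XSLT 1.0 unprefixed names match only elements in no namespace, while the root element of source-sheet1.xml is worksheet (lowercase) in the spreadsheetml namespace. Since the stylesheet already binds the office prefix to that namespace, the path would presumably become $SrcSheet1/office:worksheet/office:sheetData/office:row, with count(office:c) for the cells. The same idea, sketched outside XSLT with Python's standard library: select the template row through a namespace prefix, deep-copy it once per record, and renumber the copy. The function name fill_rows, the two-column sample sheet, and the choice to write values as numbers (t="n") are assumptions made for illustration only.

```python
import copy
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}  # the prefix is arbitrary; only the URI has to match

# Cut-down stand-in for source-sheet1.xml (two columns instead of seven).
SHEET = f"""<worksheet xmlns="{MAIN_NS}">
  <sheetData>
    <row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>
  </sheetData>
</worksheet>"""


def fill_rows(sheet_root, records, template_row="1"):
    """Append one copy of the template row per record, renumbering cells."""
    sheet_data = sheet_root.find("m:sheetData", NS)
    template = sheet_data.find(f"m:row[@r='{template_row}']", NS)
    next_row = len(sheet_data.findall("m:row", NS)) + 1
    for offset, record in enumerate(records):
        row_number = next_row + offset
        row = copy.deepcopy(template)  # keeps every attribute of the template
        row.set("r", str(row_number))
        for cell, value in zip(row.findall("m:c", NS), record):
            column = cell.get("r").rstrip("0123456789")  # "A1" -> "A"
            cell.set("r", f"{column}{row_number}")
            cell.set("t", "n")  # assumption: numeric values stored directly in <v>
            cell.find("m:v", NS).text = str(value)
        sheet_data.append(row)
    return sheet_root


root = fill_rows(ET.fromstring(SHEET), [(10, 20), (30, 40)])
rows = root.findall("m:sheetData/m:row", NS)
print([r.get("r") for r in rows])
```

Because the copy is deep, any style attribute present on the template cells travels with them, so position-dependent styling could be handled by keeping one template row per style and choosing which one to copy.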
Background
I have an Excel spreadsheet that I have saved as a 2003 XML spreadsheet, and I have pasted this into a console-mode VB.NET program created with VS 2022.
Goal
I want to automate adding some columns using VB.NET.
Observations
I see that MSExcel makes extensive use of XML namespaces:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<Worksheet ss:Name="197 Industry Groups">
<Table ss:ExpandedColumnCount="10" ss:ExpandedRowCount="198" x:FullColumns="1"
x:FullRows="1" ss:DefaultColumnWidth="42" ss:DefaultRowHeight="11.25">
<Row>
<Cell><Data ss:Type="String">Order</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
<Cell><Data ss:Type="String">Symbol</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
</Row>
<Row>
<Cell><Data ss:Type="Number">2</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
<Cell ss:StyleID="s62" ss:HRef="https://marketsmith.investors.com/mstool?Symbol=G1000"><Data
ss:Type="String">G1000</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
Plan
Use the XPath to get a reference to the first row and add a cell to it using this post as a guide
Questions
What namespaces do I need to add to the namespaceManager? What do I use for the second (uri) argument to the AddNamespace function? Do I need to add an empty namespace?
Why do the values g2 and g1000 get the value Nothing?
Dim g2 = industryGroups.XPathSelectElement("//ss:Workbook/ss:Worksheet/ss:Table/ss:Row[0]", namespaceManager)
Dim g1000 = (From p In industryGroups.Descendants("Table") Select p).FirstOrDefault()
How do I add a new XElement (cell) using the other cells as a pattern (i.e. each cell contains a Data element with an attribute, or in some cases the cell will have an HRef attribute)? Can I use the XML literal feature and embed my computed hrefs, strings and numbers?
Thanks!
Siegfried
The following should be helpful for questions #1 and #2:
Add the following Imports statements:
Imports System.Xml
Imports System.Xml.Linq
Imports System.Xml.XPath
Code (LINQ to XML with an XNamespace):
Dim xDoc As XDocument = XDocument.Load(filename)
Debug.WriteLine($"xDoc: {xDoc.ToString()}")
'add namespaces that exist in XML file
Dim nsSS As XNamespace = "urn:schemas-microsoft-com:office:spreadsheet"
'Dim dataItems = From x In xDoc.Descendants(nsSS + "Workbook").Descendants(nsSS + "Worksheet").Descendants(nsSS + "Table").Descendants(nsSS + "Row").Descendants(nsSS + "Cell").Descendants(nsSS + "Data") Select x
'Dim dataItems = From x In xDoc.Descendants("{urn:schemas-microsoft-com:office:spreadsheet}Workbook").Descendants("{urn:schemas-microsoft-com:office:spreadsheet}Worksheet").Descendants("{urn:schemas-microsoft-com:office:spreadsheet}Table").Descendants("{urn:schemas-microsoft-com:office:spreadsheet}Row").Descendants("{urn:schemas-microsoft-com:office:spreadsheet}Cell").Descendants("{urn:schemas-microsoft-com:office:spreadsheet}Data") Select x
Dim dataItems = From x In xDoc.Descendants(nsSS + "Table").Descendants(nsSS + "Row").Descendants(nsSS + "Cell").Descendants(nsSS + "Data") Select x
Debug.WriteLine($"dataItems: {dataItems.Count.ToString()} Type: {dataItems.GetType.ToString()}")
If dataItems IsNot Nothing Then
For Each item In dataItems
Debug.WriteLine($"{item.ToString()}")
Next
End If
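Code (XPath with an XmlNamespaceManager):
Dim xDoc As XDocument = XDocument.Load(filename)
'Debug.WriteLine($"xDoc: {xDoc.ToString()}")
'add namespaces that exist in XML file
Dim nsMgr = New XmlNamespaceManager(New NameTable())
'XPath 1.0 ignores the default (empty-prefix) namespace, so this entry has no effect on the query below;
'elements in the default namespace must be addressed through a prefix such as ss
nsMgr.AddNamespace("", "urn:schemas-microsoft-com:office:spreadsheet")
nsMgr.AddNamespace("o", "urn:schemas-microsoft-com:office:office")
nsMgr.AddNamespace("x", "urn:schemas-microsoft-com:office:excel")
nsMgr.AddNamespace("ss", "urn:schemas-microsoft-com:office:spreadsheet")
nsMgr.AddNamespace("html", "http://www.w3.org/TR/REC-html40")
'XPath positions are 1-based: the first row is ss:Row[1], and ss:Row[0] never matches
Dim g2 = xDoc.XPathSelectElement("ss:Workbook/ss:Worksheet/ss:Table", nsMgr)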
Debug.WriteLine($"g2: {g2.ToString()}")
Test.xml:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>TestUser</Author>
<LastAuthor>TestUser</LastAuthor>
<Created>2022-04-02T14:35:55Z</Created>
<LastSaved>2022-04-02T14:37:37Z</LastSaved>
<Version>16.00</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
<AllowPNG/>
</OfficeDocumentSettings>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
<WindowHeight>5955</WindowHeight>
<WindowWidth>17970</WindowWidth>
<WindowTopX>32767</WindowTopX>
<WindowTopY>32767</WindowTopY>
<ProtectStructure>False</ProtectStructure>
<ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
<Styles>
<Style ss:ID="Default" ss:Name="Normal">
<Alignment ss:Vertical="Bottom"/>
<Borders/>
<Font ss:FontName="Calibri" x:Family="Swiss" ss:Size="11" ss:Color="#000000"/>
<Interior/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s62">
<NumberFormat ss:Format="0"/>
</Style>
</Styles>
<Worksheet ss:Name="Sheet1">
<Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="3" x:FullColumns="1"
x:FullRows="1" ss:DefaultRowHeight="15">
<Column ss:StyleID="s62"/>
<Row>
<Cell><Data ss:Type="String">Id</Data></Cell>
<Cell><Data ss:Type="String">FirstName</Data></Cell>
<Cell><Data ss:Type="String">LastName</Data></Cell>
</Row>
<Row>
<Cell><Data ss:Type="Number">1</Data></Cell>
<Cell><Data ss:Type="String">John</Data></Cell>
<Cell><Data ss:Type="String">Smith</Data></Cell>
</Row>
<Row>
<Cell><Data ss:Type="Number">2</Data></Cell>
<Cell><Data ss:Type="String">Bob</Data></Cell>
<Cell><Data ss:Type="String">Seagul</Data></Cell>
</Row>
</Table>
<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
<PageSetup>
<Header x:Margin="0.3"/>
<Footer x:Margin="0.3"/>
<PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
</PageSetup>
<Print>
<ValidPrinterInfo/>
<HorizontalResolution>600</HorizontalResolution>
<VerticalResolution>600</VerticalResolution>
</Print>
<Selected/>
<Panes>
<Pane>
<Number>3</Number>
<ActiveRow>2</ActiveRow>
<ActiveCol>2</ActiveCol>
</Pane>
</Panes>
<ProtectObjects>False</ProtectObjects>
<ProtectScenarios>False</ProtectScenarios>
</WorksheetOptions>
</Worksheet>
</Workbook>
Resources:
XDocument containing namespaces
how to use XPath with XDocument?
XDocument Class
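On the remaining points of the question: ss:Row[0] returns Nothing because XPath positions start at 1, and Descendants("Table") returns Nothing because it looks for a Table element in no namespace (the nsSS + "Table" form above is the namespace-qualified equivalent). For question #3, the pattern of "one Cell holding one Data element, optionally with an HRef attribute" can be captured in a small helper. The following is a rough sketch of that idea in Python's standard library rather than VB.NET; make_cell, the cut-down document, and the example.com URL are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}
ET.register_namespace("ss", SS)  # keep the ss: prefix when serializing

# Cut-down stand-in for the 2003 XML spreadsheet.
DOC = f"""<Workbook xmlns="{SS}" xmlns:ss="{SS}">
  <Worksheet ss:Name="Sheet1">
    <Table>
      <Row><Cell><Data ss:Type="String">Id</Data></Cell></Row>
      <Row><Cell><Data ss:Type="Number">1</Data></Cell></Row>
    </Table>
  </Worksheet>
</Workbook>"""


def make_cell(value, href=None):
    """Build <Cell><Data ss:Type=...>value</Data></Cell>, optionally with ss:HRef."""
    cell = ET.Element(f"{{{SS}}}Cell")
    if href is not None:
        cell.set(f"{{{SS}}}HRef", href)
    data = ET.SubElement(cell, f"{{{SS}}}Data")
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    data.set(f"{{{SS}}}Type", "Number" if is_number else "String")
    data.text = str(value)
    return cell


root = ET.fromstring(DOC)  # root is the Workbook element
# Positions are 1-based, so Row[1] is the header row.
first_row = root.find("ss:Worksheet/ss:Table/ss:Row[1]", NS)
first_row.append(make_cell("Symbol"))
second_row = root.find("ss:Worksheet/ss:Table/ss:Row[2]", NS)
second_row.append(make_cell("G2000", href="https://example.com/?Symbol=G2000"))
```

Both the elements and the ss:-prefixed attributes live in the same spreadsheet namespace here, which is why a single prefix mapping is enough for the whole path.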
I'm using XSLT / XPath to browse some of the XML files you get when you unzip an Excel file. I found a "relationships" file workbook.xml.rels that I don't seem to be able to read, using code similar to that which successfully read the workbook.xml file.
Here's some of the workbook.xml file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
...
<sheets>
<sheet name="Sheet1"
sheetId="2"
r:id="rId1"/>
<sheet name="Test Sheet"
sheetId="1"
r:id="rId2"/>
</sheets>
...
</workbook>
Here's the workbook.xml.rels file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId3"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme"
Target="theme/theme1.xml"/>
<Relationship Id="rId2"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet"
Target="worksheets/sheet2.xml"/>
<Relationship Id="rId1"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet"
Target="worksheets/sheet1.xml"/>
<Relationship Id="rId5"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings"
Target="sharedStrings.xml"/>
<Relationship Id="rId4"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"
Target="styles.xml"/>
</Relationships>
Here's some of the XSLT:
<?xml version="1.0"?>
<!-- greeting.xsl -->
<xsl:stylesheet
...
<xsl:output method="text"/>
<xsl:variable name="baseDir" select="replace(document-uri(.), '(.*[\\/]xl).*', '$1/')"/>
<xsl:variable name="workbook" select="concat($baseDir, 'workbook.xml')"/>
<xsl:variable name="theSheetId" select="doc($workbook)/workbook/sheets/sheet[matches(@name, 'Test Sheet')]/@r:id"/>
<xsl:variable name="workbook_rels" select="concat($baseDir, '_rels/workbook.xml.rels')"/>
<!-- code to read workbook.xml.rels -->
<xsl:variable name="theSheet" select="doc($workbook_rels)/Relationships/Relationship[matches(@Id, $theSheetId)]/@Target"/>
<xsl:template match="/">
<xsl:text>
baseDir = </xsl:text><xsl:value-of select="$baseDir"/>
<xsl:text>
workbook = </xsl:text><xsl:value-of select="$workbook"/>
<xsl:text>
workbook_rels = </xsl:text><xsl:value-of select="$workbook_rels"/>
<xsl:text>
theSheetId = </xsl:text><xsl:value-of select="$theSheetId"/>
<xsl:text>
theSheet = </xsl:text><xsl:value-of select="$theSheet"/>
<xsl:text>
end</xsl:text>
</xsl:template>
</xsl:stylesheet>
And the output:
baseDir = file:/C:/Training/sandbox/conv_/xl/
workbook = file:/C:/Training/sandbox/conv_/xl/workbook.xml
workbook_rels = file:/C:/Training/sandbox/conv_/xl/_rels/workbook.xml.rels
theSheetId = rId2
theSheet = **<I get nothing here>**
end
You can see that the 'theSheetId' variable is correctly set when reading workbook.xml. But when I use that variable to get the corresponding Target value from workbook.xml.rels into the 'theSheet' variable, I get nothing. I tried replacing the matches expression with just a number, but I still get nothing. Is there a problem with reading this type of file?
Suggestions? Thanks!
The use of matches and replace suggests you are using an XSLT 2 or 3 processor, so you can certainly declare xpath-default-namespace. You just have to understand that you need to change it in the sections that deal with elements from a different namespace, e.g. <xsl:variable name="theSheet" select="doc($workbook_rels)/Relationships/Relationship[matches(@Id, $theSheetId)]/@Target" xpath-default-namespace="http://schemas.openxmlformats.org/package/2006/relationships"/>.
Given the samples, I would rather use a key, <xsl:key name="rel" match="Relationships/Relationship" use="@Id" xpath-default-namespace="http://schemas.openxmlformats.org/package/2006/relationships"/>, and then use <xsl:variable name="theSheet" select="key('rel', $theSheetId, doc($workbook_rels))/@Target"/>. But the use of xpath-default-namespace to declare the relevant namespace when selecting elements from a particular document is probably what is missing in your XSLT.
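To make the namespace difference concrete: workbook.xml elements are in the spreadsheetml "main" namespace, while workbook.xml.rels elements are in the package "relationships" namespace, so each lookup needs its own namespace. A minimal sketch of the same two-step lookup with Python's standard library follows; it assumes the r prefix elided from the workbook.xml sample is bound to the officeDocument relationships namespace, and sheet_target is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
# Assumption: this is the URI behind the r: prefix in workbook.xml.
REL_ATTR = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"

WORKBOOK = f"""<workbook xmlns="{MAIN}" xmlns:r="{REL_ATTR}">
  <sheets>
    <sheet name="Sheet1" sheetId="2" r:id="rId1"/>
    <sheet name="Test Sheet" sheetId="1" r:id="rId2"/>
  </sheets>
</workbook>"""

RELS = f"""<Relationships xmlns="{PKG_REL}">
  <Relationship Id="rId2" Type="{REL_ATTR}/worksheet" Target="worksheets/sheet2.xml"/>
  <Relationship Id="rId1" Type="{REL_ATTR}/worksheet" Target="worksheets/sheet1.xml"/>
</Relationships>"""


def sheet_target(workbook_xml, rels_xml, sheet_name):
    """Resolve a sheet name to its Target path via the relationship id."""
    workbook = ET.fromstring(workbook_xml)
    rels = ET.fromstring(rels_xml)
    # Step 1: elements in the spreadsheetml namespace, attribute in the r: namespace.
    sheet = workbook.find(f"m:sheets/m:sheet[@name='{sheet_name}']", {"m": MAIN})
    rel_id = sheet.get(f"{{{REL_ATTR}}}id")
    # Step 2: a different document with a different element namespace.
    rel = rels.find(f"p:Relationship[@Id='{rel_id}']", {"p": PKG_REL})
    return rel.get("Target")


print(sheet_target(WORKBOOK, RELS, "Test Sheet"))
```

The lookup compares the Id for equality, like the key-based variant, rather than treating it as a regular expression the way matches does.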
How can I loop through the faultblock elements and append their values to a string/variable without using a template in XSLT? faultblock may occur once, twice, or any number of times, depending on the validation errors.
Desired Output: 11-Invalid ID;22-Invalid Password;.....nn-Error;
<status>
<code>00</code>
<description>Success</description>
<faultblock>
<faultcode>11</faultcode>
<faultdesc>Invalid ID</faultdesc>
</faultblock>
<faultblock>
<faultcode>22</faultcode>
<faultdesc>Invalid Password</faultdesc>
</faultblock>
<faultblock>
<faultcode>nn</faultcode>
<faultdesc>Error</faultdesc>
</faultblock>
</status>
I'm guessing you want to get your output using only a for-each statement instead of using multiple templates.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each select="//faultblock">
<xsl:value-of select="faultcode"/>
<xsl:text>-</xsl:text>
<xsl:value-of select="faultdesc"/>
<xsl:text>;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
See it working here: https://xsltfiddle.liberty-development.net/6pS2B71
Using XSLT 3, I need to take the values of all content elements and move them to the title elements (if a title element already exists in a record, the content needs to be appended to it with a separator like -). I have now included my real data, since the solution below does not solve the problem when applied to something like:
example input:
<data>
<RECORD ID="31365">
<no>25099</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>021999</access>
<col>GS</col>
<call>889</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<title>1 title</title>
<content>1 content</content>
<sj>1956</sj>
</RECORD>
<RECORD ID="31366">
<no>25100</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>022004</access>
<col>GS</col>
<call>8764</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<sj>1956</sj>
<content>1 title</content>
</RECORD>
</data>
expected output:
<data>
<RECORD ID="31365">
<no>25099</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>021999</access>
<col>GS</col>
<call>889</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<title>1 title - 1 content</title>
<sj>1956</sj>
</RECORD>
<RECORD ID="31366">
<no>25100</no>
<seq>0</seq>
<date>2/4/2012</date>
<ver>2/4/2012</ver>
<access>022004</access>
<col>ΓΣ</col>
<call>8764</call>
<pr>0</pr>
<days>0</days>
<stat>0</stat>
<ch>0</ch>
<sj>1956</sj>
<title>1 title</title>
</RECORD>
</data>
With my attempt, I did not manage to move the elements; I just got an empty line where the content element had been, so please include the removal of blank lines in the suggested solution.
I believe the removal of blank lines could be fixed with the use of
<xsl:template match="text()"/>
One way to achieve this is the following template. It uses XSLT 3.0 text value templates (enabled by expand-text="true").
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="true">
<xsl:output method="xml" indent="yes" />
<xsl:mode on-no-match="shallow-copy" />
<xsl:strip-space elements="*" /> <!-- Remove space between elements -->
<xsl:template match="RECORD">
<xsl:copy>
<xsl:copy-of select="@*" />
<title>{title[1]}{if (title[1]) then ' - ' else ''}<xsl:value-of select="content" separator=" " /></title>
<xsl:apply-templates select="node() except (title,content)" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Its output is as desired.
If you want to separate the <content> elements with a -, too, you can simplify the core <title> expression to
<xsl:value-of select="title|content" separator=" - " />
EDIT:
All I changed was replacing chapter with RECORD, and it's working fine with Saxon-HE 9.9.1.4J. The only difference in the output is that the title element is always at the first position, but that shouldn't matter. I also added a directive to remove space between elements.
I am using Apache POI to convert .doc to .fo using the WordToFoConverter class. I have converted the images in the Word file to base64, but how do I add them to the XSL-FO code generated by Apache POI?
Consider the sample fo file generated by Apache-POI-
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="page-page0" page-height="11.0in" page-width="8.5in">
<fo:region-body margin="1.0in 1.0in 1.0in 1.0in"/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:declarations>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="">
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">CA, Inc.</dc:creator>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
</fo:declarations>
<fo:page-sequence master-reference="page-page0">
<fo:flow flow-name="xsl-region-body">
<fo:block hyphenate="true" linefeed-treatment="preserve" space-after="10pt" text-align="start" white-space-collapse="false">
***<!--Image link to '0.jpg' can be here-->
<fo:inline font-family="Times New Roman" font-size="11" font-style="normal" font-weight="normal"> </fo:inline>
<!--Image link to '9ab33.png' can be here-->
<fo:leader/>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
How do I insert an image at the *** position?
Insert the image directly, base64-encoded, in the "src" attribute as a data URI, taking care to state the appropriate MIME type. For example, for a JPEG image:
<fo:external-graphic src="url('data:image/jpeg;base64,....')"/>
Here's a template that can help you understand the structure:
<xsl:template match="encodedImage">
<fo:external-graphic>
<xsl:attribute name="src">
<xsl:text>url('data:</xsl:text>
<xsl:value-of select="attachmentContentType"/>
<xsl:text>;base64,</xsl:text>
<xsl:value-of select="encodedImageBytes"/>
<xsl:text>')</xsl:text>
</xsl:attribute>
</fo:external-graphic>
</xsl:template>
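The value the template assembles has the form url('data:<mime type>;base64,<encoded bytes>'). As an illustration only, here is how the same string could be built outside XSLT with Python's standard library; fo_external_graphic is a hypothetical helper, and the three sample bytes are just the start of a JPEG header, not a complete image.

```python
import base64


def fo_external_graphic(image_bytes, mime_type):
    """Return an fo:external-graphic element whose src is a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"<fo:external-graphic src=\"url('data:{mime_type};base64,{encoded}')\"/>"


# Placeholder bytes; in practice these would be the full contents of the image file.
snippet = fo_external_graphic(b"\xff\xd8\xff", "image/jpeg")
print(snippet)
```

The resulting element would replace the corresponding "Image link to ... can be here" comment in the generated FO, with the MIME type chosen per image (for instance image/png for a PNG file).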