XPath not working properly in Excel VBA with DOMDocument

We have XML data in the format below received from BACS Clearing:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by Oracle Reports version 10.1.2.3.0 -->
<?xml-stylesheet href="file:///o:/Dev/Development Projects 2014/DP Team Utilities/D-02294 DDI Voucher XML Conversion Tool/DDIVoucherStylesheet.xsl" type="text/xsl" ?>
<VocaDocument xmlns="http://www.voca.com/schemas/messaging" xmlns:msg="http://www.voca.com/schemas/messaging" xmlns:cmn="http://www.voca.com/schemas/common" xmlns:iso="http://www.voca.com/schemas/common/iso" xmlns:env="http://www.voca.com/schemas/envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.voca.com/schemas/messaging http://www.voca.com/schemas/messaging/Voca_AUDDIS_AdviceofDDI_v1.0.xsd">
<Data>
<Document type="AdviceOfDDIReport" created="2014-08-19T00:59:15" schemaVersion="1.0">
<StreamStart>
<Stream>
<AgencyBankParameter>234</AgencyBankParameter>
<BankName>LLOYDS BANK PLC</BankName>
<BankCode>9876</BankCode>
<AgencyBankName>BANK OF CYPRUS UK LTD</AgencyBankName>
<AgencyBankCode>5432</AgencyBankCode>
<StreamCode>01</StreamCode>
<VoucherSortCode>SC998877</VoucherSortCode>
<VoucherAccountNumber>12348765</VoucherAccountNumber>
</Stream>
</StreamStart>
<DDIVouchers>
<Voucher>
<TransactionCode> NEW</TransactionCode>
<OriginatorIdentification><ServiceUserName>A SERVICE NAME </ServiceUserName><ServiceUserNumber>223344</ServiceUserNumber></OriginatorIdentification>
<PayingBankAccount><BankName>A SMALL BANK UK LTD</BankName><AccountName>AN INDIVIDUAL </AccountName><AccountNumber>77553311</AccountNumber><UkSortCode>SC776655</UkSortCode></PayingBankAccount>
<ReferenceNumber>BACS001122 </ReferenceNumber>
<ContactDetails><PhoneNumber>021 223344</PhoneNumber><FaxNumber> </FaxNumber><Address><cmn:AddresseeName>a name</cmn:AddresseeName><cmn:PostalName>a place</cmn:PostalName><cmn:AddressLine>an address</cmn:AddressLine><cmn:TownName>A Town</cmn:TownName><cmn:CountyIdentification> </cmn:CountyIdentification><cmn:CountryName>UNITED KINGDOM</cmn:CountryName><cmn:ZipCode>AA1 2BB</cmn:ZipCode></Address></ContactDetails>
<ProcessingDate>2014-08-19</ProcessingDate>
<BankAccount><FirstLastVoucherCode>FirstLast</FirstLastVoucherCode><AgencyBankCode>7890</AgencyBankCode><SortCode>SC223344</SortCode><AccountNumber>99886655</AccountNumber><TotalVouchers>1</TotalVouchers></BankAccount>
</Voucher>
<Voucher>
...
When I load the XML into the XPathVisualizer tool, it works fine with an XPath expression like this:
VocaDocument/Data/Document/DDIVouchers/Voucher
But when I use the same XPath in VBA in MS Excel to retrieve the values into a worksheet, it returns nothing.
Here is the code I am using in MS Excel VBA:
Dim nodeList As IXMLDOMNodeList
Dim nodeRow As IXMLDOMNode
Dim nodeCell As IXMLDOMNode
Dim rowCount As Integer
Dim cellCount As Integer
Dim rowRange As Range
Dim cellRange As Range
Dim sheet As Worksheet
Dim domIn As DOMDocument60
Dim xpathToExtractRow As String
xpathToExtractRow = "VocaDocument/Data/Document/DDIVouchers/Voucher"
' OTHER XPath examples
' xpathToExtractRow = "VocaDocument/Data/Document/StreamStart/Stream/BankName"
' xpathToExtractRow = "VocaDocument/Data/Document/DDIVouchers/Voucher/ContactDetails/Address/cmn:AddresseeName" ' NOTICE cmn namespace!
' xpathToExtractRow = "VocaDocument/Data/Document/DDIVouchers/Voucher/ProcessingDate"
Set domIn = New DOMDocument60
domIn.setProperty "SelectionLanguage", "XPath"
domIn.load (Application.GetOpenFilename("XML Files (*.xml), *.xml", , "Please select the xml file"))
Set sheet = ActiveSheet
Set nodeList = domIn.DocumentElement.SelectNodes(xpathToExtractRow)
Set nodeRow = domIn.DocumentElement.SelectSingleNode(xpathToExtractRow) '"/*/Data//StreamStart/Stream/*").nodeName
rowCount = 0
Workbooks.Add
For Each nodeRow In nodeList
rowCount = rowCount + 1
cellCount = 0
For Each nodeCell In nodeRow.ChildNodes
cellCount = cellCount + 1
Set cellRange = sheet.Cells(rowCount, cellCount)
cellRange.Value = nodeCell.Text
Next nodeCell
Next nodeRow
End Sub
So what am I missing? Do I need to add namespaces to the DOM object or something? And if so, should I add all the namespaces using xmlDoc.setProperty("SelectionNamespaces", ?
Thanks

You need to register the default namespace under a prefix:
xmlDoc.setProperty "SelectionNamespaces", "xmlns:ns='http://www.voca.com/schemas/messaging'"
Then use that prefix on every element in the scope where the default namespace is declared:
ns:VocaDocument/ns:Data/ns:Document/ns:DDIVouchers/ns:Voucher
That's because descendant elements inherit the default namespace from their ancestors automatically, unless a descendant declares a different default namespace or uses a prefix that points to a different namespace. XPath 1.0 has no notion of a default namespace, so an unprefixed name in the expression only matches elements that are in no namespace.
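As a rough sketch, here is how the question's loop might look with the namespaces registered. The procedure name ImportVouchers and the late-bound CreateObject call are my own choices, not part of the original code. Two details are worth noting: the cmn prefix is registered as well, so that paths such as ns:ContactDetails/ns:Address/cmn:AddresseeName can be used relative to a Voucher, and the path starts with a leading slash. The question calls SelectNodes on DocumentElement, so a relative path beginning with VocaDocument would look for a child of the root element with that name and find nothing.

```vba
Option Explicit

' Sketch only: copies the text of each Voucher's child elements into one row
' of the active sheet. ImportVouchers is a hypothetical name.
Public Sub ImportVouchers()
    Dim fileName As Variant
    fileName = Application.GetOpenFilename("XML Files (*.xml), *.xml", , "Please select the xml file")
    If VarType(fileName) = vbBoolean Then Exit Sub ' user cancelled the dialog

    Dim xmlDoc As Object
    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False
    If Not xmlDoc.Load(fileName) Then
        MsgBox "Could not parse the file: " & xmlDoc.parseError.reason
        Exit Sub
    End If

    ' Register a prefix for the default namespace, plus the cmn namespace.
    ' Multiple declarations are separated by a space.
    xmlDoc.setProperty "SelectionNamespaces", _
        "xmlns:ns='http://www.voca.com/schemas/messaging' " & _
        "xmlns:cmn='http://www.voca.com/schemas/common'"

    Dim vouchers As Object
    Set vouchers = xmlDoc.SelectNodes( _
        "/ns:VocaDocument/ns:Data/ns:Document/ns:DDIVouchers/ns:Voucher")

    Dim sheet As Worksheet
    Set sheet = ActiveSheet

    Dim voucher As Object, cell As Object
    Dim rowCount As Long, cellCount As Long
    For Each voucher In vouchers
        rowCount = rowCount + 1
        cellCount = 0
        For Each cell In voucher.ChildNodes
            If cell.NodeType = 1 Then ' 1 = NODE_ELEMENT, skips comments and text
                cellCount = cellCount + 1
                sheet.Cells(rowCount, cellCount).Value = cell.Text
            End If
        Next cell
    Next voucher
End Sub
```

Counters are declared As Long rather than As Integer so that large files do not overflow the row count.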

Related

Import XML data using Excel VBA

I'm trying to import specific data from an XML file to an Excel sheet.
The code I'm using is this:
Dim oXMLFile As New DOMDocument60
Dim books As IXMLDOMNodeList
Dim results() As String
Dim i As Integer, booksUBound As Integer
Dim book As IXMLDOMNode, title As IXMLDOMNode, author As IXMLDOMNode
oXMLFile.Load "C:\example.xml"
Set books = oXMLFile.SelectNodes("/OUT_MESSAGE/LINES/OUT_MESSAGE_LINE")
booksUBound = books.Length - 1
ReDim results(booksUBound, 1)
For i = 0 To booksUBound
Set book = books(i)
Set title = book.SelectSingleNode("C00")
If Not title Is Nothing Then results(i, 0) = title.Text
Next
Dim wks As Worksheet
Set wks = ActiveSheet
wks.Range(wks.Cells(1, 1), wks.Cells(books.Length, 2)) = results
Which works with this XML
<?xml version="1.0" encoding="UTF-8"?>
<OUT_MESSAGE>
<LINES>
<OUT_MESSAGE_LINE>
<C00>1231231</C00>
<C01>3213213</C01>
</OUT_MESSAGE_LINE>
<OUT_MESSAGE_LINE>
<C00>1231234</C00>
<C01>3213214</C01>
</OUT_MESSAGE_LINE>
</LINES>
</OUT_MESSAGE>
My problem is that my XML file looks like this.
<?xml version="1.0" encoding="UTF-8"?>
<OUT_MESSAGE xmlns="urn:randomaddress-com:schema:test_out_message" xmlns:xsi="http://www.randomurl.com/123">
<LINES>
<OUT_MESSAGE_LINE>
<C00>1231231</C00>
<C01>3213213</C01>
</OUT_MESSAGE_LINE>
<OUT_MESSAGE_LINE>
<C00>1231234</C00>
<C01>3213214</C01>
</OUT_MESSAGE_LINE>
</LINES>
</OUT_MESSAGE>
I originally thought I could get this to work simply by replacing
Set books = oXMLFile.SelectNodes("/OUT_MESSAGE/LINES/OUT_MESSAGE_LINE")
With
Set books = oXMLFile.SelectNodes("/OUT_MESSAGE xmlns='urn:randomaddress-com:schema:test_out_message' xmlns:xsi='http://www.randomurl.com/123'/LINES/OUT_MESSAGE_LINE")
But that gives me a runtime error.
If anyone knows what changes I need to make to the original code, that would be much appreciated.
This worked for me:
Dim xDoc, nodes, oNode
Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")
'Note: the default namespace is registered under an `x` prefix so we can reference it later
xDoc.setProperty "SelectionNamespaces", _
"xmlns:x='urn.randomaddress.com.schema.test_out_message'"
xDoc.LoadXML Sheet2.Range("A4").Value 'load XML from sheet
'use the "x" prefix we added above
Set nodes = xDoc.SelectNodes("/x:OUT_MESSAGE/x:LINES/x:OUT_MESSAGE_LINE")
Debug.Print nodes.Length ' = 1
For Each oNode In nodes
Debug.Print oNode.SelectSingleNode("x:C00").nodeTypedValue
Debug.Print oNode.SelectSingleNode("x:OBJSTATE").nodeTypedValue
'etc
Next oNode
using this XML:
<?xml version="1.0"?>
<OUT_MESSAGE xmlns="urn.randomaddress.com.schema.test_out_message"
xmlns:xsi="http://www.randomurl.com/123">
<LINES>
<OUT_MESSAGE_LINE>
<C00>321312</C00>
<C01>12312312</C01>
<OBJSTATE>Posted</OBJSTATE>
<OBJEVENTS>Accept^Reject^</OBJEVENTS>
<STATE>Posted</STATE>
</OUT_MESSAGE_LINE>
</LINES>
</OUT_MESSAGE>

XML and XPath handling in VBA

I'm trying to parse an XML file into a spreadsheet using VBA, but for some reason I can't get to the node that I want using XPath. Here is what my XML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<cteProc xmlns="http://www.somesite.com" versao="3.00">
<CTe xmlns="http://www.somesite.com">
<infCte Id="an id" versao="3.00">
<ide>
<cUF>23</cUF>
<cCT>00000557</cCT>
<CFOP>6932</CFOP>
<natOp>some text </natOp>
<mod>57</mod>
</ide>
<compl>
<xObs>TEXT</xObs>
</compl>
</infCte>
</CTe>
</cteProc>
I'm trying to get at least to the ide node, so I can loop over the rest and get the information I want.
My code looks like this:
Public Sub parseXml()
Dim oXMLFile As MSXML2.DOMDocument60
Dim nodes As MSXML2.IXMLDOMNodeList
path2 = "C:\Users\me\Desktop\adoc.xml"
Set oXMLFile = New MSXML2.DOMDocument60
oXMLFile.Load (path2)
Set nodes = oXMLFile.DocumentElement.SelectNodes("/CTe")
When I print the length of the node list, I get this:
debug.print nodes.length
> 0
If I loop over the child nodes like this:
Public Sub parseXml()
Dim oXMLFile As MSXML2.DOMDocument60
Dim nodes As MSXML2.IXMLDOMNodeList
Dim node As MSXML2.IXMLDOMNode
path2 = "C:\Users\me\Desktop\adoc.xml"
Set oXMLFile = New MSXML2.DOMDocument60
oXMLFile.Load (path2)
Set nodes = oXMLFile.DocumentElement.ChildNodes
For Each node In nodes
Debug.Print node.BaseName
Next node
I get this:
> CTe
So if I write one giant loop I can get the information I want, but I think there must be a simpler solution for this.
Since your XML uses namespaces, XPath also needs to deal with namespaces.
The following works for me using your XML:
Public Sub parseXml()
Dim oXML As MSXML2.DOMDocument60
Dim oNodes As MSXML2.IXMLDOMNodeList
Dim oItem As MSXML2.IXMLDOMNode
Dim path2 As String
path2 = "P:\adoc.xml"
Set oXML = New MSXML2.DOMDocument60
oXML.Load path2
oXML.setProperty "SelectionLanguage", "XPath"
oXML.setProperty "SelectionNamespaces", "xmlns:ssc=""http://www.somesite.com"""
Set oNodes = oXML.DocumentElement.SelectNodes("ssc:CTe")
For Each oItem In oNodes
MsgBox oItem.nodeName
Next
End Sub
There, using
oXML.setProperty "SelectionNamespaces", "xmlns:ssc=""http://www.somesite.com"""
I define a prefix ssc for the namespace
http://www.somesite.com.
The ssc is my own choice (somesite.com). This prefix is needed for the XPath in the SelectNodes method to work properly.
If you don't want to define the namespace, you have to use the local-name() XPath function. For example:
Set oNodes = oXML.DocumentElement.SelectNodes("*[local-name() = 'CTe']")

How can I index the right node on a XML file?

I'm trying to read the nodes and write their fields to an Excel workbook, but I'm having trouble getting at one specific field. The XML structure is like this:
<root>
<data name="Admin" xml:space="preserve">
<value>Administrador</value>
</data>
</root>
I have no problem getting the text inside the node, but I also want the text inside the quotes right after data name (the value of the name attribute). The VB code is as follows:
Dim XDoc As Object
Dim myNodes As IXMLDOMNodeList, myChildNodes As IXMLDOMNodeList
Dim myElement As IXMLDOMElement
Dim myNode As IXMLDOMNode, myChildNode As IXMLDOMNode
Dim nNode As Integer
Dim nChildNode As Integer
Set XDoc = CreateObject("MSXML2.DOMDocument")
XDoc.async = False: XDoc.validateOnParse = False
XDoc.load (vFileName)
Set myNodes = XDoc.SelectNodes("//data/value")
If myNodes.Length > 0 Then
For nNode = 0 To myNodes.Length - 1
Set myNode = myNodes(nNode)
Set myChildNodes = myNode.ChildNodes ' Get the children of the first node.
For nChildNode = 0 To myChildNodes.Length - 1
vNode = myChildNodes(nChildNode).Text
vRange2 = "A" + Trim(Str(vLineTAG))
Range(vRange2).Value = vNode
Next nChildNode
Next nNode
Else
'Stuff and all
End If
Here I'm referencing "value", and vNode gets the Administrador string inside the node above. But when I reference only data, it returns an empty string, the range that receives it is blank, and the next child node returns what is inside the value node as expected. I don't know what I'm missing here...
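This should do the job, provided myNode is the data element (selected with "//data" rather than "//data/value"), since that is the element carrying the name attribute: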
Dim ElementAttribute As IXMLDOMAttribute
Set ElementAttribute = myNode.Attributes.getNamedItem("name")
Debug.Print ElementAttribute.Text
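As a minimal sketch of how that might fit into the question's loop, the procedure below selects the data elements and reads both the name attribute and the value child from each one. The procedure name ListDataNamesAndValues and the choice of columns A and B are assumptions for illustration, not part of the original code.

```vba
Option Explicit

' Sketch only: writes each data element's name attribute to column A and the
' text of its value child to column B. ListDataNamesAndValues is a
' hypothetical name.
Public Sub ListDataNamesAndValues(ByVal vFileName As String)
    Dim xDoc As Object
    Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xDoc.async = False
    xDoc.validateOnParse = False
    If Not xDoc.Load(vFileName) Then
        MsgBox "Could not parse " & vFileName & ": " & xDoc.parseError.reason
        Exit Sub
    End If

    ' Select the data elements themselves, not their value children.
    Dim dataNodes As Object
    Set dataNodes = xDoc.SelectNodes("//data")

    Dim i As Long
    For i = 0 To dataNodes.Length - 1
        Dim dataNode As Object
        Set dataNode = dataNodes.Item(i)

        ' getNamedItem returns Nothing when the attribute is absent.
        Dim nameAttr As Object
        Set nameAttr = dataNode.Attributes.getNamedItem("name")
        If Not nameAttr Is Nothing Then
            ActiveSheet.Cells(i + 1, 1).Value = nameAttr.Text
        End If

        ' SelectSingleNode returns Nothing when there is no value child.
        Dim valueNode As Object
        Set valueNode = dataNode.SelectSingleNode("value")
        If Not valueNode Is Nothing Then
            ActiveSheet.Cells(i + 1, 2).Value = valueNode.Text
        End If
    Next i
End Sub
```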

How to handle and get values from dynamic tags from XML file in VBA macro

I have an XML file with 3 levels. Some of the tags are dynamic (please check the XML file below).
I have to check whether the "price" tag is present in every node and also get its value. I also need to fetch the value of every node present in the XML file.
I tried checking whether every node in the XML file is present, but I get an error.
VBA code snippet:
Function fnReadXMLByTags2()
Dim mainWorkBook As Workbook
Set mainWorkBook = ActiveWorkbook
Dim obj As Object
Set obj = CreateObject("MSXML2.DOMDocument")
obj.async = False
XMLFileName = "C:\Prakash\Demo.xml"
obj.Load (XMLFileName)
Set authorNodes = obj.SelectNodes("//Books/book/author/text()")
Set titleNodes = obj.SelectNodes("//Books/book/title/text()")
Set genreNodes = obj.SelectNodes("//Books/book/genre/text()")
Set priceNodes = obj.SelectNodes("//Books/book/price/text()")
Set publish_dateNodes = obj.SelectNodes("//Books/book/publish_date/text()")
Set languageNodes = obj.SelectNodes("//Books/book/language/text()")
For i = 0 To (authorNodes.Length - 1)
If Not publish_dateNodes Is Nothing Then
Set publish_dateNodesValue = obj.SelectNodes("//Books/book/publish_date/text()")
publish_dateObj = publish_dateNodesValue(i).NodeValue
mainWorkBook.Sheets("Sheet3").Range("E" & i + 2).Value = publish_dateObj
Else
mainWorkBook.Sheets("Sheet3").Range("E" & i + 2).Value = "Blank"
End If
Next
End Function
The error message below is raised for this line:
publish_dateObj = publish_dateNodesValue(i).NodeValue
Runtime error '91':
Object variable or With block variable not set
This is what I'm expecting in Excel file:
Excel sheet output
Below is the xml file :
<?xml version="1.0"?>
<Books>
<book id="1">
<author>ABC</author>
<title>Physics</title>
<genre>asd</genre>
<price>Rs.44</price>
<publish_date>20-10-2001</publish_date>
<description>Book1</description>
</book>
<book id="2">
<author>DEF</author>
<title>Chem</title>
<genre>XYZ</genre>
<publish_date>02-12-2016</publish_date>
<description>Book2</description>
</book>
<book id="3">
<author>GHI</author>
<title>Maths</title>
<genre>ABC</genre>
<price>Rs.500</price>
<language>English</language>
<description>Book3</description>
</book>
</Books>
It should be a procedure (Sub) not a function, because it does not return a value (make sure you know the difference).
The way you read the sub nodes makes it impossible to keep the data consistent: each list only contains the nodes that exist, so index i in one list no longer refers to the same book as index i in another. You need to read all books first and then, for each book, its sub nodes:
Set CurrentNode = Books(iBook).SelectNodes("author/text()")(0)
The above reads the author node of book number iBook into CurrentNode; if it does not exist, CurrentNode will be Nothing. So you just need to check for that and output the .NodeValue.
Note that I introduced an array NodesOutputList to be able to loop through the columns easily instead of repeating similar code over and over.
So you end up with something like:
Option Explicit
Public Sub fnReadXMLByTags2()
Dim mainWorkBook As Workbook
Set mainWorkBook = ThisWorkbook '<-- ThisWorkbook is the workbook that contains this code; ActiveWorkbook is the workbook that has focus (is on top)
Dim wsOutput As Worksheet 'define output sheet
Set wsOutput = mainWorkBook.Worksheets("Sheet3")
Dim XMLFileName As String
XMLFileName = "D:\code\Demo.xml"
Dim obj As Object
Set obj = CreateObject("MSXML2.DOMDocument")
obj.async = False
obj.Load XMLFileName '<-- no parentheses here
Dim Books As Object 'get all books
Set Books = obj.SelectNodes("//Books/book")
Dim NodesOutputList() As Variant 'order in which the nodes are checked and output
NodesOutputList = Array("author/text()", "title/text()", "genre/text()", "price/text()", "publish_date/text()", "language/text()", "description/text()")
Dim iBook As Long
For iBook = 0 To Books.Length - 1 'loop through books
Dim iNode As Long
For iNode = LBound(NodesOutputList) To UBound(NodesOutputList) 'loop through nodes in the list
Dim CurrentNode As Object
Set CurrentNode = Books(iBook).SelectNodes(NodesOutputList(iNode))(0)
If Not CurrentNode Is Nothing Then
wsOutput.Cells(iBook + 2, iNode + 1).Value = CurrentNode.NodeValue
Else
wsOutput.Cells(iBook + 2, iNode + 1).Value = "blank"
End If
Next iNode
Next iBook
End Sub

Application-defined or object-defined error in Excel VBA

I'm getting the error in the title when running the following VBA code in Excel:
Private Sub XMLGen(mapRangeA, mapRangeB, ticketSize, mapping)
Dim fieldOneArr As Variant
Dim fieldTwoArr As Variant
Dim row As Long
Dim column As Long
Dim infoCol As Long
Dim endInfo As Long
Dim objDom As DOMDocument
Dim objNode As IXMLDOMNode
Dim objXMLRootelement As IXMLDOMElement
Dim objXMLelement As IXMLDOMElement
Dim objXMLattr As IXMLDOMAttribute
Set ws = Worksheets("StockData")
Dim wsName As String
Set objDom = New DOMDocument
If ticketSize = 8 Then
wsName = "A7Tickets"
ElseIf ticketSize = 16 Then
wsName = "A8Tickets"
Else
wsName = "A5Tickets"
End If
Set ps = Worksheets(wsName)
'create processing instruction
Set objNode = objDom.createProcessingInstruction("xml", "version='1.0' encoding='UTF-8'")
objDom.appendChild objNode
'create root element
Set objXMLRootelement = objDom.createElement("fields")
objDom.appendChild objXMLRootelement
'create Attribute to the Field Element and set value
Set objXMLattr = objDom.createAttribute("xmlns:xfdf")
objXMLattr.NodeValue = "http://ns.adobe.com/xfdf-transition/"
objXMLRootelement.setAttributeNode objXMLattr
infoCol = 1
fieldOneArr = Worksheets(mapping).range(mapRangeA)
fieldTwoArr = Worksheets(mapping).range(mapRangeB)
For row = 1 To UBound(fieldOneArr, 1)
For column = 1 To UBound(fieldOneArr, 2)
'create Heading element
Set objXMLelement = objDom.createElement(fieldOneArr(row, column))
objXMLRootelement.appendChild objXMLelement
'create Attribute to the Heading Element and set value
Set objXMLattr = objDom.createAttribute("xfdf:original")
objXMLattr.NodeValue = (fieldTwoArr(row, column))
objXMLelement.setAttributeNode objXMLattr
objXMLelement.Text = ps.Cells(row, infoCol)
infoCol = infoCol + 1
endInfo = endInfo + 1
If endInfo = 4 Then
infoCol = 1
End If
Next column
Next row
'save XML data to a file
If ticketSize = 2 Then
objDom.Save ("C:\ExportTestA5.xml")
MsgBox "A5 XML created"
ElseIf ticketSize = 8 Then
objDom.Save ("C:\ExportTestA7.xml")
MsgBox "A7 XML created"
Else
objDom.Save ("C:\ExportTestA8.xml")
MsgBox "A8 XML created"
End If
End Sub
When I hit debug it points to this line:
fieldOneArr = Worksheets(mapping).range(mapRangeA)
I know that .Range is supposed to be upper case but it keeps on setting it to lower case automatically whenever I correct it.
This code is meant to create an XML file and then write the details from the chosen worksheet (based on the ticketSize variable) into the correct XML fields. Hence I have a mapping worksheet from which I write the field and attribute names, and then write in the info from the correct ticket size worksheet into the text property of the element.
You should define the types of your function parameters, in this case mapRangeA As String. Office object methods and properties are often not very helpful with their error messages, so it's better to have a type mismatch error if you have a problem with a parameter.
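A small sketch of what typed parameters could look like. The helper name ReadMapping is hypothetical, and treating the sheet name and range address as String is an assumption based on how the question's code uses them.

```vba
Option Explicit

' Sketch only: with typed parameters, passing something that cannot be
' converted to a String fails at the call with a type mismatch instead of
' an application-defined error deeper inside the procedure.
' ReadMapping is a hypothetical helper name.
Private Function ReadMapping(ByVal mapping As String, _
                             ByVal mapRange As String) As Variant
    ' For a multi-cell range, .Value is a 1-based two-dimensional array;
    ' for a single cell it is a scalar.
    ReadMapping = ThisWorkbook.Worksheets(mapping).Range(mapRange).Value
End Function
```

The question's procedure could then be declared along the lines of Private Sub XMLGen(ByVal mapRangeA As String, ByVal mapRangeB As String, ByVal ticketSize As Long, ByVal mapping As String), again assuming those are the intended types.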

Resources