How to handle and get values from dynamic tags from XML file in VBA macro - excel

I have an XML file with three levels. Some of the tags are dynamic, meaning they are not present in every node (see the XML file below).
I have to check whether the "price" tag is present in every book node and get its value where it exists. I also need to fetch the value of every node present in the XML file.
I tried to check whether each node is present in the XML file, but I get an error.
VBA code snippet
Function fnReadXMLByTags2()
Dim mainWorkBook As Workbook
Set mainWorkBook = ActiveWorkbook
Dim obj As Object
Set obj = CreateObject("MSXML2.DOMDocument")
obj.async = False
XMLFileName = "C:\Prakash\Demo.xml"
obj.Load (XMLFileName)
Set authorNodes = obj.SelectNodes("//Books/book/author/text()")
Set titleNodes = obj.SelectNodes("//Books/book/title/text()")
Set genreNodes = obj.SelectNodes("//Books/book/genre/text()")
Set priceNodes = obj.SelectNodes("//Books/book/price/text()")
Set publish_dateNodes = obj.SelectNodes("//Books/book/publish_date/text()")
Set languageNodes = obj.SelectNodes("//Books/book/language/text()")
For i = 0 To (authorNodes.Length - 1)
If Not publish_dateNodes Is Nothing Then
Set publish_dateNodesValue = obj.SelectNodes("//Books/book/publish_date/text()")
publish_dateObj = publish_dateNodesValue(i).NodeValue
mainWorkBook.Sheets("Sheet3").Range("E" & i + 2).Value = publish_dateObj
Else
mainWorkBook.Sheets("Sheet3").Range("E" & i + 2).Value = "Blank"
End If
Next
End Function
The error below is raised on this line:
publish_dateObj = publish_dateNodesValue(i).NodeValue
Runtime error '91':
Object variable or With block variable not set
This is what I'm expecting in Excel file:
Excel sheet output
Below is the XML file:
<?xml version="1.0"?>
<Books>
<book id="1">
<author>ABC</author>
<title>Physics</title>
<genre>asd</genre>
<price>Rs.44</price>
<publish_date>20-10-2001</publish_date>
<description>Book1</description>
</book>
<book id="2">
<author>DEF</author>
<title>Chem</title>
<genre>XYZ</genre>
<publish_date>02-12-2016</publish_date>
<description>Book2</description>
</book>
<book id="3">
<author>GHI</author>
<title>Maths</title>
<genre>ABC</genre>
<price>Rs.500</price>
<language>English</language>
<description>Book3</description>
</book>
</Books>

It should be a procedure (Sub), not a Function, because it does not return a value (make sure you know the difference).
The way you read the sub-nodes makes it impossible to keep the data consistent: each SelectNodes call returns only the elements that exist, so once a book lacks a tag, index i in that list no longer belongs to book i, and indexing past the end of the list returns Nothing (hence error 91). You need to read all books first and then read the sub-nodes of each book:
Set CurrentNode = Books(iBook).SelectNodes("author/text()")(0)
The line above reads the author node of book number iBook into CurrentNode; if the node does not exist, CurrentNode is Nothing. So you just need to check for that and output the .NodeValue.
Note that I introduced an array, NodesOutputList, so the columns can be handled in a loop instead of repeating similar code over and over.
So you end up with something like:
Option Explicit
Public Sub fnReadXMLByTags2()
Dim mainWorkBook As Workbook
Set mainWorkBook = ThisWorkbook '<-- ThisWorkbook is the workbook that contains this code; ActiveWorkbook is the workbook that has focus (is on top)
Dim wsOutput As Worksheet 'define output sheet
Set wsOutput = mainWorkBook.Worksheets("Sheet3")
Dim XMLFileName As String
XMLFileName = "D:\code\Demo.xml"
Dim obj As Object
Set obj = CreateObject("MSXML2.DOMDocument")
obj.async = False
obj.Load XMLFileName '<-- no parentheses here
Dim Books As Object 'get all books
Set Books = obj.SelectNodes("//Books/book")
Dim NodesOutputList() As Variant 'order in which the nodes are checked and output
NodesOutputList = Array("author/text()", "title/text()", "genre/text()", "price/text()", "publish_date/text()", "language/text()", "description/text()")
Dim iBook As Long
For iBook = 0 To Books.Length - 1 'loop through books
Dim iNode As Long
For iNode = LBound(NodesOutputList) To UBound(NodesOutputList) 'loop through nodes in the list
Dim CurrentNode As Object
Set CurrentNode = Books(iBook).SelectNodes(NodesOutputList(iNode))(0)
If Not CurrentNode Is Nothing Then
wsOutput.Cells(iBook + 2, iNode + 1).Value = CurrentNode.NodeValue
Else
wsOutput.Cells(iBook + 2, iNode + 1).Value = "blank"
End If
Next iNode
Next iBook
End Sub
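For the narrower question of whether every book has a price tag, a minimal sketch along the same lines might look like the following. It is not part of the answer above: the procedure name ListBooksWithoutPrice is hypothetical, the file path is the one assumed in the answer, and the results go to the Immediate window rather than to a sheet.

```vba
Option Explicit

' Hypothetical helper: reports, per <book>, whether a <price> child exists.
Public Sub ListBooksWithoutPrice()
    Dim doc As Object
    Set doc = CreateObject("MSXML2.DOMDocument")
    doc.async = False

    ' Assumed path; replace it with the location of your XML file.
    If Not doc.Load("D:\code\Demo.xml") Then
        Debug.Print "Could not load XML: " & doc.parseError.reason
        Exit Sub
    End If

    Dim book As Object
    Dim priceNode As Object
    For Each book In doc.SelectNodes("/Books/book")
        ' SelectSingleNode returns Nothing when the child element is absent.
        Set priceNode = book.SelectSingleNode("price")
        If priceNode Is Nothing Then
            Debug.Print "book id=" & book.getAttribute("id") & ": price missing"
        Else
            Debug.Print "book id=" & book.getAttribute("id") & ": price=" & priceNode.Text
        End If
    Next book
End Sub
```

With the sample file from the question, book 2 is the only one that has no price element.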

Related

How can I index the right node on a XML file?

I am trying to read the nodes and write their fields to an Excel workbook, but I'm having trouble getting at one specific field. The XML structure looks like this:
<root>
<data name="Admin" xml:space="preserve">
<value>Administrador</value>
</data>
</root>
Getting the text inside the node is no problem, but I also want the text inside the quotes right after data name (here, "Admin"). The VB code is as follows:
Dim XDoc As Object
Dim myNodes As IXMLDOMNodeList, myChildNodes As IXMLDOMNodeList
Dim myElement As IXMLDOMElement
Dim myNode As IXMLDOMNode, myChildNode As IXMLDOMNode
Dim nNode As Integer
Dim nChildNode As Integer
Set XDoc = CreateObject("MSXML2.DOMDocument")
XDoc.async = False: XDoc.validateOnParse = False
XDoc.load (vFileName)
Set myNodes = XDoc.SelectNodes("//data/value")
If myNodes.Length > 0 Then
For nNode = 0 To myNodes.Length - 1
Set myNode = myNodes(nNode)
Set myChildNodes = myNode.ChildNodes ' Get the children of the first node.
For nChildNode = 0 To myChildNodes.Length - 1
vNode = myChildNodes(nChildNode).Text
vRange2 = "A" + Trim(Str(vLineTAG))
Range(vRange2).Value = vNode
Next nChildNode
Next nNode
Else
'Stuff and all
End If
Here I'm referencing "value", and vNode gets the string Administrador from the node above. But when I reference only data, it returns an empty string, the range that receives it is blank, and the next child node returns what is inside the value node, as expected. I don't know what I'm missing here...
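This should do the job, provided myNode refers to the data element (select "//data", or use myNode.ParentNode when looping over "//data/value"):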
Dim ElementAttribute As IXMLDOMAttribute
Set ElementAttribute = myNode.Attributes.getNamedItem("name")
Debug.Print ElementAttribute.Text
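As a hedged illustration of reading the attribute and the value text together, the sketch below loops over the data elements instead of the value elements. The procedure name PrintDataNamesAndValues and its xmlPath parameter are hypothetical, and late binding is used so that no MSXML reference is required.

```vba
' Hypothetical helper: prints "name = value" for every <data> element.
Public Sub PrintDataNamesAndValues(ByVal xmlPath As String)
    Dim doc As Object
    Set doc = CreateObject("MSXML2.DOMDocument")
    doc.async = False
    doc.validateOnParse = False

    If Not doc.Load(xmlPath) Then
        Debug.Print "Could not load XML: " & doc.parseError.reason
        Exit Sub
    End If

    Dim dataNode As Object
    Dim valueNode As Object
    For Each dataNode In doc.SelectNodes("//data")
        ' The name attribute lives on <data>, the text on its <value> child.
        Set valueNode = dataNode.SelectSingleNode("value")
        If valueNode Is Nothing Then
            Debug.Print dataNode.getAttribute("name") & " = (no value element)"
        Else
            Debug.Print dataNode.getAttribute("name") & " = " & valueNode.Text
        End If
    Next dataNode
End Sub
```

Selecting the parent element first keeps each attribute paired with the value that belongs to it.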

VBA to extract file information, add any new information after last row of data

Sub GetFileList()
Dim xFSO As Object
Dim xFolder As Object
Dim xFile As Object
Dim objOL As Object
Dim Msg As Object
Dim xPath As String
Dim thisFile As String
Dim i As Integer
Dim lastrow As Long
xPath = Sheets("UI").Range("D7")
Set xFSO = CreateObject("Scripting.FileSystemObject")
Set xFolder = xFSO.GetFolder(xPath)
i = 1
For Each xFile In xFolder.Files
i = i + 1
Worksheets("Info").Cells(i, 1) = xPath
Worksheets("Info").Cells(i, 2) = Left(xFile.Name, InStrRev(xFile.Name, ".") - 1)
Worksheets("Info").Cells(i, 3) = Mid(xFile.Name, InStrRev(xFile.Name, ".") + 1)
Worksheets("Info").Cells(i, 6) = Left(FileDateTime(xFile), InStrRev(FileDateTime(xFile), " ") - 1)
Next
Set Msg = Nothing
Worksheets("Info").Visible = True
Worksheets("Info").Activate
End Sub
The code above extracts file information from a folder. The issue is that when I change the folder path, it overwrites the previously fetched data.
Sheet UI is where the Sub is run at the press of a button; sheet Info is where the data needs to be pasted.
How can I add each new row of data after the data that is already there? If the sheet is blank, add data from the first row; otherwise, add data after the last row.
Sheets("UI").Range("A1").End(xlDown).Select
i = Selection.Row + 1
Try replacing
i = 1
with
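i = Worksheets("Info").UsedRange.Rows.Count
Because the loop increments i before it writes, this puts the first record in row 2 on an empty sheet (as the original code did) and in the first free row on later runs, so new data is added below any existing data. This assumes the used range starts in row 1.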
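An alternative that does not depend on UsedRange is to look upward from the bottom of a column that always contains data. The sketch below is only an illustration: NextFreeRow and its keyColumn parameter are hypothetical names, and column 1 is assumed to be filled for every record.

```vba
' Hypothetical helper: returns the first empty row in keyColumn of ws.
Public Function NextFreeRow(ByVal ws As Worksheet, _
                            Optional ByVal keyColumn As Long = 1) As Long
    Dim lastRow As Long
    ' Jump from the last row of the sheet up to the last non-empty cell.
    lastRow = ws.Cells(ws.Rows.Count, keyColumn).End(xlUp).Row

    ' On an empty column End(xlUp) stops at row 1, which is itself empty.
    If lastRow = 1 And IsEmpty(ws.Cells(1, keyColumn).Value) Then
        NextFreeRow = 1
    Else
        NextFreeRow = lastRow + 1
    End If
End Function
```

Since the loop in the question increments i before writing, it would start from i = NextFreeRow(Worksheets("Info")) - 1.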

How to name a worksheet?

I have a folder (F) that contains several workbooks, all in the same format. For each workbook I compute a conditional sum over a column. I want to write the results to another workbook that has one worksheet per workbook in F.
I cannot find a good way to select the worksheet based on the name of the workbook currently being processed.
Set Output_tot_n = Workbooks("Final_Output").Sheet_name.Range("B7")
I got
Error 438 "Object doesn't support this property or method"
The whole code:
Sub Proceed_Data()
Dim FileSystemObj As Object
Dim FolderObj As Object
Dim fileobj As Object
Dim Sheet_name As Worksheet
Dim i, j, k As Integer
Dim wb As Workbook
Set FileSystemObj = CreateObject("Scripting.FileSystemObject")
Set FolderObj = FileSystemObj.GetFolder("C:\...\")
For Each fileobj In FolderObj.Files
Set wb = Workbooks.Open(fileobj.Path)
Set Output_tot_n = Workbooks("Final_Output").Sheet_name.Range("B7")
If wb.Name = "AAA_SEPT_2018" Then
Sheet_name = Worksheets("AAA")
End If
If wb.Name = "BBB_SEPT_2018" Then
Sheet_name = Worksheets("BBB")
End If
If wb.Name = "CCC_SEPT_2018" Then
Sheet_name = Worksheets("CCC")
End If
' conditional sum
With wb.Sheets("REPORT")
For i = 2 To .Cells(Rows.Count, 14).End(xlUp).Row
If .Cells(i, "O").Value = "sept" Then
k = .Cells(i, "M").Value
End If
j = j + k
k = 0
Next i
End With
Output_tot_n = j
j = 0
wb.Save
wb.Close
Next fileobj
End Sub
Workbooks is a collection (part of the actual Application-object). A collection in VBA can be accessed either by index number (index starts at 1) or by name. The name of an open Workbook is the name including the extension, in your case probably either Final_Output.xlsx or Final_Output.xlsm.
Sheets and Worksheets are collections within a Workbook, again accessed via index or name (the difference is that Worksheets contains "real" spreadsheets while Sheets may also contain other sheet types, eg charts).
So in your case, you want to access a Range of a specific sheet of a specific workbook. The workbook has a fixed name, while the sheet name is stored in a variable. You can write for example
Dim sheetName As String, sheet As Worksheet, Output_tot_n As Range
sheetName = "AAA" ' (put your logic here)
Set sheet = Workbooks("Final_Output.xlsm").Worksheets(sheetName)
Set Output_tot_n = sheet.Range("B7")
or put it all together (depending on your needs):
Set Output_tot_n = Workbooks("Final_Output.xlsm").Worksheets(sheetName).Range("B7")
Now it actually works. Thank you again for your answers.
The problem was just that it is important to include the extension: "AAA_SEPT_2018.xlsx".
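If the worksheet name is always the part of the workbook name before the first underscore (an assumption based on the three names in the question), the chain of If blocks could be replaced by a small helper. SheetNameFromWorkbook is a hypothetical name used for illustration only.

```vba
' Hypothetical helper: "AAA_SEPT_2018.xlsx" -> "AAA".
' Returns an empty string when the name contains no usable prefix.
Public Function SheetNameFromWorkbook(ByVal workbookName As String) As String
    Dim pos As Long
    pos = InStr(workbookName, "_")
    If pos > 1 Then
        SheetNameFromWorkbook = Left$(workbookName, pos - 1)
    Else
        SheetNameFromWorkbook = vbNullString
    End If
End Function
```

Inside the loop this could be used as Workbooks("Final_Output.xlsm").Worksheets(SheetNameFromWorkbook(wb.Name)).Range("B7"), after checking that the returned name is not empty.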

XPath not working properly in Excel VBA with DOMDocument

We have XML data in the format below received from BACS Clearing:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by Oracle Reports version 10.1.2.3.0 -->
<?xml-stylesheet href="file:///o:/Dev/Development Projects 2014/DP Team Utilities/D-02294 DDI Voucher XML Conversion Tool/DDIVoucherStylesheet.xsl" type="text/xsl" ?>
<VocaDocument xmlns="http://www.voca.com/schemas/messaging" xmlns:msg="http://www.voca.com/schemas/messaging" xmlns:cmn="http://www.voca.com/schemas/common" xmlns:iso="http://www.voca.com/schemas/common/iso" xmlns:env="http://www.voca.com/schemas/envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.voca.com/schemas/messaging http://www.voca.com/schemas/messaging/Voca_AUDDIS_AdviceofDDI_v1.0.xsd">
<Data>
<Document type="AdviceOfDDIReport" created="2014-08-19T00:59:15" schemaVersion="1.0">
<StreamStart>
<Stream>
<AgencyBankParameter>234</AgencyBankParameter>
<BankName>LLOYDS BANK PLC</BankName>
<BankCode>9876</BankCode>
<AgencyBankName>BANK OF CYPRUS UK LTD</AgencyBankName>
<AgencyBankCode>5432</AgencyBankCode>
<StreamCode>01</StreamCode>
<VoucherSortCode>SC998877</VoucherSortCode>
<VoucherAccountNumber>12348765</VoucherAccountNumber>
</Stream>
</StreamStart>
<DDIVouchers>
<Voucher>
<TransactionCode> NEW</TransactionCode>
<OriginatorIdentification><ServiceUserName>A SERVICE NAME </ServiceUserName><ServiceUserNumber>223344</ServiceUserNumber></OriginatorIdentification>
<PayingBankAccount><BankName>A SMALL BANK UK LTD</BankName><AccountName>AN INDIVIDUAL </AccountName><AccountNumber>77553311</AccountNumber><UkSortCode>SC776655</UkSortCode></PayingBankAccount>
<ReferenceNumber>BACS001122 </ReferenceNumber>
<ContactDetails><PhoneNumber>021 223344</PhoneNumber><FaxNumber> </FaxNumber><Address><cmn:AddresseeName>a name</cmn:AddresseeName><cmn:PostalName>a place</cmn:PostalName><cmn:AddressLine>an address</cmn:AddressLine><cmn:TownName>A Town</cmn:TownName><cmn:CountyIdentification> </cmn:CountyIdentification><cmn:CountryName>UNITED KINGDOM</cmn:CountryName><cmn:ZipCode>AA1 2BB</cmn:ZipCode></Address></ContactDetails>
<ProcessingDate>2014-08-19</ProcessingDate>
<BankAccount><FirstLastVoucherCode>FirstLast</FirstLastVoucherCode><AgencyBankCode>7890</AgencyBankCode><SortCode>SC223344</SortCode><AccountNumber>99886655</AccountNumber><TotalVouchers>1</TotalVouchers></BankAccount>
</Voucher>
<Voucher>
...
When I load the XML into the XPathVisualizer tool, it works fine with an XPath expression like this:
VocaDocument/Data/Document/DDIVouchers/Voucher
But when I use the same XPath in VBA in MS Excel to retrieve the values into a worksheet, it does not work.
Here is the code I am using in MS Excel VBA:
Dim nodeList As IXMLDOMNodeList
Dim nodeRow As IXMLDOMNode
Dim nodeCell As IXMLDOMNode
Dim rowCount As Integer
Dim cellCount As Integer
Dim rowRange As Range
Dim cellRange As Range
Dim sheet As Worksheet
Dim dom As DOMDocument60
Dim xpathToExtractRow As String
xpathToExtractRow = "VocaDocument/Data/Document/DDIVouchers/Voucher"
' OTHER XPath examples
' xpathToExtractRow = "VocaDocument/Data/Document/StreamStart/Stream/BankName"
' xpathToExtractRow = "VocaDocument/Data/Document/DDIVouchers/Voucher/ContactDetails/Address/cmn:AddresseeName" ' NOTICE cmn namespace!
' xpathToExtractRow = "VocaDocument/Data/Document/DDIVouchers/Voucher/ProcessingDate
Set domIn = New DOMDocument60
domIn.setProperty "SelectionLanguage", "XPath"
domIn.load (Application.GetOpenFilename("XML Files (*.xml), *.xml", , "Please select the xml file"))
Set sheet = ActiveSheet
Set nodeList = domIn.DocumentElement.SelectNodes(xpathToExtractRow)
Set nodeRow = domIn.DocumentElement.SelectSingleNode(xpathToExtractRow) '"/*/Data//StreamStart/Stream/*").nodeName
rowCount = 0
Workbooks.Add
For Each nodeRow In nodeList
rowCount = rowCount + 1
cellCount = 0
For Each nodeCell In nodeRow.ChildNodes
cellCount = cellCount + 1
Set cellRange = sheet.Cells(rowCount, cellCount)
cellRange.Value = nodeCell.Text
Next nodeCell
Next nodeRow
End Sub
So what am I missing? Do I need to add namespaces to the DOM object or something? And if so, should I add all the namespaces using xmlDoc.setProperty("SelectionNamespaces", ...)?
thanks
You need to register the default namespace under a prefix:
xmlDoc.setProperty "SelectionNamespaces", "xmlns:ns='http://www.voca.com/schemas/messaging'"
Then use the registered prefix on every node in the scope where the default namespace is declared:
ns:VocaDocument/ns:Data/ns:Document/ns:DDIVouchers/ns:Voucher
That's because descendant nodes automatically inherit the default namespace from their ancestors, unless a different default namespace is declared at the descendant level or a prefix that points to a different namespace is used.
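A minimal sketch of how this might fit together is shown below. It assumes the document structure from the question; the procedure name ListVoucherAddressees and the xmlPath parameter are hypothetical. The cmn prefix is registered as well, because the address fields sit in the http://www.voca.com/schemas/common namespace.

```vba
' Hypothetical helper: prints the addressee name of every Voucher.
Public Sub ListVoucherAddressees(ByVal xmlPath As String)
    Dim dom As Object
    Set dom = CreateObject("MSXML2.DOMDocument.6.0")
    dom.async = False
    dom.validateOnParse = False

    If Not dom.Load(xmlPath) Then
        Debug.Print "Could not load XML: " & dom.parseError.reason
        Exit Sub
    End If

    ' Several namespace declarations are separated by a space.
    dom.setProperty "SelectionNamespaces", _
        "xmlns:ns='http://www.voca.com/schemas/messaging' " & _
        "xmlns:cmn='http://www.voca.com/schemas/common'"

    Dim voucher As Object
    Dim addressee As Object
    For Each voucher In dom.SelectNodes( _
            "/ns:VocaDocument/ns:Data/ns:Document/ns:DDIVouchers/ns:Voucher")
        ' Unprefixed elements use ns:, the cmn: elements keep their own prefix.
        Set addressee = voucher.SelectSingleNode( _
            "ns:ContactDetails/ns:Address/cmn:AddresseeName")
        If Not addressee Is Nothing Then Debug.Print addressee.Text
    Next voucher
End Sub
```

The prefixes used in the XPath only have to match the ones registered in SelectionNamespaces; they do not have to match the prefixes used in the file.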

Application-defined or object-defined error in Excel VBA

I'm getting this error when running the following VBA code in Excel:
Private Sub XMLGen(mapRangeA, mapRangeB, ticketSize, mapping)
Dim fieldOneArr As Variant
Dim fieldTwoArr As Variant
Dim row As Long
Dim column As Long
Dim infoCol As Long
Dim endInfo As Long
Dim objDom As DOMDocument
Dim objNode As IXMLDOMNode
Dim objXMLRootelement As IXMLDOMElement
Dim objXMLelement As IXMLDOMElement
Dim objXMLattr As IXMLDOMAttribute
Set ws = Worksheets("StockData")
Dim wsName As String
Set objDom = New DOMDocument
If ticketSize = 8 Then
wsName = "A7Tickets"
ElseIf ticketSize = 16 Then
wsName = "A8Tickets"
Else
wsName = "A5Tickets"
End If
Set ps = Worksheets(wsName)
'create processing instruction
Set objNode = objDom.createProcessingInstruction("xml", "version='1.0' encoding='UTF-8'")
objDom.appendChild objNode
'create root element
Set objXMLRootelement = objDom.createElement("fields")
objDom.appendChild objXMLRootelement
'create Attribute to the Field Element and set value
Set objXMLattr = objDom.createAttribute("xmlns:xfdf")
objXMLattr.NodeValue = "http://ns.adobe.com/xfdf-transition/"
objXMLRootelement.setAttributeNode objXMLattr
infoCol = 1
fieldOneArr = Worksheets(mapping).range(mapRangeA)
fieldTwoArr = Worksheets(mapping).range(mapRangeB)
For row = 1 To UBound(fieldOneArr, 1)
For column = 1 To UBound(fieldOneArr, 2)
'create Heading element
Set objXMLelement = objDom.createElement(fieldOneArr(row, column))
objXMLRootelement.appendChild objXMLelement
'create Attribute to the Heading Element and set value
Set objXMLattr = objDom.createAttribute("xfdf:original")
objXMLattr.NodeValue = (fieldTwoArr(row, column))
objXMLelement.setAttributeNode objXMLattr
objXMLelement.Text = ps.Cells(row, infoCol)
infoCol = infoCol + 1
endInfo = endInfo + 1
If endInfo = 4 Then
infoCol = 1
End If
Next column
Next row
'save XML data to a file
If ticketSize = 2 Then
objDom.Save ("C:\ExportTestA5.xml")
MsgBox "A5 XML created"
ElseIf ticketSize = 8 Then
objDom.Save ("C:\ExportTestA7.xml")
MsgBox "A7 XML created"
Else
objDom.Save ("C:\ExportTestA8.xml")
MsgBox "A8 XML created"
End If
End Sub
When I hit debug it points to this line:
fieldOneArr = Worksheets(mapping).range(mapRangeA)
I know that .Range is supposed to be upper case but it keeps on setting it to lower case automatically whenever I correct it.
This code is meant to create an XML file and then write the details from the chosen worksheet (based on the ticketSize variable) into the correct XML fields. Hence I have a mapping worksheet from which I write the field and attribute names, and then write in the info from the correct ticket size worksheet into the text property of the element.
You should declare the types of your procedure's parameters, in this case mapRangeA As String. Office object methods and properties are often not very helpful with their error messages, so it's better to get a type mismatch error at the call if there is a problem with a parameter.
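As a hedged sketch of that advice, the range lookup could be moved into a small typed helper that fails with a descriptive message. ReadMapping is a hypothetical name, and the use of ThisWorkbook is an assumption; the question uses the unqualified Worksheets collection.

```vba
' Hypothetical helper: returns the values of rangeAddress on mappingSheet,
' raising a descriptive error when the sheet or address cannot be resolved.
Private Function ReadMapping(ByVal mappingSheet As String, _
                             ByVal rangeAddress As String) As Variant
    Dim rng As Range

    ' Suppress the generic error 1004 or 9 and test the result instead.
    On Error Resume Next
    Set rng = ThisWorkbook.Worksheets(mappingSheet).Range(rangeAddress)
    On Error GoTo 0

    If rng Is Nothing Then
        Err.Raise vbObjectError + 513, "ReadMapping", _
            "Cannot resolve range '" & rangeAddress & _
            "' on sheet '" & mappingSheet & "'"
    End If

    ' Note: a single-cell range returns a scalar, not a 2-D array.
    ReadMapping = rng.Value
End Function
```

With String parameters, passing something that cannot be converted to a string (for example an object without a default value) fails at the call site instead of deep inside the Range lookup.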

Resources