Excel VBA XML parsing - Loop through node which has specific attribute

I have an XML file which is of below structure:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<BordereauxItem>
<BordereauxMonth>Sep</BordereauxMonth>
<BordereauxRef>2017-09-27</BordereauxRef>
<EclipseBinderNumber>132</EclipseBinderNumber>
<CertificateNumber>100</CertificateNumber>
</BordereauxItem>
<BordereauxItem>
<BordereauxMonth>aUG</BordereauxMonth>
<BordereauxRef>2017-09-27</BordereauxRef>
<EclipseBinderNumber>142</EclipseBinderNumber>
<CertificateNumber>200</CertificateNumber>
</BordereauxItem>
I just want to loop over the node (BordereauxItem) whose CertificateNumber child element equals 200.
Currently I am able to loop over all the nodes in the XML file, but I cannot restrict the loop to the particular node for which CertificateNumber = 200.
I have spent a couple of days on this without making progress. Can anyone please help me solve this issue?

Here's some simple code that loops through the nodes and checks the value of one of their child nodes:
Sub LoopNodes()
    Dim xml As String, node As String, childNode As String, searchStartIndex As Long, index As Long
    Dim childOpenTag As String, nodeCloseTag As String
    searchStartIndex = 0
    xml = "<?xml version=""1.0"" encoding=""UTF-8"" standalone=""no""?><BordereauxItem><BordereauxMonth>Sep</BordereauxMonth><BordereauxRef>2017-09-27</BordereauxRef><EclipseBinderNumber>132</EclipseBinderNumber><CertificateNumber>100</CertificateNumber></BordereauxItem><BordereauxItem><BordereauxMonth>aUG</BordereauxMonth><BordereauxRef>2017-09-27</BordereauxRef><EclipseBinderNumber>142</EclipseBinderNumber><CertificateNumber>200</CertificateNumber></BordereauxItem>"
    node = "BordereauxItem"
    childNode = "CertificateNumber"
    childOpenTag = "<" & childNode & ">"
    nodeCloseTag = "</" & node & ">"
    Do While True
        'find the next BordereauxItem node
        searchStartIndex = InStr(searchStartIndex + 1, xml, "<" & node & ">")
        If searchStartIndex = 0 Then
            MsgBox "No more nodes!"
            Exit Do
        End If
        'find the CertificateNumber child node and skip past its opening tag
        index = InStr(searchStartIndex, xml, childOpenTag) + Len(childOpenTag)
        'extract the value and check it
        If Mid(xml, index, InStr(index, xml, "</" & childNode & ">") - index) = "200" Then
            'display the node that meets the requirement, including its closing tag
            MsgBox Mid(xml, searchStartIndex, InStr(searchStartIndex + 1, xml, nodeCloseTag) + Len(nodeCloseTag) - searchStartIndex)
        End If
    Loop
End Sub
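The string search above depends on the exact spelling and order of the tags. As an alternative, here is a minimal sketch that lets the MSXML parser do the filtering with an XPath predicate. It assumes MSXML 6.0 is installed and that the items are wrapped in a single root element; the root name Bordereaux and the Sub name are hypothetical, and the sample in the question has no root element, so it would have to be wrapped before it can be loaded as a well-formed document.

```vba
Sub LoopMatchingNodes()
    Dim doc As Object, matches As Object, itemNode As Object
    Dim xml As String

    ' Assumption: a single <Bordereaux> root (hypothetical name) wraps the items.
    xml = "<Bordereaux>" & _
          "<BordereauxItem><BordereauxMonth>Sep</BordereauxMonth>" & _
          "<CertificateNumber>100</CertificateNumber></BordereauxItem>" & _
          "<BordereauxItem><BordereauxMonth>Aug</BordereauxMonth>" & _
          "<CertificateNumber>200</CertificateNumber></BordereauxItem>" & _
          "</Bordereaux>"

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.LoadXML(xml) Then
        MsgBox "Parse error: " & doc.parseError.reason
        Exit Sub
    End If

    ' The predicate keeps only the items whose CertificateNumber child equals 200.
    Set matches = doc.SelectNodes("/Bordereaux/BordereauxItem[CertificateNumber='200']")
    For Each itemNode In matches
        Debug.Print itemNode.SelectSingleNode("BordereauxMonth").Text
    Next itemNode
End Sub
```

To read from a file instead of a string, doc.Load with the file path could replace doc.LoadXML; the rest of the loop stays the same.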

Related

Remove (child) node from XML DOM object using VBA (Excel)

I am creating quite complex XML files using a template, replacing special search strings with values which can be entered in an Excel sheet, and then storing the xml-file.
Dim strInpPath As String
Dim strOutpPath As String
Dim fso
Dim f
Dim oDomRd As Object, oNode As Object, i As Long, oAtt As Object, oGroup As Object, oDomWr As Object
Dim oTest As Object
strInpPath = ActiveWorkbook.ActiveSheet.Cells(3, 4).Value
strOutputPath = ActiveWorkbook.ActiveSheet.Cells(4, 4).Value
Set oDomRd = CreateObject("MSXML2.DOMDocument")
oDomRd.Load strInpPath
Set oDomWr = CreateObject("MSXML2.DOMDocument")
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(strOutputPath, 2, True)
Set oGroup = oDomRd.SelectNodes("/")
Set oNode = oGroup.NextNode
If Not (oNode Is Nothing) Then
strout = oNode.XML
strout = ScanTable("_S_AND_R_TABLE_1", strout)
oDomRd.LoadXML (strout)
Set oGroup = oDomRd.SelectNodes("/")
Set oNode = oGroup.NextNode
If oNode.HasChildNodes() Then
Set oLists = oNode.DocumentElement
Run RemoveOptionalEmptyTags(oLists)
End If
strout = oNode.XML
f.write (strout)
Else
strout = "001 error reading file"
End If
MsgBox strout
End Function
Some of the field values are not mandatory, so they can be left empty. In that case the first procedure (ScanTable) enters "##REMOVE##" as the value. In the second step, I want to walk through the entire DOM object and remove the nodes that have the value "##REMOVE##".
For this second step I created a procedure:
Public Function RemoveOptionalEmptyTags(ByRef oLists)
For Each listnode In oLists.ChildNodes
If listnode.HasChildNodes() Then
Run RemoveOptionalEmptyTags(listnode)
Else
lcBasename = listnode.ParentNode.BaseName
lcText = listnode.Text
If lcText = "##REMOVE##" Then
listnode.ParentNode.RemoveChild listnode
Exit For
End If
End If
Next listnode
End Function
This works fairly well. The only problem is that the node is not removed; it is merely left empty (<AdrLine/>):
<Cdtr>
<Nm>Name Creditor</Nm>
<PstlAdr>
<Ctry>DE</Ctry>
<AdrLine>Street</AdrLine>
<AdrLine/>
</PstlAdr>
</Cdtr>
Now the question:
How can I completely REMOVE the node, so that the result looks like this (the second <AdrLine> is gone):
<Cdtr>
<Nm>Name Creditor</Nm>
<PstlAdr>
<Ctry>DE</Ctry>
<AdrLine>Street</AdrLine>
</PstlAdr>
</Cdtr>
Basically the RemoveChild syntax is correct:
{NodeToDelete}.ParentNode.RemoveChild {NodeToDelete}
But let's repeat the XML structure and note that each text node (if it exists) is regarded as a ChildNode of its parent (i.e. one hierarchy level deeper).
<Cdtr>                                 <!-- 0 documentElement -->
  <Nm>Name Creditor</Nm>               <!-- 1 ChildNode of Nm = 'Name Creditor' -->
  <PstlAdr>                            <!-- 1 listNode.ParentNode.ParentNode -->
    <Ctry>DE</Ctry>                    <!-- 2 ChildNode of Ctry = 'DE' -->
    <AdrLine>Street</AdrLine>          <!-- 2 ChildNode of AdrLine[1] = 'Street' -->
    <AdrLine>                          <!-- 2 listNode.ParentNode to be removed -->
      <!-- NODETEXT ##REMOVE## -->     <!-- 3 ChildNode of AdrLine[2] -->
    </AdrLine>
  </PstlAdr>
</Cdtr>
Diving down to the bottom of the XML hierarchy (assuming text values) via
listnode.ParentNode.RemoveChild listnode
you delete the text ChildNode of AdrLine[2] (level 3), which is the string "##REMOVE##", but not its container node AdrLine[2] (level 2). Therefore only the dummy text is deleted.
Following the logic of your function RemoveOptionalEmptyTags() as closely as possible, you would have to code instead:
listNode.ParentNode.ParentNode.RemoveChild listNode.ParentNode
This addresses PstlAdr (level 1) and deletes its ChildNode AdrLine[2] (level 2), which automatically includes the dummy string "##REMOVE##" at level 3.
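As an alternative to the recursive walk, the container elements can be selected directly with an XPath query and then removed from their parents. This is only a sketch: the helper name RemoveMarkedElements is hypothetical, and it assumes the marker text is the element's only content, as in the example above.

```vba
' Hypothetical helper: removes every element whose text child is "##REMOVE##".
Public Sub RemoveMarkedElements(ByVal oDom As Object)
    Dim marked As Object, i As Long

    ' MSXML 3.0 defaults to the older XSLPattern syntax, so ask for XPath explicitly.
    oDom.setProperty "SelectionLanguage", "XPath"

    ' Selects the container elements (e.g. AdrLine), not their text nodes.
    Set marked = oDom.SelectNodes("//*[text()='##REMOVE##']")

    ' Walk backwards so that removals cannot shift items still to be visited.
    For i = marked.Length - 1 To 0 Step -1
        marked.Item(i).ParentNode.RemoveChild marked.Item(i)
    Next i
End Sub
```

With the question's variables, a call such as RemoveMarkedElements oDomRd would go just before the document's XML is written to the output file.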
Related links:
XML Parse via VBA
Obtain attribute names from xml using VBA

Export from datagridview to excel using XML in vb.net

I am trying to use XML to export the data from a datagridview to an excel file. Below is the code that I have written
Dim fs As New IO.StreamWriter(FileName, False)
With fs
.WriteLine("<?xml version=""1.0""?>")
.WriteLine("<?mso-application progid=""Excel.Sheet""?>")
.WriteLine("<ss:Workbook xmlns=""urn:schemas-microsoft-com:office:spreadsheet"">")
.WriteLine(" <ss:Styles>")
.WriteLine(" <ss:Style ss:ID=""1"">")
.WriteLine(" <ss:Font ss:Bold=""1""/>")
.WriteLine(" <ss:/Style>")
.WriteLine(" <ss:Worksheet ss:Name=""WCRPaymentLog"">")
.WriteLine(" <ss:Table>")
For x As Integer = 0 To dgReport.Columns.Count - 1
.WriteLine(" <ss:Column ss:Width=""{0}""/>", dgReport.Columns.Item(x).Width)
Next
.WriteLine(" <ss:Row ss:StyleID=""1"">")
For x As Integer = 0 To dgReport.Columns.Count - 1
.WriteLine(" <ss:Cell>")
.WriteLine(String.Format(" <ss:Data ss:Type=""String"">{0}</ss:Data>", dgReport.Columns.Item(x).HeaderText.Trim))
.WriteLine(" </ss:Cell>")
Next
.WriteLine(" </ss:Row>")
For intRow As Integer = 0 To dgReport.RowCount - 2
.WriteLine(String.Format(" <ss:Row ss:Height=""{0}"">", dgReport.Rows(intRow).Height))
For intCol As Integer = 0 To dgReport.Columns.Count - 1
.WriteLine(" <ss:Cell>")
.WriteLine(String.Format(" <ss:Data ss:Type=""String"">{0}</ss:Data>", dgReport.Item(intCol, intRow).Value.ToString.Trim))
.WriteLine(" </ss:Cell>")
Next
.WriteLine(" </ss:Row>")
Next
.WriteLine(" </ss:Table>")
.WriteLine(" </ss:Worksheet>")
.WriteLine("</ss:Workbook>")
.Close()
End With
When I run this code it executes properly, but opening the generated file in Excel gives the error: Strict Parse Error.
The error log is generated as below:
XML PARSE ERROR: Undefined namespace
Error occurs at or below this element stack:
(Stack is empty--error occurs at or below top-level element.)
Can anyone please help me find where I am making the mistake?
I would also like the code to add a new sheet and write the table into that sheet when the file already exists; at the moment the code overwrites the existing file. Can anyone help me with how to do that?
Try it with LINQ to XML:
Imports System.Xml
Imports System.Xml.Linq
Public Class Form1
Const FILENAME As String = "c:\temp\test.xml"
Sub New()
' This call is required by the designer.
InitializeComponent()
' Add any initialization after the InitializeComponent() call.
End Sub
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
' The ss prefix must be bound to the SpreadsheetML namespace, otherwise Excel rejects the file.
Dim xml As String = "<?xml version=""1.0""?><?mso-application progid=""Excel.Sheet""?><ss:Workbook xmlns:ss=""urn:schemas-microsoft-com:office:spreadsheet"">" & _
"<ss:Styles><ss:Style ss:ID=""1""><ss:Font ss:Bold=""1""/></ss:Style></ss:Styles></ss:Workbook>"
Dim doc As XDocument = XDocument.Parse(xml)
Dim workbook As XElement = doc.Root
Dim ssNs As XNamespace = workbook.GetNamespaceOfPrefix("ss")
Dim worksheet = New XElement(ssNs + "Worksheet", New XAttribute(ssNs + "Name", "WCRPaymentLog"))
workbook.Add(worksheet)
Dim table = New XElement(ssNs + "Table")
worksheet.Add(table)
For x As Integer = 0 To dgReport.Columns.Count - 1
Dim column As XElement = New XElement(ssNs + "Column", New XAttribute(ssNs + "Width", dgReport.Columns.Item(x).Width))
table.Add(column)
Next
Dim row As XElement = New XElement(ssNs + "Row", New XAttribute(ssNs + "StyleID", 1))
table.Add(row)
Dim cell As XElement
For x As Integer = 0 To dgReport.Columns.Count - 1
cell = New XElement(ssNs + "Cell", New XElement(ssNs + "Data", New Object() {New XAttribute(ssNs + "Type", "String"), dgReport.Columns.Item(x).HeaderText.Trim}))
row.Add(cell)
Next
For intRow As Integer = 0 To dgReport.RowCount - 2
row = New XElement(ssNs + "Row", New XAttribute(ssNs + "Height", dgReport.Rows(intRow).Height))
table.Add(row)
For intCol As Integer = 0 To dgReport.Columns.Count - 1
cell = New XElement(ssNs + "Cell", New XElement(ssNs + "Data", New Object() {New XAttribute(ssNs + "Type", "String"), dgReport.Item(intCol, intRow).Value.ToString.Trim}))
row.Add(cell)
Next
Next
doc.Save(FILENAME)
End Sub
End Class
It appears that you are trying to create an Excel 2003 SpreadsheetML file.
You indicated that you received this error on opening the file in Excel
XML PARSE ERROR: Undefined namespace Error occurs at or below this element stack: (Stack is empty--error occurs at or below top-level element.)
This leads me to this line:
.WriteLine("<ss:Workbook xmlns=""urn:schemas-microsoft-com:office:spreadsheet"">")
Here you are using the namespace prefix "ss:", but it is not defined. Furthermore, that tag should look like the following (namespace URIs are case-sensitive).
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
For more information, see:
OfficeTalk: Dive into SpreadsheetML (Part 1 of 2)
OfficeTalk: Dive into SpreadsheetML (Part 2 of 2)
Edit:
You may also find using VB's Xml Literals to be an easier way to construct the file. The tutorial video [How Do I:] Create Excel Spreadsheets using LINQ to XML? may be beneficial to review.

Why Object.Selectnodes(XPath) gets 1st node value if former node is empty

Why does Object.SelectNodes(XPath) return the second node's value as the first item when the preceding node (the real first node) is empty?
Example below:
XML:
<?xml version="1.0" encoding="UTF-8"?>
<Document>
<person>
</person>
<person>
<name>Peter</name>
</person>
</Document>
VBA code:
Dim j as Integer
Dim FileToOpen as Variant
FileToOpen = Application.GetOpenFilename(Filefilter:="XML Files (*.xml), *.xml", _
Title:="Choose XML document ", MultiSelect:=False)
Set XDoc = CreateObject("MSXML2.DOMDocument")
XDoc.async = False: XDoc.validateOnParse = False
XDoc.Load FileToOpen
For j = 1 To 2
Set tofields = XDoc.SelectNodes("//Document/person/name")
If Not (tofields.Item(j)) Is Nothing Then
Debug.Print tofields.Item(j).Text
Else
Debug.Print "Nothing"
End If
Next j
Result:
Peter
Nothing
Why is "Nothing" not in first place in the result, and how can I achieve that? If the parent node does not contain the child node, the first iteration is skipped.
Thank you.
Enumeration in XMLDOM differs from XPath
XMLDOM syntax enumerates nodes as zero-based items in a NodeList, i.e. starting from 0, whereas XPath expressions identifying subnodes start from 1 (e.g. the first name item is "//name[1]"). Your code therefore mistakenly treats 2 as the last item index; a list of two items runs from .Item(0) to .Item(1). The number of items found is given by the node list's .Length property, so the last index is .Length - 1 (for two items, 2 minus 1 gives 1, i.e. an index range of 0 to 1).
Furthermore, it is recommended to reference MSXML2 version 6.0; the version-independent ProgID MSXML2.DOMDocument resolves to version 3.0, which is kept only for compatibility reasons.
Further hints, assuming that you want to loop through all persons in your XML document (the node list in the OP delivers one item only, as the name node exists only once):
The expressions xDoc.SelectNodes("//Document/person") or //person search for the defined node set at any hierarchy level within a given node structure. So in your case it is less time-consuming, and unambiguous, to use Set toFields = xDoc.DocumentElement.SelectNodes("person").
The following code example won't show the Nothing case, as the node list contains only the two person nodes (For i = 0 To toFields.Length - 1). To reproduce your original attempt, you could deliberately enumerate three items by changing the loop to For i = 0 To toFields.Length (i.e. 0 to 2).
Additional link
Analyze your XML structure via recursive calls; a working function can be found at Parse XML using XMLDOM.
Code example
Dim xDoc As Object, toFields As Object
Dim myName As String, i As Long
Set xDoc = CreateObject("MSXML2.DOMDocument.6.0") ' recommended version 6.0 (if late bound MSXML2)
xDoc.async = False: xDoc.validateOnParse = False
' ...
Set toFields = xDoc.DocumentElement.SelectNodes("person")
For i = 0 To toFields.Length - 1
If Not toFields.Item(i) Is Nothing Then
If toFields.Item(i).HasChildNodes Then
myName = toFields.Item(i).SelectSingleNode("name").Text
Debug.Print i, IIf(Len(Trim(myName)) = 0, "**Empty name", myName)
Else
Debug.Print i, "**No name node**"
End If
Else
Debug.Print i, "**Nothing**" ' shouldn't be needed from 0 to .Length-1 items :-)
End If
Next i

How to add character in xml element - VBA

Below is the XML node. I am currently adding nodes like this:
<ANNEXURE_A>
<ANNX_A/>
</ANNEXURE_A>
But I want it like..
<ANNEXURE_A>
<ANNX_A id="1">
</ANNX_A>
</ANNEXURE_A>
How do I add id="1" to the element?
Below is the code I am using:
Set nodElement = docXMLDOM.createElement("ANNEXURE_A")
Set docXMLDOM.DocumentElement = nodElement
For i = 1 To 10
Set nodChild = docXMLDOM.createElement("ANNX_A ID=" & """ & i & """)
nodElement.appendChild nodChild
Next i
docXMLDOM.Save path
Set docXMLDOM = Nothing
But it throws the error "This may not contain '=' character". How do I achieve this?
You need to create the attribute independently:
For i = 1 To 10
Set nodChild = docXMLDOM.createElement("ANNX_A")
nodChild.setAttribute "ID", i
nodElement.appendChild nodChild
Next i

Load image to a label in userform using vba

I am working on a userform that loads images onto labels, and I am getting a
Run-time error '75': Path/File access error
with the code below:
dim solArr as variant
solArr = Split("1.jpg,2.jpg,3.jpg",",")
For i = LBound(solArr) To UBound(solArr)
'For rating image
Dim ratingImageName As String
ratingImageName = "D:\somepath" & "\" & solArr(i)
Set imageStar = UserForm1.Frame3.Controls.Add("Forms.Label.1")
imageStar.Top = 40 + (i * 50)
imageStar.Left = 420
imageStar.Height = 20
imageStar.Width = 100
imageStar.Picture = LoadPicture(ratingImageName)
Next
But if I use ratingImageName as "D:\Somepath\1.jpg", no error is received.
Is there a better way to do it?
Hmmm.. solArr = Array("1.jpg","2.jpg","3.jpg")
I was actually picking up the value from a cell as
1.jpg
2.jpg
3.jpg
so each name carried a line feed. The statement Replace(solArr(i), Chr(10), "") solved the problem.
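A file name read from a multi-line cell can carry a stray Chr(10) or Chr(13), which makes the path passed to LoadPicture invalid. The sketch below strips both before the path is built; the helper name CleanFileName and the demo Sub are hypothetical, and the input string only simulates such a cell value.

```vba
' Hypothetical helper: strips line-break characters and surrounding spaces.
Function CleanFileName(ByVal raw As String) As String
    CleanFileName = Trim$(Replace(Replace(raw, vbCr, ""), vbLf, ""))
End Function

Sub DemoCleanFileName()
    Dim solArr As Variant, i As Long

    ' Simulates a cell value whose names are separated by commas plus line feeds.
    solArr = Split("1.jpg," & vbLf & "2.jpg," & vbLf & "3.jpg", ",")

    For i = LBound(solArr) To UBound(solArr)
        ' Each cleaned path can then be passed to LoadPicture.
        Debug.Print "D:\somepath\" & CleanFileName(CStr(solArr(i)))
    Next i
End Sub
```

Cleaning both vbCr and vbLf covers values pasted from Windows text as well as in-cell line breaks entered with Alt+Enter.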
Set imageStar = UserForm1.Frame3.Controls.Add("Forms.Label.1")
I have an array of many in-game items, for example item1, item2, item3... How can I change the index on each item (for example item & i) and add that item's picture to a label on the form?
