How to access XML response using Excel VBA - excel

I have an excel spreadsheet where I have 25,000+ records with Lat/Lon coordinates and other data. I am trying to use an Excel VBA script to look up an associated County name based on the Lat/Lon using the following US Census web service link (an example coordinate included).
https://geo.fcc.gov/api/census/block/find?latitude=40.000&longitude=-90.000&format=xml
This returns the following XML response:
<Response status="OK" executionTime="0">
<Block FIPS="170179601002012" bbox="-90.013605,39.996144,-89.994837,40.010663"/>
<County FIPS="17017" name="Cass"/>
<State FIPS="17" code="IL" name="Illinois"/>
</Response>
The problem I have is that I need to access the "name" value (i.e.,'Cass', in this case) contained in County node, and this value will be copied into the Excel spreadsheet under the County column. Is there a way to access this value? The XML response is not in the standard form I would expect (I'm new to XML), <County>Cass</County> so I'm unsure how I would access the value I need from this returned response.
The XML connection and response parts of the script seem to be working fine; I just need to know how to get the values from the response for the node in question.
Here is what I have so far. Any help would be greatly appreciated. If you need the full code, let me know.
standard XML connection stuff here...
XmlResponse = oXMLHTTP.responseText
'Process the XML to get County name
strXML = XmlResponse
Set XDoc = New MSXML2.DOMDocument60
If Not XDoc.LoadXML(XmlResponse) Then
Err.Raise XDoc.parseError.ErrorCode, , XDoc.parseError.reason
End If
Set xNode = XDoc.SelectSingleNode("/Response/County")
MsgBox xNode.Text
'Insert County name into Excel
Cells(i + 2, 14).Value = xNode.Text
I am assuming that the xNode.Text part is where I need help in selecting the right part from the response (?).
Many thanks!

An alternative via WorksheetFunction.FilterXML()
If you have Excel 2013 or later, you can run the following:
Sub ExampleCall()
Dim myXML As String, myXPath As String
myXML = WorksheetFunction.WebService("https://geo.fcc.gov/api/census/block/find?latitude=40.000&longitude=-90.000&format=xml")
myXPath = "//County/@name"
Debug.Print WorksheetFunction.FilterXML(myXML, myXPath) ' ~> Cass
End Sub
Further hints to FilterXML() and its XPath argument
Starting with a double slash //, the XPath string "//County/@name" searches for the <County> node at any hierarchy level and returns its name attribute, which has to be identified by a leading @. The FilterXML() function returns the attribute's text content.
See FilterXML() function and WebService() function.
Of course it's possible to use both functions directly in worksheet formulae.

In searching around some more today I found a solution to my original question.
For those interested, you can access the County attribute 'name' in the returned xml response and write it out by replacing the above portion of code with the following:
Original:
Set xNode = XDoc.SelectSingleNode("/Response/County")
MsgBox xNode.Text
Updated:
Set xNode = XDoc.SelectSingleNode("//Response/County/@name")
MsgBox xNode.Text
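As a minimal sketch of the same idea in a self-contained form, the function below parses an XML string and returns the County name attribute, or an empty string if the node is missing. The names CountyNameFromXml and DemoCountyName are hypothetical, and late binding via CreateObject is an assumption made so that no library reference is needed.

```vba
Option Explicit

' Hypothetical helper: returns the "name" attribute of /Response/County,
' or an empty string when the attribute is not present.
Public Function CountyNameFromXml(ByVal xmlText As String) As String
    Dim doc As Object       ' MSXML2.DOMDocument60 (late bound)
    Dim nameAttr As Object  ' attribute node selected by @name

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.LoadXML(xmlText) Then
        Err.Raise vbObjectError + 1, , doc.parseError.reason
    End If

    ' @name selects the attribute node; its Text is the attribute value.
    Set nameAttr = doc.SelectSingleNode("/Response/County/@name")
    If Not nameAttr Is Nothing Then CountyNameFromXml = nameAttr.Text
End Function

Public Sub DemoCountyName()
    Dim sampleXml As String
    ' Trimmed copy of the response shown in the question.
    sampleXml = "<Response status=""OK"" executionTime=""0"">" & _
                "<County FIPS=""17017"" name=""Cass""/>" & _
                "</Response>"
    Debug.Print CountyNameFromXml(sampleXml)
End Sub
```

Checking for Nothing before reading Text avoids a run-time error on rows where the service returns no County element.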

Related

Trouble with Excel VBA parsing of XML file for ISBN title lookup: Run-time error 91 Object variable or With block variable not set

Given a Column A in Excel with multiple cells containing ISBN (book id) values, I want my VBA macro to loop through each of them and, for each one, parse an online XML file that is unique to that ISBN and put the corresponding book title in Column B.
For example, if A1 contains 1931498717, I should parse this XML and grab the title "Don't think of an elephant! : know your values and frame the debate : the essential guide for progressives" and put that in B1.
Here is a sample of the XML file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<classify xmlns="http://classify.oclc.org">
<response code="4"/>
<!--Classify is a product of OCLC Online Computer Library Center: http://classify.oclc.org-->
<workCount>2</workCount>
<start>0</start>
<maxRecs>25</maxRecs>
<orderBy>thold desc</orderBy>
<input type="isbn">1931498717</input>
<works>
<work author="Lakoff, George" editions="28" format="Book" holdings="1088" hyr="2014" itemtype="itemtype-book" lyr="2004" owi="796415685" schemes="DDC LCC" title="Don't think of an elephant! : know your values and frame the debate : the essential guide for progressives" wi="796415685"/>
<work author="Lakoff, George" editions="1" format="Musical score" holdings="1" hyr="2004" itemtype="itemtype-msscr" lyr="2004" owi="4735145535" schemes="DDC" title="Don't think of an elephant! : know your values and frame the debate : the essential guide for progressives" wi="4735145535"/>
</works>
</classify>
Notice there are two "work" elements. In this case, I am happy to just grab the title attribute from the first one. But even better would be to make sure it's the title of a book (format="Book") and not some other format.
Here is my macro code:
Sub ISBN()
Do
Dim xmlDoc As DOMDocument60
Set xmlDoc = New DOMDocument60
xmlDoc.async = False
xmlDoc.validateOnParse = False
r = CStr(ActiveCell.Value)
xmlDoc.Load ("http://classify.oclc.org/classify2/Classify?isbn=" + r + "&summary=true")
ActiveCell.Offset(0, 2).Value = xmlDoc.SelectSingleNode("/classify/works/work[1]").attributes.getNamedItem("title").text
ActiveCell.Offset(1, 0).Select
Loop Until IsEmpty(ActiveCell.Value)
End Sub
I get this error, "Run-time error 91 Object variable or With block variable not set," on the line that references "xmlDoc.SelectSingleNode("/classify/works/work[1]").attributes.getNamedItem("title").text"
I've tried numerous variations to try to isolate the title text but cannot get anything other than this error.
My Excel file is Microsoft Excel for Microsoft 365, on my laptop.
Help would be greatly appreciated. I am inexperienced in VBA programming and XML parsing and have been googling/reading on this for more time than I care to admit without making any progress. (There was a previous StackOverflow question on parsing ISBN XML files, but it was for a provider that no longer offers the XML files for free. That code inspired my code, but something was lost in the translation.)
Thanks tons for any help you can offer.
Your XML has a "default" namespace which applies to its contents, so when using XPath you need to create a dummy alias for that namespace and use it in your query.
See https://stackoverflow.com/a/72997440/478884
E.g.:
Dim xmlDoc As DOMDocument60, col As Object, el As Object
Set xmlDoc = New DOMDocument60
xmlDoc.async = False
xmlDoc.validateOnParse = False
'set default namespace and alias "xx"
xmlDoc.SetProperty "SelectionNamespaces", _
"xmlns:xx='http://classify.oclc.org'"
xmlDoc.Load "http://classify.oclc.org/classify2/Classify?isbn=1931498717&summary=false"
Set col = xmlDoc.SelectNodes("/xx:classify/xx:works/xx:work")
'check each `work` for format and title
For Each el In col
Debug.Print "*****************"
Debug.Print el.Attributes.getNamedItem("format").Text
Debug.Print el.Attributes.getNamedItem("title").Text
Next el
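To pick only the work whose format is "Book", as the question asks, an attribute predicate can be combined with the namespace alias. The following is a sketch under assumptions: the function name BookTitleFromClassifyXml is hypothetical, the XML is passed in as a string rather than loaded from the URL, and the sample in the demo is a shortened stand-in for the real response.

```vba
Option Explicit

' Hypothetical helper: returns the title of the first <work> whose
' format attribute is "Book", or an empty string if there is none.
Public Function BookTitleFromClassifyXml(ByVal xmlText As String) As String
    Dim doc As Object        ' MSXML2.DOMDocument60 (late bound)
    Dim titleAttr As Object

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    ' Alias "xx" for the document's default namespace.
    doc.setProperty "SelectionNamespaces", "xmlns:xx='http://classify.oclc.org'"
    If Not doc.LoadXML(xmlText) Then Exit Function

    Set titleAttr = doc.SelectSingleNode( _
        "/xx:classify/xx:works/xx:work[@format='Book']/@title")
    ' Guarding against Nothing avoids run-time error 91.
    If Not titleAttr Is Nothing Then BookTitleFromClassifyXml = titleAttr.Text
End Function

Public Sub DemoBookTitle()
    Dim sampleXml As String
    ' Shortened stand-in for the Classify response (assumed shape).
    sampleXml = "<classify xmlns=""http://classify.oclc.org""><works>" & _
                "<work format=""Musical score"" title=""Score title""/>" & _
                "<work format=""Book"" title=""Book title""/>" & _
                "</works></classify>"
    Debug.Print BookTitleFromClassifyXml(sampleXml)
End Sub
```

Note that the attributes themselves are not in the namespace, so @format and @title take no prefix.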

Change built-in Document properties without opening

I am attempting to run the below line of code in a sub. The purpose of the sub overall is to automatically create agendas for recurring meetings, and notify the relevant people.
'Values for example;
MtgDate = CDate("11/06/2020")
Agenda ="Z:\Business Manual\10000 Management\11000 Management\11000 Communications\Operations Meetings\11335 - OPS CCAR Performance Review Agenda 11.06.20.docx" 'NB it's a string
'and the problematic line:
Word.Application.Documents(Agenda).BuiltinDocumentProperties("Publish Date") = MtgDate
Two questions:
1) Can I assign a document property just like that without opening the document? (bear in mind this vba is running from an excel sheet where the data is stored)
2) Will word.application.documents accept the document name as a string, or does it have to be some other sort of object or something? I don't really understand Word VBA.
Attempts so far have only resulted in
runtime error 427 "remote server machine does not exist or is
unavailable"
or something about a bad file name.
Although Publish Date can be found under Insert > Quick Parts > Document Property it isn't actually a document property. It is a "built-in" CustomXML part, a node of CoverPageProperties, and can be addressed in VBA using the CustomXMLParts collection.
The CustomXML part is only added to the document once the mapped content control is inserted.
Below is the code I use.
As already pointed out for document properties the document must be open.
Public Sub WriteCoverPageProp(ByVal strNodeName As String, ByVal strValue As String, _
Optional ByRef docTarget As Document = Nothing)
'* Nodes: Abstract, CompanyAddress, CompanyEmail, CompanyFax, CompanyPhone, PublishDate
'* NOTE: If writing PublishDate set the content control to store just the date (default is date and time).
'* The date is stored in the xml as YYYY-MM-DD so must be written in this format.
'* The content control setting will determine how the date is displayed.
Dim cxpTarget As CustomXMLPart
Dim cxnTarget As CustomXMLNode
Dim strNamespace As String
If docTarget Is Nothing Then Set docTarget = ActiveDocument
strNodeName = "/ns0:CoverPageProperties[1]/ns0:" & strNodeName
strNamespace = "http://schemas.microsoft.com/office/2006/coverPageProps"
Set cxpTarget = docTarget.CustomXMLParts.SelectByNamespace(strNamespace).item(1)
Set cxnTarget = cxpTarget.SelectSingleNode(strNodeName)
cxnTarget.Text = strValue
Set cxnTarget = Nothing
Set cxpTarget = Nothing
End Sub
You cannot modify a document without opening it. In any event, "Publish Date" is not a Built-in Document Property; if it exists, it's a custom one.
Contrary to what you've been told, not all BuiltinDocumentProperties are read-only; some, like wdPropertyAuthor ("Author"), are read-write.
There are three main ways you could modify a Word document or a "traditional" property (the ones you can access via .BuiltInDocumentProperties and .CustomDocumentProperties):
a. via the Object Model (as you are currently trying to do)
b. for a .docx, by unzipping the .docx, modifying the relevant XML part, and re-zipping the .docx
c. For "traditional" properties, i.e. the things that you can access via .BuiltInDocumentProperties and .CustomDocumentProperties, in theory you can use a Microsoft .dll called dsofile.dll. But it hasn't been supported for a long time, won't work on Mac Word and the Microsoft download won't work on 64-bit Word. You'd also have to distribute and support it.
But in any case, "Publish Date" is not a traditional built-in property. It's probably, but not necessarily, a newer type of property called a "Cover Page Property". Those properties are in fact pretty much as "built-in" as the traditional properties but cannot be accessed via .BuiltInDocumentProperties.
To modify Cover Page properties, you can either use the object model or method (b) to access the Custom XML Part in which their data is stored. Method (c) is no help there.
Not sure where your error 427 is coming from, but I would guess from what you say that you are trying to see if you can modify the property in a single line, using the fullname of the document in an attempt to get Word to open it. No, you can't do that - you have to use GetObject/CreateObject/New to make a reference to an instance of Word (let's call it "wapp"), then (say)
Dim wdoc As Word.Document ' or As Object
Set wdoc = wapp.Documents.Open("the fullname of the document")
Then you can access its properties, e.g. for the read/write Title property you can do
wdoc.BuiltInDocumentProperties("Title") = "your new title"
wdoc.Save
If Publish Date is the Cover Page Property, once you have a reference to the Word Application and have ensured the document is open you can use code along the following lines:
Sub modPublishDate(theDoc As Word.Document, theDate As String)
' You need to format theDate - by default, Word expects an xsd:dateTime,
' e.g. 2020-06-11T00:00:00 if you only care about the date.
Const CPPUri As String = "http://schemas.microsoft.com/office/2006/coverPageProps"
Dim cxn As Office.CustomXMLNode
Dim cxps As Office.CustomXMLParts
Dim nsprefix As String
Set cxps = theDoc.CustomXMLParts.SelectByNamespace(CPPUri)
If cxps.Count > 0 Then
With cxps(1)
nsprefix = .NamespaceManager.LookupPrefix(CPPUri)
Set cxn = .SelectSingleNode(nsprefix & ":CoverPageProperties[1]/" & nsprefix & ":PublishDate[1]")
If Not (cxn Is Nothing) Then
cxn.Text = theDate
Set cxn = Nothing
End If
End With
End If
Set cxps = Nothing
End Sub
As for "Will word.application.documents accept the document name as a string", the answer is "yes", but Word has to have opened the document already, as mentioned above. Word can also accept an integer index into the .Documents collection and may accept just the name part of the FullName string.
Finally, if you do end up using a "traditional Custom Document Property", even after you have set the property and saved the document (approximately as above) you may find that the new property value has not actually saved! If so, that's down to an old error in Word where it won't save unless you have actually visited the Custom Document Property Dialog or have modified the document content in some way, e.g. adding a space at the end.
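Putting the open, modify, save steps from this answer together for code running in Excel, a minimal late-bound sketch might look like the following. The name SetWordDocTitle is hypothetical, it creates a new Word instance rather than reusing a running one, and setting Saved = False before saving is an assumption made in light of the note above about property changes not always being written.

```vba
Option Explicit

' Hypothetical helper: opens the document in Word, sets the read/write
' "Title" built-in property, saves, and closes Word again.
Public Sub SetWordDocTitle(ByVal docFullName As String, ByVal newTitle As String)
    Dim wapp As Object   ' Word.Application (late bound)
    Dim wdoc As Object   ' Word.Document

    Set wapp = CreateObject("Word.Application")
    On Error GoTo CleanUp

    ' The document must be open before its properties can be changed.
    Set wdoc = wapp.Documents.Open(docFullName)
    wdoc.BuiltInDocumentProperties("Title").Value = newTitle
    wdoc.Saved = False   ' assumption: mark the document dirty so Save writes it
    wdoc.Save

CleanUp:
    ' Close without saving again, then shut down the Word instance.
    If Not wdoc Is Nothing Then wdoc.Close False
    wapp.Quit
    If Err.Number <> 0 Then MsgBox "Could not update document: " & Err.Description
End Sub
```

It would be called with the full path, for example SetWordDocTitle Agenda, "your new title", where Agenda is the string from the question.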

Create Revit Sheets from Excel with multiple project parameters

Hi I have a macro that creates multiple sheets and has Name, Number, Sheet Category. The last being my own project parameter.
I can successfully create the sheets with name and number but I am having difficulty adding the value from CSV file to the project parameter "SD_Sheet Category". Below are some code grabs to help explain.
Public Module mSheetCreator
Public Structure structSheet
Public sheetNum As String
Public sheetName As String
Public sortCategory As String
End Structure
Then I have a function that reads the CSV file and does the following:
Try
currentRow = MyReader.ReadFields()
'create temp sheet
Dim curSheet As New structSheet
curSheet.sheetNum = currentRow(0)
cursheet.sheetName = currentRow(1)
curSheet.sortCategory = currentRow(4)
'add sheet to list
sheetList.Add(curSheet)
Then I have a transaction that does the following:
For Each curSheet As structSheet In sheetList
Try
If sheetType = "Placeholder Sheet" Then
m_vs = ViewSheet.CreatePlaceholder(curDoc)
Else
m_vs = ViewSheet.Create(curDoc, tblock.Id)
m_vs.Parameter("SD_Sheet Category").Set(CStr(curSheet.sortCategory))
End If
m_vs.Name = curSheet.sheetName
m_vs.SheetNumber = curSheet.sheetNum
The problem is this code:
m_vs.Parameter("SD_Sheet Category").Set(CStr(curSheet.sortCategory))
I am getting a warning that says it's an "implicit conversion from 'String' to 'Autodesk.Revit.DB.BuiltInParameter'"
once I build solution
When I run the code in Revit it produces an error saying
"Conversion from string 'SD_Sheet Category" to type 'Integer' is not valid"
It creates the sheets but disregards all the info in the CSV file. I know the rest of the code works as I have removed this particular line of code so I know it isn't the problem
Any suggestions??
I believe that as of a particular version of the Revit API you can no longer use .Parameter("name"), because there might be two parameters with the same name.
So you need to do something more like this (First() is a LINQ extension, so System.Linq must be imported, and the name should match the project parameter exactly):
Dim cat As Parameter
cat = m_vs.GetParameters("SD_Sheet Category").First()
cat.Set(CStr(curSheet.sortCategory))
Good luck!

How to retrieve specific information from XML type document

I'm working on a VBA macro in Excel to gather information from CNC program code.
So far, I have gotten Material type, thickness, X & Y sizes, and qty used.
I'm trying to get the 'cutting length' now, so I can use it in costing calculations.
Here is the XML code segment :
<Info num="6" name="Tools">
<MC machine="psys_ETN_5">
<Tool name="TN901" length="16262.96209" time="53.72817301" cutoutArea="8138.657052"/>
</MC>
</Info>
There are lots of 'Info' lines.
There may be more than one 'Tool' line, but I'm only after anything from the line with 'TN901'.
The data I'm trying to capture is the value of 'length="######.##"'
I've captured everything else I need from code like this :
<Material>316</Material>
<SheetX>2000</SheetX>
<SheetY>1000</SheetY>
<Thickness>3</Thickness>
</Material>
using code like this:
For Each nodemat In XMLDataDrg.SelectNodes("//Material")
Matl = nodemat.Text
Worksheets("Sheet4").Range("H" & RowA).Value = Matl
Next
For Each nodesht In XMLDataDrg.SelectNodes("//Thickness")
Thk = nodesht.Text
Worksheets("Sheet4").Range("I" & RowA).Value = Thk
Next
But that type of code does not get the cutting length.
Any help please ? :)
Thanks
Simon
Thickness is saved as XML element in your example.
The length is stored as an XML attribute.
(see https://www.xmlfiles.com/xml/xml-attributes/)
To read an XML attribute please have a look at:
Read XML Attribute VBA
Based on the code presented there you should be able to solve your issue with:
'Include a reference to Microsoft XML v3
Dim XMLDataDrg As DOMDocument30
Set XMLDataDrg = New DOMDocument30
XMLDataDrg.Load ("C:\...\sample.xml")
'...
Dim id As String
id = XMLDataDrg.SelectSingleNode("//Info/MC/Tool").Attributes.getNamedItem("length").Text
You can use an xpath to restrict to Tool elements with attribute name having value TN901 then loop all the attributes and write out. I am reading your XML from a file on desktop.
Option Explicit
Public Sub test()
Dim xmlDoc As Object
Set xmlDoc = CreateObject("MSXML2.DOMDocument")
With xmlDoc
.validateOnParse = True
.setProperty "SelectionLanguage", "XPath"
.async = False
If Not .Load("C:\Users\User\Desktop\Test.xml") Then
Err.Raise .parseError.ErrorCode, , .parseError.reason
End If
End With
Dim elem As Object, attrib As Object
For Each elem In xmlDoc.SelectNodes("//Tool[@name='TN901']")
For Each attrib In elem.Attributes
Debug.Print attrib.nodeName, attrib.Text
Next
Next
End Sub
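If only the cutting length is needed, the attribute can be selected directly and converted to a number. This is a sketch, not the answerer's code: ToolLength and DemoToolLength are hypothetical names, the XML is passed as a string, and Val is used on the assumption that the file always uses a dot as the decimal separator regardless of regional settings.

```vba
Option Explicit

' Hypothetical helper: returns the "length" attribute of the <Tool>
' element with the given name, or 0 if no such element exists.
Public Function ToolLength(ByVal xmlText As String, ByVal toolName As String) As Double
    Dim doc As Object       ' MSXML2.DOMDocument60 (late bound)
    Dim lenAttr As Object

    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.LoadXML(xmlText) Then Exit Function

    ' The predicate restricts to the named tool; @length selects the attribute.
    Set lenAttr = doc.SelectSingleNode("//Tool[@name='" & toolName & "']/@length")
    ' Val always treats "." as the decimal separator.
    If Not lenAttr Is Nothing Then ToolLength = Val(lenAttr.Text)
End Function

Public Sub DemoToolLength()
    Dim sampleXml As String
    ' The segment shown in the question.
    sampleXml = "<Info num=""6"" name=""Tools""><MC machine=""psys_ETN_5"">" & _
                "<Tool name=""TN901"" length=""16262.96209"" " & _
                "time=""53.72817301"" cutoutArea=""8138.657052""/>" & _
                "</MC></Info>"
    Debug.Print ToolLength(sampleXml, "TN901")
End Sub
```

The returned value can then be written to the worksheet in the same way as the material and thickness values.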

VBA getelementsbytagname issue

Good morning,
I'm attempting to extract HTML table information and collate results on an Excel spreadsheet.
I'm using the getelementsbytagname("table")(0) function to extract the HTML table info, which has worked well. Can someone please tell me what is the significance of the (0) after the table?
Also, I have an instance where an opened webpage does not have any table information to process (I don't know this until the page is opened), this leads to an error in my code as I try to initialize my data array to the table dimensions. Is there a way of extracting a result from getelementsbytagname("table")(0), I've tried:-
If (iDom.getelementsbytagname("table")(0) = 0) Then
but this returns a run time error.
Many thanks in advance for your help.
First add references to Microsoft Internet Controls (SHDocVw) and to the Microsoft HTML Object Library.
Then the Object Browser is your friend.
So getElementsByTagName returns an IHTMLElementCollection, which has a length property. When elements with the given tag name are found on the page, length is greater than zero. HTH
Dim tables As IHTMLElementCollection
Set tables = doc.getElementsByTagName("table")
If tables.Length > 0 Then
Dim table As HTMLTable
Set table = tables.item(0)
' Because item is the default property of IHTMLElementCollection we can simplify
Set table = tables(0) ' this is the same as tables.item(0)
End If
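The same length check also works without the library references, using late binding. The sketch below is an assumption-laden illustration: CountTables and DemoCountTables are hypothetical names, and the HTML is parsed from a string through an "htmlfile" object rather than taken from a live browser page.

```vba
Option Explicit

' Hypothetical helper: returns how many <table> elements the HTML contains.
Public Function CountTables(ByVal htmlText As String) As Long
    Dim doc As Object   ' HTML document (late bound)

    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = htmlText
    ' Length is 0 when the page has no tables, so it is safe to test
    ' before indexing the collection with (0).
    CountTables = doc.getElementsByTagName("table").Length
End Function

Public Sub DemoCountTables()
    Dim html As String
    html = "<p>No data today</p>"
    If CountTables(html) = 0 Then
        Debug.Print "No table on this page, skipping"
    Else
        Debug.Print "Table found"
    End If
End Sub
```

Testing the count first means the data array is only dimensioned when there is a table to read.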
In VBA the appended (0) selects the first item of a zero-based array or collection. Here is a short array example (assuming Option Base 0):
vArr = Array("element 1", "element 2", "element 3")
Debug.Print vArr(1)
The above code should return element 2, the second element of a zero-based array.
So getelementsbytagname("table")(0) refers to the first table element on the page. If no table is found, the collection is empty, so there is no first item and trying to use it yields an error.
Instead you should test whether any table was found before trying to access the first one, like so:
If iDom.getElementsByTagName("table").Length = 0 Then

Resources