Forcing documentMode when using MSHTML.HTMLDocument - excel

Can HTMLDocument be forced to a specific documentMode when using MSHTML in Excel?
So far, all properties and methods related to this seem to be read-only and cannot be set (e.g. documentMode, compatMode, compatible).
While scraping and parsing HTML, I'm getting different behaviours in Excel on other machines in the organization, which is why I want to standardize as much as I can.
Code:
Dim doc As HTMLDocument
Set doc = New HTMLDocument
Debug.Print "compatMode: " & doc.compatMode
Debug.Print "documentMode: " & doc.documentMode
My machine:
compatMode: BackCompat
documentMode: 11
Other machines:
compatMode: BackCompat
documentMode: 5
For the systems I compared with, the OS builds and MS Office (O365) versions were the same as on my machine. I also compared the versions of msxml3.dll and msxml6.dll, which were also the same as on my machine.

Instead of MSXML2.XMLHTTP.6.0, I used an instance of InternetExplorer.
Instead of instantiating various classes from MSHTML, I simply used the generic Object type. Instantiating anything from MSHTML would introduce different documentModes.
Code example:
'Get document using IE
Dim doc As HTMLDocument
...
set doc = ie.Document
'Old - Extracting rows
Dim element As MSHTML.HTMLGenericElement
For Each element In tableRows
'New - Extracting rows
Dim element As Object
For Each element In tableRows
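A minimal late-bound sketch of the approach described above, for illustration only. The procedure name PrintTableRows, the url parameter, and the choice of "tr" elements are assumptions that do not come from the original post; the idea is simply that the document comes from an InternetExplorer instance and every MSHTML object is held in an Object variable.

```vba
Option Explicit

' Hypothetical helper: prints the text of every table row on a page.
' Nothing from MSHTML is instantiated or declared; all objects are late bound.
Public Sub PrintTableRows(ByVal url As String)
    Const READYSTATE_COMPLETE As Long = 4   ' value of the IE readyState enum

    Dim ie As Object
    Dim doc As Object
    Dim tableRows As Object
    Dim element As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.navigate url

    ' Wait until the page has finished loading
    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Set doc = ie.document
    Debug.Print "documentMode: " & doc.documentMode

    Set tableRows = doc.getElementsByTagName("tr")
    For Each element In tableRows
        Debug.Print element.innerText
    Next element

    ie.Quit
End Sub
```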

Related

Find / replace text in embedded word object code stopped working

I have used this code successfully to replace content in an embedded word object from excel. I copied the code for a new excel file but now it doesn't work. It opens the file but doesn't replace although I can see that it IS finding the right text and replacement text. I'm kind of lost as to what is happening.
Dim strFindText As Range
Dim strReplaceText As Range
Dim nSplitItem As Long
Set strFindText = ActiveWorkbook.Worksheets("Utilisation Form").Range("c11:c20")
Set strReplaceText = ActiveWorkbook.Worksheets("Utilisation Form").Range("a11:a20")
nSplitItem = strFindText.Count
Debug.Print strFindText.Item(0)
For Each sh In ThisWorkbook.Sheets("Utilisation Form").Shapes
If sh.Name <> "Object 1" Then sh.Delete
Next
Set urobj = ThisWorkbook.Sheets("Utilisation Form").OLEObjects("Object 1")
Set wordtemp = urobj.Duplicate
wordtemp.Verb Verb:=xlOpen
Set wordtemp2 = wordtemp.Object
For x = 1 To nSplitItem
With wordtemp2.Content.Find
.Forward = True
.Text = strFindText.Item(x)
.ClearFormatting
.Replacement.Text = strReplaceText.Item(x)
.Execute Replace:=wdReplaceAll
End With
Next x
End Sub
Thanks for the support
When the code uses early binding, you need to add the corresponding COM reference (here, the Microsoft Word Object Library) to be able to use its data types and constants such as wdReplaceAll. Otherwise, you need to declare everything from the Word object model as Object and use late binding.
To use early binding on an object, you need to know what its v-table looks like. In Visual Basic, you can do this by adding a reference to a type library that describes the object, its interface (v-table), and all the functions that can be called on the object. Once that is done, you can declare an object as being a certain type, then set and use that object using the v-table. For example, if you wanted to Automate Microsoft Office Excel using early binding, you would add a reference to the Microsoft Excel X.0 Object Library from the Project|References dialog, and then declare your variable as being of the type Excel.Application. From then on, all calls made to your object variable would be early bound.
Read more about that in the Using early binding and late binding in Automation article.
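As a rough illustration of the late-binding route, the sketch below declares the Word constant locally so that it still resolves when no Word reference is set. The procedure name ReplaceInWordDocument and its parameters are hypothetical; wordDoc is assumed to be the Word document object obtained from the embedded OLE object (wordtemp2 in the question).

```vba
Option Explicit

' Word's wdReplaceAll enum value, declared here because no Word reference is set.
' Without a reference (and without Option Explicit) the name would be an empty
' Variant, which Word treats as 0 (wdReplaceNone), so nothing gets replaced.
Private Const wdReplaceAll As Long = 2

' Hypothetical helper: replaces every occurrence of findText in a late-bound
' Word document object.
Public Sub ReplaceInWordDocument(ByVal wordDoc As Object, _
                                 ByVal findText As String, _
                                 ByVal replaceText As String)
    With wordDoc.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Forward = True
        .Text = findText
        .Replacement.Text = replaceText
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Inside the question's loop this could be called as ReplaceInWordDocument wordtemp2, strFindText.Item(x).Value, strReplaceText.Item(x).Value.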

Trying to create an Entity Relationship Database from Excel using Visio Standard

I'm trying to use my company's software, Visio Standard, to create an entity relationship database using Excel. Until now the team has been creating this manually because we don't have access to the Professional version. With a multitude of entities, doing this one by one is extremely tedious. I am trying to import from Excel to Visio without the Pro version.
Theoretically the Excel template would have Entity Name, Entity Structure (P'ship, Corp, DRE, Individual, etc.) and whatever other information is needed to automatically populate into excel.
I have a background in VBA so that could be utilized, I just keep running into roadblocks due to the lack of tabs that the standard version has, including the main Data tab for import.
Is there any way I can import my data from Excel into Visio then run a code to convert it into shapes? What about my own custom template?
We make entity relationship diagrams often, so one template would not work. We have standard shapes and stencils that are used across the board, but the ERD is never the same. I thought I needed a template, but I realized that I can't convert a personal template to a wizard or import an Excel file into the template, so the template becomes quite useless.
@Surrogate My idea is to pull the data from a template in Excel to automatically create the ERD (or something close to it), which would save a lot of time compared with creating those entities through the shapes one by one. The Excel template is basic, with header columns for the name of the entity, the shape to use, and the hierarchy ladder, so VBA comes into play pretty easily. I'm just unsure how to approach it, since I can't import Excel into Visio through the Standard version.
@y4cine I am stuck because I cannot import data from Excel in the Standard version.
@TimWilliams I'm not in a position to push for paying for the Pro version, so regardless of the "fun" I would like to see if I could work around the Pro version to do what the ERD/wizard can do, even if it requires a large VBA macro.
because I cannot import data from excel in the standard version
This example uses early binding.
In VBA you need to set a reference to the Excel Library.
It sets prop values in already existing shapes. The link being the shape ID.
If you need to draw new shapes instead, I'd recommend using a master.
Something like:
Dim oMaster As Master
Dim oStencil As Document
Set oStencil = Application.Documents("myStencil")
Set oMaster = oStencil.Masters("myMaster")
Then, inside the loop:
' define some coordinates for x and y
Set shp = ActivePage.Drop(oMaster, x, y)
The function:
Public Function excelImport(filename As String) As Boolean
Dim xlsWorkbook As Excel.Workbook
Dim xlsSheet As Excel.Worksheet
Dim shp As Visio.Shape
Dim num_rows As Long 'Long, not Integer: row numbers can exceed 32767
Dim row As Long
Dim shpID As String
Set xlsWorkbook = Excel.Workbooks.Open(filename)
Set xlsSheet = xlsWorkbook.Worksheets(1)
num_rows = xlsSheet.Range("A65000").End(xlUp).row
For row = 2 To num_rows
shpID = xlsSheet.Range("P" & row).FormulaR1C1
If Not shpID = "" Then
Set shp = ActivePage.Shapes.ItemFromID(CLng(shpID))
shp.Cells("prop.SoAndSo").Formula = Chr(34) & xlsSheet.Range("A" & row).FormulaR1C1 & Chr(34)
End If
Next row
xlsSheet.Application.Quit
Set xlsSheet = Nothing
Set xlsWorkbook = Nothing
excelImport = True
End Function
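For the case where new shapes have to be drawn, the master-based idea from this answer could be sketched roughly as below. The stencil and master names ("myStencil", "myMaster") are the placeholders used in the answer; the procedure name DropEntityShapes, the entityNames parameter, and the simple top-to-bottom layout are assumptions made for illustration. The stencil is assumed to be open in Visio already.

```vba
Option Explicit

' Hypothetical helper: drops one shape per entity name onto the active page.
' Runs inside Visio VBA; coordinates are in Visio's internal units (inches).
Public Sub DropEntityShapes(ByVal entityNames As Variant)
    Dim oStencil As Visio.Document
    Dim oMaster As Visio.Master
    Dim shp As Visio.Shape
    Dim i As Long
    Dim x As Double
    Dim y As Double

    Set oStencil = Application.Documents("myStencil")
    Set oMaster = oStencil.Masters("myMaster")

    x = 1
    y = 10
    For i = LBound(entityNames) To UBound(entityNames)
        Set shp = ActivePage.Drop(oMaster, x, y)
        shp.Text = CStr(entityNames(i))   ' label the shape with the entity name
        y = y - 1.5                       ' stack the next shape below this one
    Next i
End Sub
```

A call such as DropEntityShapes Array("Entity A", "Entity B") would place two labelled shapes; the names could equally be read from the worksheet in a loop like the one in excelImport.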

Can't retrieve element from page

I'm having trouble trying to retrieve the IUPAC name of a chemical on the following page:
https://echa.europa.eu/brief-profile/-/briefprofile/100.000.685
I'd simply like the printed result to return as Benzene in this example.
The code below pulls elements with className col-sm-8:
Public Sub GetContents()
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
XMLReq.Open "Get", "https://echa.europa.eu/brief-profile/-/briefprofile/100.000.685", False
XMLReq.send
HTMLDoc.body.innerHTML = XMLReq.responseText
Set IUPACName = HTMLDoc.getElementsByClassName("col-sm-8")(0)
Debug.Print IUPACName.innerText
End Sub
This returns:
EC / List name: IUPAC name: benzene Substance names and other identifiers
Inspecting the page there doesn't seem to be any obvious identifier to just return Benzene. Wondering how people would go about this.
Here is an image of the Text I wish to pull.
I can't test on Office versions other than 2019 but, there at least, you can use an attribute selector as follows:
Set IUPACName = HTMLDoc.querySelector("[title*=IUPAC]")
Debug.Print IUPACName.innerText
I was expecting to use:
Debug.Print IUPACName.NextSibling.NodeValue
So, the latter may be what you need on your Office version.
The world of mshtml.dll is quite topsy-turvy at the moment.

VBA: Extracting XML data from XFA form and passing to XML parser

I am trying to extract the XML information from an XFA form using VBA.
The code below extracts the XML data to a separate file, but it requires user interaction (the user is asked to give the XML file a name). I have given up trying to automate this without user interaction because of Adobe's "safe path" requirement (which seems impossible to bypass from VBA automation).
Dim objPDDoc As New AcroPDDoc
Dim objJSO As Object
Dim strSafePath as String
strSafePath = ""
objPDDoc.Open (FileName)
Set objJSO = objPDDoc.GetJSObject
objJSO.xfa.host.exportdata strSafePath, 0
What I would rather do is to parse the XML information directly using MSXML2.DOMDocument60. I was hoping to be able to do something like this:
Dim XMLDoc As New MSXML2.DOMDocument60
If XMLDoc.Load(objJSO.xfa.host.exportdata) = True Then
Call funcParse(XMLDoc)
End if
However, loading XMLDoc with objJSO.xfa.host.exportdata doesn't work, and I cannot seem to figure out which - if any - possibilities there are to pass the XML information using any xfa.host methods/properties.
Any help is welcome - also telling me this is not possible in VBA.
Try something like this:
myXMLstring = "<XML>BLA</XML>"
Dim xmlDoc As MSXML2.DOMDocument60
Set xmlDoc = New MSXML2.DOMDocument60
xmlDoc.LoadXML myXMLstring
For a better example, see this post: https://desmondoshiwambo.wordpress.com/2012/07/03/how-to-load-xml-from-a-local-file-with-msxml2-domdocument-6-0-and-loadxml-using-vba/
Original poster here. After about a year of looking into this on-and-off, I found the solution.
After accessing the JavaScript object through AcroPDDoc.GetJSObject, I can extract the nested XML as a string by using objJSO.xfa.this.saveXML.
This way, I don't have to save the nested XML to a file first (which would require user interaction). Instead, I can immediately extract the nested XML and pass it to the parser.
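Dim objPDDoc As New AcroPDDoc
Dim objJSO As Object
Dim XMLDoc As New MSXML2.DOMDocument60
objPDDoc.Open (Filename)
Set objJSO = objPDDoc.GetJSObject
'saveXML returns the nested XML as a string, so LoadXML (not Load) is used
If XMLDoc.LoadXML(objJSO.xfa.this.saveXML) = True Then
'No parentheses around the argument, so the object itself is passed
ParseXML XMLDoc
End If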
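When LoadXML is used as above, it can help to see why a load fails. The sketch below is one hedged way to do that; the function name LoadXmlString is hypothetical, and late binding through CreateObject is used only so the snippet runs without a reference to Microsoft XML, v6.0.

```vba
Option Explicit

' Hypothetical helper: parses an XML string and returns the document,
' or Nothing (after printing the parser's reason) if the string is not well formed.
Public Function LoadXmlString(ByVal xmlText As String) As Object
    Dim xmlDoc As Object

    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False

    If xmlDoc.LoadXML(xmlText) Then
        Set LoadXmlString = xmlDoc
    Else
        Debug.Print "XML parse error: " & xmlDoc.parseError.reason
        Set LoadXmlString = Nothing
    End If
End Function
```

The string returned by objJSO.xfa.this.saveXML could be passed to this function in place of the direct LoadXML call, with the result checked against Nothing before parsing.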

Use excel vba to Input value on an .asp page

I'm writing VBA code in Excel to generate various reports.
Once it's done I would really like (and my fellow co-workers would look at me like a hero) to input my results directly on our corporate intranet.
So I've started educating myself on how to use VBA to interact with Internet Explorer. I now get the basics, so I can do cool stuff (though useless in this case) like loading a web site. But when I try to input values in a text box on this page on the intranet, I can't get anywhere.
I suspect that the problem is caused by the fact that the address I'm accessing ends with the .asp extension.
Here's the code I'm using.
Beware that I will most probably have other questions following this first one. You might just become my new web-geek best friend ;-)
Sub interaction()
'Variables declaration
Dim IE As New InternetExplorer
Dim IEDoc As HTMLDocument
Dim ZoneMotsClés As HTMLInputElement
'page to be loaded, it's on a corporate intranet
IE.navigate "http://intranet.cima.ca/fr/application/paq/projets/index.asp"
IE.Visible = True
Do ' Wait till the Browser is loaded
Loop Until IE.readyState = READYSTATE_COMPLETE
Set IEDoc = IE.document
'this is the text zone that I'm trying to input value into
Set ZoneMotsClés = IEDoc.getElementById("txtMotCle")
'this is where it crashes. At this point I'm only trying to enter a project number into
'the "txtMotCle" 'text zone
ZoneMotsClés.Value = "Q141763B"
'.....
Set IE = Nothing
Set IEDoc = Nothing
End Sub
So at this point (when I try to input the value in the text box) I get:
Error 91: Object variable or With block variable not set
And here's the HTML of the section of the page I'm trying to write to.
<INPUT onfocus="javascript:document.frmMyForm.TypeRecherche.value='simple';"
style="FONT-SIZE: 9px; FONT-FAMILY: verdana" maxLength=250 size=60 name=txtMotCle>
This time I tried the suggestions of the two contributors (thanks Jeeped and Tim Williams) but I'm still getting the same error 91.
Now I tried this modification (thanks SeardAndResQ):
'this is the text zone that I'm trying to input into
'Set ZoneMotsClés = IEDoc.all("txtMotCle")
ID = "txtMotCle"
Set ZoneMotsClés = IEDoc.getElementById(ID)
'this is where it crashes. At this point I'm only trying to enter a project number into the "txtMotCle"
'text zone
ZoneMotsClés.Value = "Q141763B"
Same result. I'm not sure I did it the way @searchAndResQ meant.
