Read straight web content with Excel VBA

There are many articles on this site about reading tags and tables from websites with Excel VBA, but I am stuck here.
This website gives me business locations after entering a Zip code.
("Where is the closest location relative to my Zip Code")
I managed to navigate to the site, enter the Zip code and click Submit:
Dim Browser As SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Set Browser = New SHDocVw.InternetExplorer ' create a browser
Browser.Visible = True ' make it visible
Application.StatusBar = ".... opening page"
Browser.navigate "https://www.thewebsite.com" ' navigate to page
WaitForBrowser Browser, 1 ' wait for completion or timeout
Application.StatusBar = "gaining control over DOM object"
Set HTMLDoc = Browser.document ' load the DOM object
WaitForBrowser Browser, 1
HTMLDoc.getElementById("ZipCode").Value = "28278"
HTMLDoc.getElementById("localTeamZipSubmit").Click
The site opens and the relevant content looks like this:
<div>
<div class="columns">
<div class="column boldText paddingFive" style="padding-left: 20px; width: 70px;">
Location:
</div>
<div class="column paddingTopFive">CHARLOTTE</div>
</div>
<div class="columns">
<div class="column boldText paddingFive" style="padding-left: 20px; width: 120px;">
Location Number:
</div>
<div class="column paddingTopFive">102340</div>
</div>
<div class="columns">
<div class="column boldText paddingTopFive paddingLeftTwenty" style="vertical-align: top;">
Address:
</div>
<div class="column paddingTopFive paddingLeftTwenty">
<div>8848 Main St.</div>
<div>Suite F</div>
<div></div>
<div>Charlotte, NC 27218</div>
</div>
</div>
<div class="columns">
<div class="column boldText paddingFive" style="padding-left: 20px; width: 70px;">
Phone:
</div>
<div class="column paddingTopFive">(704) 911-4440</div>
</div>
<div class="columns">
<div class="column boldText paddingFive" style="padding-left: 20px; width: 70px;">
Fax:
</div>
<div class="column paddingTopFive">(704) 911-4441</div>
</div>
</div>
As you can see, this section has no table, no named tags, and classes that are used over and over.
I have not been able to read this information yet. I would be happy to get the whole blob into a String and parse it, with something like this (made-up) call:
Text = HTMLDoc.getEverything()
Thanks a lot for your help!
In the meantime I found another code snippet and modified it, but I get stuck at the same point.
Posting and submitting work, but how do I read the response?
Private Sub PostalCodes()
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
On Error GoTo errHandler
ie.Visible = 1
With ie
.navigate "https://www.pattersondental.com/ContactUs/MyLocalTeam"
Do While .busy: DoEvents: Loop
Do While .ReadyState <> 4: DoEvents: Loop
With .document.Forms("GetBranchFromZipForm")
.ZipCode.Value = "28273"
.submit
End With
' Do While Not CBool(InStrB(1, .document.URL, _
' "cp_search_response-e.asp"))
' DoEvents
' Loop
Do While .busy: DoEvents: Loop
Do While .ReadyState <> 4: DoEvents: Loop
' MsgBox .document.all.tags("Colums").Item(1).Rows(1).Cells(1).innerText
MsgBox .document.all.tags("Colums").innerText
' MsgBox .document
End With
Exit Sub
errHandler:
MsgBox Err.Description
End Sub
I guess I have to search now for "how to dissect an HTML document"...
Add-on:
It seems that while ie is a valid object (in the Watch window), ie.Document is empty. Why can this be? The website is still there with the new data.
I even tried another code snippet that looks for open pages in IE. It finds the site (with the correct data), but the document is still empty, so getElementBy... of course finds nothing.
I am about to start drinking...

I can't believe it.
After 3 days of poking I found this:
With ActiveSheet.QueryTables.Add( _
Connection:="URL;https://www.pattersondental.com/ContactUs/MyLocalTeam", _
Destination:=Range("A1"))
.PostText = "ZipCode=70032"
.RefreshStyle = xlOverwriteCells
.SaveData = True
.Refresh
End With
I don't pretend to understand why it works, but it does.
John, I will still check out what you suggested. Thanks.
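For readers who want the "whole blob" approach the question asks about, here is a minimal sketch. It assumes the document object is actually populated (the sticking point above) and that the page runs in an IE document mode that supports getElementsByClassName. ReadLocationBlock is a hypothetical helper name, not something from the original code.

```vba
' Hypothetical helper: returns the text of every <div class="columns"> row,
' one row per line (label followed by its value or values).
Function ReadLocationBlock(ByVal doc As Object) As String
    Dim rows As Object
    Dim i As Long
    Dim rowText As String
    Dim result As String

    Set rows = doc.getElementsByClassName("columns")
    For i = 0 To rows.Length - 1
        ' innerText flattens the label div and the value div(s) to plain text
        rowText = rows.Item(i).innerText
        ' Collapse line breaks inside a row so each row ends up on one line
        rowText = Replace(rowText, vbCr, " ")
        rowText = Replace(rowText, vbLf, " ")
        result = result & Trim$(rowText) & vbCrLf
    Next i

    ReadLocationBlock = result
End Function
```

It could be called as Debug.Print ReadLocationBlock(Browser.document) once the results page has loaded. For the entire page rather than just these rows, HTMLDoc.body.innerText returns the visible text and HTMLDoc.documentElement.outerHTML returns the markup.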

Related

HTML extracted from website in Excel comes with missing data

I am trying to extract some data from a web page using the following VBA code in Excel:
Sub Extractor_data()
Dim http As New XMLHTTP60, html As New HTMLDocument
With http
.Open "GET", "https://www.pantone.com/connect/11-0105-TPX", False
.send
html.body.innerHTML = .responseText
End With
Range("A1") = html.querySelectorAll("#maincontent div div div div article div.bg")(0).innerHTML
End Sub
But what I get from it has missing data.
I get:
<div class="square" :style="squareStyle"></div>
<div class="code" v-if="color">
<p v-t="i18n.pantoneTiTle"></p>
<p v-text="color.code"></p>
<p v-if="color.name && color.name !== color.code" v-text="color.name"></p>
</div>
I was supposed to get:
<div class="square" style="background-color: rgb(239, 239, 223);"></div>
<div class="code">
<p>PANTONE</p>
<p>11-0104 TPX</p>
<p>Vanilla Ice</p>
</div>
I have tried to understand what's going on and tried some slightly different approaches, all to no avail.
Does anyone have an explanation and, hopefully, a fix?
Thank you.
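One likely explanation, offered as an assumption rather than a confirmed diagnosis: attributes such as v-if, v-text and :style are Vue.js template directives, which a browser fills in by running the page's JavaScript. XMLHTTP60 only downloads the markup and never runs scripts, so the template arrives unrendered. The sketch below uses a hypothetical helper, LooksUnrendered, to flag that situation before parsing.

```vba
' Hypothetical helper: True when the markup still contains Vue-style
' template attributes, i.e. it has probably not been rendered by a browser.
Function LooksUnrendered(ByVal markup As String) As Boolean
    Dim markers As Variant
    Dim m As Variant

    markers = Array(" v-if=", " v-text=", " v-for=", " :style=")
    For Each m In markers
        If InStr(1, markup, CStr(m), vbTextCompare) > 0 Then
            LooksUnrendered = True
            Exit Function
        End If
    Next m
End Function
```

Inside the With http block it could be used as If LooksUnrendered(.responseText) Then Debug.Print "template not rendered". If it returns True, the values have to come from a source that runs the page's scripts, such as an automated browser.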

How to add a tick to a checkbox through excel vba ie automation?

The checkbox I am trying to add a tick to is part of an online table and doesn't seem to be coded as a checkbox.
I have tried the following to add a tick, and none of them work:
IE.Document.getElementByID("gridcolumn-1658-titleEl").Click
IE.Document.getElementByID("gridcolumn-1658-textEl").Click
IE.Document.getElementsByClassName("x-column-header-inner x-column-header-over")(0).Click
IE.Document.getElementsByClassName("x-column-header-inner")(0).Click
When I inspect the element, I get to the following, but none of this looks anything like a checkbox (the third one is the actual reference):
<div class="x-column-header x-column-header-checkbox x-column-header-align-left x-box-item x-column-header-default x-unselectable x-column-header-first" style="border-width: 1px; width: 24px; right: auto; left: 0px; top: 0px; margin: 0px; height: 24px;" id="gridcolumn-1658"><div id="gridcolumn-1658-titleEl" class="x-column-header-inner" style="padding-top: 6px; padding-bottom: 6px;"><span id="gridcolumn-1658-textEl" class="x-column-header-text"> </span></div></div>
<div id="gridcolumn-1658-titleEl" class="x-column-header-inner" style="padding-top: 6px; padding-bottom: 6px;"><span id="gridcolumn-1658-textEl" class="x-column-header-text"> </span></div>
<span id="gridcolumn-1658-textEl" class="x-column-header-text"> </span>
The website is OptimoRoute, which can be accessed fairly quickly using a new log in, for those interested!
The button I am trying to click is the top one in the table.
Please refer to the following sample code:
Sub main()
'we define the essential variables
Dim IE As Object, Data As Object, ee As Object
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.navigate ("https://dillion132.github.io/vbacheckbox.html")
While IE.ReadyState <> 4
DoEvents
Wend
Set Data = IE.Document.getElementsByClassName("check")
Debug.Print Data.Length
If Data.Length > 0 Then
For Each ee In Data
'Debug.Print ee.Value
'Based on the checkbox value to check/uncheck the checkbox.
If ee.Value = "Cat" Then
ee.Checked = True
End If
'check whether the checkbox is checked, then, get the checked value.
If ee.Checked Then
Debug.Print ee.Value & " is checked"
End If
Next ee
End If
End With
Set IE = Nothing
End Sub
Code in the website:
<input class="check" type="checkbox" value="Cat"> Cat </input>
<br />
<input class="check" type="checkbox" value="Dog" >Dog</input>
<br />
<input class="check" type="checkbox" value="Pig" checked ="checked" >Pig</input>

How to click on the button after data paste

I am trying to click a button after pasting a value into a text box, but none of the code I have tried works. The same column contains two text boxes and a few buttons. I managed to open a new frame by clicking the first button with the code below:
Set the_button_elements = doc.getElementsByTagName("div")
For Each button_element In the_button_elements
If button_element.getAttribute("class") = "pzbtn-mid" Then
button_element.Click
Exit For
End If
Next button_element
I have tried changing the tag name and attribute, and also the code below, but it still doesn't work:
doc.querySelectorAll("[type=button]").item(3).Click
Below is the element for the first button, which works with the above code:
<button type="button" class="pzhc" id="AcctNumber"
disabled="" onclick="this.disabled=true;javascript: LoadAcct();"
title="Search">
<div class="pzbtn-lft">
<div class="pzbtn-rgt">
<div class="pzbtn-mid" data-click="...">
<img class="pzbtn-i">Go</div></div></div></button>
Below is the element for the next button, which I am trying to click:
<button type="button" class="pzhc" id="SearchBtn"
disabled="true" onclick="getCaseDetails();">
<div class="pzbtn-lft">
<div class="pzbtn-rgt">
<div class="pzbtn-mid">
<img class="pzbtn-i">Search</div></div></div></button>
I would appreciate it if someone could provide code that clicks this button. The button is still greyed out even after the text is pasted. The first button was in the same state, yet the code above still worked on it.
Perhaps the issue relates to the disabled attribute. To click the button, it must first be enabled. After clicking the first button, or in the text box's change event, you could use the following JavaScript to enable the Search button:
document.getElementById("SearchBtn").disabled = false;
You could check the following sample:
<button type="button" class="pzhc" id="AcctNumber" onclick='this.disabled=true;javascript: alert("Go"); document.getElementById("SearchBtn").disabled = false;'
title="Search">
<div class="pzbtn-lft">
<div class="pzbtn-rgt">
<div class="pzbtn-mid" data-click="...">
<img class="pzbtn-i">Go
</div>
</div>
</div>
</button>
<button type="button" class="pzhc" id="SearchBtn"
disabled="true" onclick='javascript: alert("Search");'>
<div class="pzbtn-lft">
<div class="pzbtn-rgt">
<div class="pzbtn-mid">
<img class="pzbtn-i">Search
</div>
</div>
</div>
</button>
Then, the following VBA script clicks the Go button, which enables the Search button, and then clicks Search:
Sub login()
Const Url$ = "https://dillion132.github.io/vbatestpage.html"
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
With ie
.navigate Url
ieBusy ie
.Visible = True
'Find the related button
Dim button_go_element As Object, button_search As Object
Set button_go_element = .document.getElementById("AcctNumber")
Set button_search = .document.getElementById("SearchBtn")
button_go_element.Click
button_search.Click
End With
End Sub
Sub ieBusy(ie As Object)
Do While ie.Busy Or ie.readyState < 4
DoEvents
Loop
End Sub
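If the page's own markup cannot be changed, the same disabled property can probably be cleared from VBA before clicking. This is a minimal sketch, assuming the element id from the question (SearchBtn) and a document that has finished loading. ClickAfterEnabling is a hypothetical helper. The page may disable the button for a reason, such as validation that has not yet run, so the click may still have no useful effect.

```vba
' Hypothetical helper: enable an element by id, then click it.
Sub ClickAfterEnabling(ByVal doc As Object, ByVal elementId As String)
    Dim btn As Object

    Set btn = doc.getElementById(elementId)
    If btn Is Nothing Then
        Debug.Print "Element not found: " & elementId
        Exit Sub
    End If

    ' Same effect as running
    ' document.getElementById(...).disabled = false in the page
    btn.disabled = False
    btn.Click
End Sub
```

Inside the With ie block above, it would be called as ClickAfterEnabling .document, "SearchBtn".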

Uploading to the Web with VBA

<div id="xe-editor-container-1" class="input_area xpress_xeditor_editing_area_container" style="height: 400px;">
<iframe id="editor_iframe_1" allowtransparency="true" frameborder="0" src="http://my_URL.or.kr/xe/modules/editor/styles/default/editor.html" scrolling="yes" style="width: 100%; height: 400px; display: block;">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" type="text/css" href="editor.css">
<title>XpressEngine</title>
</head>
<body class="xe_content editable"></body>
</html>
</iframe>
<textarea id="xpress-editor-1" rows="8" cols="42" style="display: none; width: 100%;"></textarea>
<textarea rows="8" cols="42" class="input_syntax " style="display:none"></textarea>
</div>
I want to copy two tables, ListObjects("Tbl1") on Sheet1 and ListObjects("Tbl2") on Sheet2, and upload them as a single post on the web.
The range of each table can change every time.
Logging in, navigating to the bulletin board, pressing the write button, and typing the title all succeeded.
But I have failed to upload the post.
Perhaps the code cannot find the bulletin board's editor object.
Below is the code I put together by searching the web.
The HTML above is the board's HTML code.
With ie
.navigate "http://my_URL/offering"
ieBusy ie 'Helper found by searching (waits until the page is ready)
.Document.getElementsByClassName("ico_16px write")(0).Click
ieBusy ie
Dim oTitle As Object, oContents As Object
Set oTitle = .Document.getElementsByName("title")(0) 'Sometimes fails (sometimes Nothing)
Set oContents = .Document.getElementsByClassName("xe_content editable")(0) 'Fails every time (= Nothing)
oTitle.Value = "my Title"
oContents.Value = ????
.Document.forms(0).submit 'I could not confirm this line because the code never got this far.
End With
Sub ieBusy(ie As Object)
Do While ie.Busy Or ie.readyState < 4
DoEvents
Loop
End Sub
1) Use an additional timed loop to set oTitle, as per https://stackoverflow.com/a/55334183/6241235
2) I think your oContents variable is selecting an element that is inside an iframe. I would expect you to target a textarea element instead. There are two that come after the iframe; the first has the id xpress-editor-1.
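Both suggestions can be sketched as follows. This is an illustration under stated assumptions, not the code from the linked answer. WaitForElementByName is a hypothetical helper, the 10-second limit is arbitrary, and whether the board actually takes its post body from the xpress-editor-1 textarea is unverified.

```vba
' Hypothetical helper: poll until an element with the given name exists,
' or give up after maxSeconds and return Nothing.
' Timer resets at midnight, so a run that spans midnight may wait too long.
Function WaitForElementByName(ByVal doc As Object, ByVal elemName As String, _
                              Optional ByVal maxSeconds As Double = 10) As Object
    Dim startTime As Double
    Dim nodes As Object

    startTime = Timer
    Do
        DoEvents
        Set nodes = Nothing
        On Error Resume Next            ' the document may not be ready yet
        Set nodes = doc.getElementsByName(elemName)
        On Error GoTo 0
        If Not nodes Is Nothing Then
            If nodes.Length > 0 Then
                Set WaitForElementByName = nodes.Item(0)
                Exit Function
            End If
        End If
    Loop While Timer - startTime < maxSeconds
End Function

' Usage sketch: fill in the title, then write to the textarea, not the iframe body.
Sub FillPost(ByVal ie As Object, ByVal bodyText As String)
    Dim oTitle As Object, oContents As Object

    Set oTitle = WaitForElementByName(ie.Document, "title")
    If oTitle Is Nothing Then Exit Sub
    oTitle.Value = "my Title"

    Set oContents = ie.Document.getElementById("xpress-editor-1")
    If Not oContents Is Nothing Then oContents.Value = bodyText
End Sub
```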

Trying to select an item in kendogrid kendoDropDownList using Excel VBA

I'm trying to select an item in a dropdown list on an internal company web page. I am able to set text entry items and get the dropdown list to open, but I'm having a hard time figuring out how to make a selection.
In other areas I'm able to make a selection using the ID and td/tr tags, but this routine doesn't have any tr/td tags with entry names.
Here are a couple of things I've tried so far.
These two items perform the same function and work fine for opening the first dropdown:
ie.Document.getElementById("FromDistrict").Click
ie.Document.parentWindow.execScript "$('#FromDistrict').kendoDropDownList('open');"
I've tried several variations of these:
ie.Document.parentWindow.execScript "$('#FromDistrict').data('kendoGrid').dataItem($('transport').data('kendoDropDownList').select('KILGORE'));"
ie.Document.parentWindow.execScript "$('#FromDistrict').data('kendoGrid').data('kendoDropDownList').select('KILGORE'));"
ie.Document.parentWindow.execScript "$('#FromDistrict').data('kendoGrid').data('kendoDropDownList').select('eq:0'));"
ie.Document.parentWindow.execScript "$('#FromDistrict').data('kendoDropDownList').select('KILGORE'));"
ie.Document.parentWindow.execScript "$('#FromDistrict').select('KILGORE'));"
The source code for this particular segment is:
</div>
<form action="/TransferLoad/Add" method="post"><input name="__RequestVerificationToken" _
type="hidden" value="IP80d5XM-Qi0XQ1-IgGKGmhLVNGdtDAyM-r7lJ6yQCI1RIdJJph0uPnz-DzEHx12_booO4xwvcWg6EUWPiLnHv7ww6PD-aqfhiVxPcy-VYm6mnBRHsba3H7Hembliybo0" /> _
<div class="k-block k-info-colored">
<div class="k-header">
<span>Add Transfer Load Details</span>
</div>
<div class="k-content">
<div class="infocontainer">
<table>
<tr>
<td class="columnLabel">
<label for="From_District:">From District:</label>
</td>
<td class="columnData">
<input id="FromDistrict" name="FromDistrict" style="width: 200px" type="text" />
<script>
jQuery(function(){jQuery("#FromDistrict").kendoDropDownList({"dataSource" _
:{"transport":{"read": {"url":"/DistrictProfiles/GetUserDistricts","data": _
function() { return kendo.ui.DropDownList.requestData(jQuery("#FromDistrict")); }}, _
"prefix":""}, "serverFiltering":true,"filter":[],"error":OnError, _
"schema": {"errors":"Errors"}}, "dataTextField":"DistrictName","autoBind":true, _
"dataValueField":"DistrictCode", "optionLabel":"Select District..."});});
</script>
</td>
<td class="columnLabel"> 'Next dropdown section starts here
<label for="To_District:">To District:</label>
</td>
When the dropdown opens, it has 2 items to choose from, but nowhere in the code can I find those 2 items listed, so I'm assuming they're pulled in by this line: return kendo.ui.DropDownList.requestData(jQuery("#FromDistrict")), but I'm not sure. Can someone point out what I'm missing here?
I did not post the "view element" because of the difficulty in copying it. All selections dynamically change other selection options.
In the browser DOM explorer (which shows the markup 'computed/sanitized' by the rendering engine) you should see that
<input id="FromDistrict" name="FromDistrict" style="width: 200px" type="text" />
has been changed to include the datalist attribute, e.g.
<input id="FromDistrict" name="FromDistrict" style="width: 200px" type="text" datalist="DistrictName" />
... and further down the DOM, you should see the datalist element that has been injected into the DOM by the Kendo code.
<datalist id="DistrictName">
<option value="Kent">Kent</option>
<option value="Surry">Surry</option>
</datalist>
You should be able to automate the field by just assigning a valid comma-separated list to FromDistrict,
e.g. FromDistrict.value='Kent, Surry';
I was able to accomplish what I wanted in a crude sort of way with the following, but I'm working on a better, more efficient way.
'Choose the FROM district
ie.Document.parentWindow.execScript "$('#FromDistrict').kendoDropDownList('open');"
Dim FrDist, li
Set FrDist = ie.Document.getElementById("FromDistrict-list").getElementsByTagName("li")
Dim fd
fd = 0
For Each li In FrDist
'MsgBox ("li.innertext is - " & li.innerText & " fd value is: " & fd)
If li.innerText Like "*KILGORE*" Then
FrDist(fd).Click
Else
'Do Nothing
End If
fd = fd + 1
'Application.Wait (Now + TimeValue("0:00:02"))
Next
Application.Wait (Now + TimeValue("0:00:02"))
'Choose the TO district
ie.Document.parentWindow.execScript "$('#ToDistrict').kendoDropDownList('open');"
Dim ToDist
Set ToDist = ie.Document.getElementById("ToDistrict-list").getElementsByTagName("li")
Dim tod
tod = 0
For Each li In ToDist
'MsgBox ("li.innertext is - " & li.innerText & " tod value is: " & tod)
If li.innerText Like "*KILGORE*" Then
ToDist(tod).Click
Else
'Do Nothing
End If
tod = tod + 1
Next
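A possibly tidier variant is to let the Kendo widget make the selection itself instead of clicking list items. The sketch below is an assumption-laden illustration: it relies on the page's jQuery and Kendo UI being loaded, on the DropDownList select method accepting a predicate function (which depends on the Kendo version), and on the data having already been fetched. SelectKendoItem is a hypothetical helper, and DistrictName is the dataTextField shown in the page source above.

```vba
' Hypothetical helper: select the first dropdown item whose text field
' contains itemText, then raise the widget's change event so dependent
' dropdowns can react. itemText must not contain a single quote.
Sub SelectKendoItem(ByVal ie As Object, ByVal inputId As String, _
                    ByVal textField As String, ByVal itemText As String)
    Dim js As String

    js = "(function () {" & _
         "var w = $('#" & inputId & "').data('kendoDropDownList');" & _
         "if (!w) { return; }" & _
         "w.select(function (item) {" & _
         "return String(item." & textField & ").indexOf('" & itemText & "') !== -1;" & _
         "});" & _
         "w.trigger('change');" & _
         "})();"

    ie.Document.parentWindow.execScript js
End Sub
```

It would be called as SelectKendoItem ie, "FromDistrict", "DistrictName", "KILGORE", and likewise for ToDistrict. The explicit trigger('change') is there because selecting through the widget API does not raise the change event on its own.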
