I've written a VBA script that loads a webpage, copies the entire HTML content into a string, and then selects specific data from that string. In essence, I search for a rail timetable, then copy out the details of 5 journeys (departure time, interchanges, journey time and cost).
The script works for a single search, but I now want to loop it and run approximately 300 searches. The problem is that the script doesn't wait for the webpage to finish loading, so the string returned is empty.
What I need to do is load an address, wait for the page to load, then continue the script. Do you have any suggestions? I've searched a lot and haven't been able to sort it out; I've tried Application.Wait in a number of places and am still no further ahead.
The code I'm using is below:
Sub CreateIE()
Dim tOLEobject As OLEobject
Dim NRADDRESS As String
NRADDRESS = Range("h11")
On Error Resume Next
Worksheets("Sheet1").Shapes.Range(Array("WebBrow")).Delete
Set tOLEobject = Worksheets("Sheet1").OLEObjects.Add(ClassType:="Shell.Explorer.2", Link:=False, _
DisplayAsIcon:=False, Left:=0, Top:=15, Width:=912, Height:=345)
For Each tOLEobject In Worksheets("Sheet1").OLEObjects
If tOLEobject.Name = "WebBrowser1" Then
With tOLEobject
.Left = 570
.Top = 1
.Width = 510
.Height = 400
.Name = "WebBrow"
End With
With tOLEobject.Object
.Silent = True
.MenuBar = False
.AddressBar = False
.Navigate NRADDRESS
End With
End If
Next tOLEobject
Sheets("Sheet2").Activate
Sheets("Sheet1").Activate
Call ReturnText
End Sub
NRADDRESS is a web address built from a number of parameters (origin, destination, date and time).
"Call ReturnText" runs the procedure I use to copy the website HTML into a string and extract what I want.
In that case, you might try something like this:
Set objIE = CreateObject("InternetExplorer.Application")
objIE.navigate strURL
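Do While objIE.Busy Or objIE.readyState <> 4 ' 4 = READYSTATE_COMPLETE
DoEvents
Loop
Because this uses CreateObject (late binding), it runs without any extra reference. A reference to Microsoft Internet Controls is only needed if you declare objIE As InternetExplorer or want to use named constants such as READYSTATE_COMPLETE.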
When I first started using VBA to load webpages, I also used the IE Object, but later found it creates all kinds of complications I didn't need, when all I really wanted was to download the file. Now I always use URLDownloadToFile.
A good example of its use can be found here:
VBA - URLDownloadToFile - Data missing in downloaded file
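For reference, here is a minimal sketch of the URLDownloadToFile approach, assuming VBA7 (Office 2010 or later). The helper name DownloadPage and its two arguments are placeholders for illustration and are not taken from the linked example.

```vba
Option Explicit

' Windows API from urlmon.dll; returns 0 (S_OK) on success.
Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" ( _
    ByVal pCaller As LongPtr, _
    ByVal szURL As String, _
    ByVal szFileName As String, _
    ByVal dwReserved As Long, _
    ByVal lpfnCB As LongPtr) As Long

' Hypothetical helper: downloads strURL to strFile and reports success.
Public Function DownloadPage(ByVal strURL As String, ByVal strFile As String) As Boolean
    DownloadPage = (URLDownloadToFile(0, strURL, strFile, 0, 0) = 0)
End Function
```

Called this way, without a callback, the function does not return until the download has finished, so a loop over 300 addresses needs no separate wait. It only saves what the server sends, though, so content that the page builds later with JavaScript will not be in the file.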
I have some questions regarding an Excel VBA program that I want to build.
Basically it's pretty easy. I want to access the following website https://coronavirus.jhu.edu/map.html
and extract the Confirmed Cases by Country/Region/Sovereignty (it's the table on the very left of the dashboard) and paste the values into Excel.
I know all the basic stuff on how to set up an InternetExplorer instance and scrape the page by tags, classes, ids etc.
But I think in this scenario I cannot use the basic things. I guess it's pretty tricky actually.
The information I am looking for is within some <strong> tags. But I cannot get their text content when I use the getElementsByTagName("strong") approach.
Could someone help me in this case?
I am grateful for any hints, advice and solutions.
Below you'll find the start of my code.
Best
Simon
Sub test()
Dim ie As InternetExplorer
Dim html As HTMLDocument
Dim i As Integer
Dim obj_coll As IHTMLElementCollection
Dim obj As HTMLObjectElement
Set ie = New InternetExplorer
ie.Visible = False
ie.navigate "https://coronavirus.jhu.edu/map.html"
Do Until ie.readyState = READYSTATE_COMPLETE
DoEvents
Loop
Debug.Print "Successfully connected with host"
Set html = ie.document
Set obj_coll = html.getElementsByTagName("strong")
For Each obj In obj_coll
Debug.Print obj.innerText
Next obj
ie.Quit
End Sub
You can navigate directly to the iframe URL. You then need a timed wait to ensure the data has loaded within that page. I would then collect nodeLists via CSS selectors, which are faster. As the two nodeLists (one for figures, the other for locations) are the same length, you only need a single loop that indexes into both lists to get rows of data.
Option Explicit
Public Sub GetCovidFigures()
Dim ie As SHDocVw.InternetExplorer
Set ie = New SHDocVw.InternetExplorer
Dim t As Single ' Timer returns the seconds since midnight as a Single
Const MAX_WAIT_SEC As Long = 30
With ie
.Visible = True
.Navigate2 "https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6"
Do
DoEvents
Loop While .Busy Or .readyState <> READYSTATE_COMPLETE
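t = Timer
Do
DoEvents ' keep Excel responsive while polling
If Timer - t > MAX_WAIT_SEC Then
.Quit ' do not leave the IE window open on timeout
Exit Sub
End If
Loop While .document.querySelectorAll(".feature-list strong").Length = 0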
Dim figures As Object, location As Object, results(), i As Long
Set figures = .document.querySelectorAll("h5 strong")
Set location = .document.querySelectorAll("h5 span:last-child")
ReDim results(1 To figures.Length, 1 To 2)
For i = 0 To figures.Length - 1
results(i + 1, 1) = figures.item(i).innerText
results(i + 1, 2) = location.item(i).innerText
Next
.Quit
End With
ActiveSheet.Cells(1, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End Sub
Consider how frequently you need this data. A growing number of APIs supply it, and you could issue faster XHR requests to one of those instead. Alternatively, you could simply take the source data in CSV form from GitHub here (files after Feb 1 (UTC) are updated once a day around 23:59 (UTC)). There is also a REST API, visible in the dev tools network tab, that frequently supplies new data in JSON format to update the page. It can be accessed via Python + requests or R + httr, for example. I suspect this endpoint is not intended to be hit directly, so look for public APIs.
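As a rough illustration of the XHR route from VBA itself, the sketch below fetches a text resource with MSXML2.XMLHTTP. The helper name FetchText and the example.com address are placeholders; substitute the raw CSV or API URL you actually intend to use.

```vba
Option Explicit

' Hypothetical helper: returns the response body of a GET request,
' or an empty string if the server does not answer with HTTP 200.
Public Function FetchText(ByVal url As String) As String
    Dim xhr As Object
    Set xhr = CreateObject("MSXML2.XMLHTTP.6.0")
    xhr.Open "GET", url, False   ' False = synchronous request
    xhr.send
    If xhr.Status = 200 Then FetchText = xhr.responseText
End Function

Public Sub DemoFetch()
    Dim csvText As String
    ' Placeholder address, not a real data source.
    csvText = FetchText("https://example.com/daily_report.csv")
    Debug.Print Left$(csvText, 200)
End Sub
```

Because the request is synchronous, no browser window or readyState loop is involved; the text can then be split on line breaks and commas and written to the sheet.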
I want to take values from an Excel sheet and store them in an array. I then want to take the values from the array and use them to fill in a web form.
I have managed to store the values in the array and I have managed to get VBA to open Internet Explorer (IE).
The code runs and no errors appear, but the text fields are not being populated, nor is the button being clicked.
(The debugger points to [While .Busy] as the error source, located in the WITH block)
How do I go about filling the form (that has a total of 3 text boxes to fill)?
There is also a drop down menu that I need to choose a value from, but I need to fill the text boxes prior to moving on to that part of the task.
Sub CONNECT_TO_IE()
the_start:
Dim ie As Object
Dim objElement As Object
Dim objCollection As Object
acct = GET_CLIENT_NAME()
name = GET_CODE()
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate ("<<my_website>>")
ie.FullScreen = False
On Error Resume Next
Do
DoEvents
If Err.Number <> 0 Then
ie.Quit
Set ie = Nothing
GoTo the_start:
End If
Loop Until ie.readystate = 4
Application.Wait Now + TimeValue("00:00:10")
ie.Document.getElementbyid("<<field_1>>").Value = "PPP"
ie.Document.getElementbyid("<<field_2>>").Value = "PPP"
ie.Document.getElementbyid("<<field_3>>").Click
Set ie = Nothing
End Sub
UPDATE: It turns out this wasn't working because some settings in the HTML of the site do not allow the automation to occur, so the code versions I had were correct but doomed to fail. So you were correct in that regard, @TimWilliams.
I know this because the website I was trying to access is on a secure server/virtual machine. I edited the code to fill in the Google search bar and it did not work on the virtual machine; however, when I ran the same code locally, it worked fine.
I have around 100 websites in an Excel sheet for which I need to check the time each website takes to load.
Currently I am doing it manually in Internet Explorer, one by one, using the developer tools (F12) > Network > Taken (column).
Is there any way to do this check automatically?
I'm not sure of your level of VBA, and this is far from perfect, but as a starting point here is code I normally use to scrape websites, modified slightly to time how long each website takes to open. It assumes you have all your websites in column A, and it writes the time taken to open each website to column B. Bear in mind this measures how long the macro takes to run per website, so it is not an entirely accurate measure of how long the actual site takes to load.
Enum READYSTATE
READYSTATE_UNINITIALIZED = 0
READYSTATE_LOADING = 1
READYSTATE_LOADED = 2
READYSTATE_INTERACTIVE = 3
READYSTATE_COMPLETE = 4
End Enum
Sub ImportWebsiteDate()
'to refer to the running copy of Internet Explorer
Dim ie As InternetExplorer, WDApp As Object, staTime As Double, elapsedTime As Double
'to refer to the HTML document returned
Dim html As HTMLDocument, websiteNo As Long, curSite As String, i As Long
websiteNo = Range("A500").End(xlUp).Row
'open Internet Explorer in memory, and go to website
Set ie = New InternetExplorer
For i = 1 To websiteNo
curSite = Cells(i, 1).Value
ie.Visible = False
staTime = Timer
ie.navigate curSite
'Wait until IE is done loading page
Do While ie.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to " & curSite
DoEvents
Loop
elapsedTime = Round(Timer - staTime, 2)
Cells(i, 2).Value = elapsedTime
Next i
ie.Quit ' close the hidden IE instance, otherwise it keeps running
Set ie = Nothing
Application.StatusBar = False
Application.ScreenUpdating = True
End Sub
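This won't work if you just copy and paste it: the InternetExplorer and HTMLDocument types need references to Microsoft Internet Controls and Microsoft HTML Object Library (Tools > References in the VBA editor). Read this site for info on referencing the required applications.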
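One weakness of the loop above is that a single site that never finishes loading will hang the whole run. A possible guard, sketched here under the assumption that a fixed time limit per site is acceptable, is a wait helper with a timeout. The names ElapsedSeconds and WaitForPage are mine, not part of the code above.

```vba
Option Explicit

' Seconds between two Timer readings, allowing for Timer resetting to 0 at midnight.
Public Function ElapsedSeconds(ByVal startTime As Double, ByVal endTime As Double) As Double
    If endTime < startTime Then endTime = endTime + 86400#
    ElapsedSeconds = endTime - startTime
End Function

' Hypothetical helper: True if the page finished loading within maxSeconds.
Public Function WaitForPage(ByVal ie As Object, ByVal maxSeconds As Double) As Boolean
    Dim startTime As Double
    startTime = Timer
    Do While ie.Busy Or ie.readyState <> 4      ' 4 = READYSTATE_COMPLETE
        DoEvents
        If ElapsedSeconds(startTime, Timer) > maxSeconds Then Exit Function
    Loop
    WaitForPage = True
End Function
```

Inside the For loop, the Do While block could then become something like If WaitForPage(ie, 30) Then Cells(i, 2).Value = Round(ElapsedSeconds(staTime, Timer), 2) Else Cells(i, 2).Value = "timeout".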
I want to copy into Excel the 3 tracking information tables that the website generates when I track a parcel, and I want to do it through Excel VBA. I can write a loop and generate this webpage for various tracking numbers, but I am having a hard time copying the tables: the top table, the travel history and the shipment track table. Any solution? The last 3 lines of my VBA code below give an error :( - run-time error '438': Object doesn't support this property or method.
Sub final()
Application.ScreenUpdating = False
Set ie = CreateObject("InternetExplorer.Application")
my_url = "https://www.fedex.com/fedextrack/index.html?tracknumbers=713418602663&cntry_code=us"
With ie
.Visible = True
.navigate my_url
Do Until Not ie.Busy And ie.readyState = 4
DoEvents
Loop
End With
ie.document.getElementById("detailsBody").Value
ie.document.getElementById("trackLayout").Value
ie.document.getElementById("detail").Value
End Sub
.Value is not available in that context. Also, you will want to assign the return value of the method call to a variable, and you should declare your variables :)
I made some modifications and included one possible way of getting data from one of the tables. You may need to reformat the output using TextToColumns or similar, since it prints each row in a single cell.
I also noticed that when I execute this, the tables have sometimes not finished loading, and the result will be an error unless you put in a suitable wait or use some other method to determine when the data has fully loaded on the webpage. I use a simple Application.Wait.
Option Explicit
Sub final()
Dim ie As Object
Dim my_url As String
Dim travelHistory As Object
Dim history As Variant
Dim h As Variant
Dim i As Long
Application.ScreenUpdating = False
Set ie = CreateObject("InternetExplorer.Application")
my_url = "https://www.fedex.com/fedextrack/index.html?tracknumbers=713418602663&cntry_code=us"
With ie
.Visible = True
.navigate my_url
'## Wait while IE is busy or the page is not yet complete (4 = READYSTATE_COMPLETE)
Do While .Busy Or .readyState <> 4
DoEvents
Loop
End With
'## Here is a simple method wait for IE to finish, you may need a more robust solution
' For assistance with that, please ask a NEW question.
Application.Wait Now() + TimeValue("0:00:10")
'## Get one of the tables
Set travelHistory = ie.Document.GetElementByID("travel-history")
'## Split the table to an array
history = Split(travelHistory.innerText, vbLf)
'## Iterate the array and write each row to the worksheet
For Each h In history
Range("A1").Offset(i).Value = h
i = i + 1
Next
ie.Quit
Set ie = Nothing
End Sub
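As one possible "more robust solution" than the fixed 10-second wait, the sketch below polls for the element until it exists and contains text, or until a time limit passes. The helper name WaitForElement and the 30-second limit in the usage note are assumptions for illustration.

```vba
Option Explicit

' Hypothetical helper: returns the element with the given id once it exists
' and has some text, or Nothing if maxSeconds pass first.
Public Function WaitForElement(ByVal ie As Object, ByVal elementId As String, _
                               ByVal maxSeconds As Double) As Object
    Dim startTime As Double
    Dim el As Object

    startTime = Timer
    Do
        DoEvents
        Set el = Nothing
        ' The document may be unavailable for a moment while loading, so ignore errors here.
        On Error Resume Next
        Set el = ie.Document.getElementById(elementId)
        On Error GoTo 0
        If Not el Is Nothing Then
            If Len(el.innerText) > 0 Then
                Set WaitForElement = el
                Exit Function
            End If
        End If
    Loop While Timer - startTime < maxSeconds
End Function
```

In the code above, the Application.Wait line and the GetElementByID line could then be replaced with Set travelHistory = WaitForElement(ie, "travel-history", 30), followed by a check that travelHistory is not Nothing before splitting its text.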
There are many online resources that illustrate using Microsoft Internet Explorer Controls within VBA Excel to perform basic IE automation tasks. These work when the webpage has a basic construct. However, when webpages contain multiple frames they can be difficult to work with.
I need to determine whether an individual frame within a webpage has completely loaded. For example, this Excel VBA code opens IE, loads a webpage, loops through an Excel sheet placing data into the webpage fields, executes a search, and then returns the IE results data to Excel (my apologies for omitting the site address).
The target webpage contains two frames:
1) The searchbar.asp frame for search value input and executing search
2) The searchresults.asp frame for displaying search results
In this construct the search bar is static, while the search results change according to input criteria. Because the webpage is built in this manner, the IEApp.ReadyState and IEApp.Busy cannot be used to determine IEfr1 frame load completion, as these properties do not change after the initial search.asp load. Therefore, I use a large static wait time to avoid runtime errors as internet traffic fluctuates. This code does work, but is slow. Note the 10 second wait after the cmdGO statement. I would like to improve the performance by adding solid logic to determine the frame load progress.
How do I determine if an autonomous frame has finished loading?
' NOTE: you must add a VBA project reference to "Microsoft Internet Controls"
' in order for this code to work
Dim IEapp As Object
Dim IEfr0 As Object
Dim IEfr1 As Object
' Set new IE instance
Set IEapp = New InternetExplorer
' With IE object
With IEapp
' Make visible on desktop
.Visible = True
' Load target webpage
.Navigate "http://www.MyTargetWebpage.com/search.asp"
' Loop until IE finishes loading
While .ReadyState <> READYSTATE_COMPLETE
DoEvents
Wend
End With
' Set the searchbar.asp frame0
Set IEfr0 = IEapp.Document.frames(0).Document
' For each row in my worksheet
For i = 1 To 9999
' Input search values into IEfr0 (frame0)
IEfr0.getElementById("SearchVal1").Value = Cells(i, 5)
IEfr0.getElementById("SearchVal2").Value = Cells(i, 6)
' Execute search
IEfr0.all("cmdGo").Click
' Wait a fixed 10sec
Application.Wait (Now() + TimeValue("00:00:10"))
' Set the searchresults.asp frame1
Set IEfr1 = IEapp.Document.frames(1).Document
' Retrieve webpage results data
Cells(i, 7) = Trim(IEfr1.all.Item(26).innerText)
Cells(i, 8) = Trim(IEfr1.all.Item(35).innerText)
Next
As @JimmyPena said, it's a lot easier to help if we can see the URL.
If we can't, hopefully this overview can put you in the right direction:
Wait for page to load (IEApp.ReadyState and IEApp.Busy)
Get the document object from the IE object. (done)
Loop until the document object is not nothing.
Get the frame object from the document object.
Loop until the frame object is not nothing.
Hope this helps!
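The steps above could be sketched roughly as follows, assuming the frame is reachable through Document.frames(index) as in the question. The helper name WaitForFrame and the timeout are my own additions; the "complete" string is what a frame document's readyState reports once it has loaded.

```vba
Option Explicit

' Hypothetical helper: waits until frame frameIndex of the page shown in
' ieApp reports "complete", or until maxSeconds have passed.
Public Function WaitForFrame(ByVal ieApp As Object, ByVal frameIndex As Long, _
                             ByVal maxSeconds As Double) As Boolean
    Dim startTime As Double
    Dim frameDoc As Object

    startTime = Timer
    Do
        DoEvents
        Set frameDoc = Nothing
        ' The frame may not exist yet while it is reloading, so ignore errors here.
        On Error Resume Next
        Set frameDoc = ieApp.Document.frames(frameIndex).Document
        On Error GoTo 0
        If Not frameDoc Is Nothing Then
            If frameDoc.readyState = "complete" Then
                WaitForFrame = True
                Exit Function
            End If
        End If
    Loop While Timer - startTime < maxSeconds
End Function
```

In the question's loop, If WaitForFrame(IEapp, 1, 10) Then ... could stand in for the fixed Application.Wait. One caveat: a check made immediately after the click may still see the previous results reporting "complete", so a short pause before polling, or a check that the result text has changed, may be needed as well.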
I used a loop to keep setting the field value until it was populated, like this:
Do While IE.Document.getElementById("USERID").Value <> "test3"
IE.Document.getElementById("USERID").Value = "test3"
Loop
This is a Rrrreeally old thread, but I figured I would post my findings, because I came here looking for an answer...
Looking in the Locals window, I could see that the "readystate" variable was only "READYSTATE_COMPLETE" for the IE app itself, but for the iframe it was lowercase "complete" (the document's readyState is a string, not the numeric enum).
So I explored this by using a Debug.Print loop on the .readyState of the frame I was working with.
Dim IE As Object
Dim doc As MSHTML.HTMLDocument
Set doc = IE.Document
Dim iframeDoc As MSHTML.HTMLDocument
Set iframeDoc = doc.Frames("TheFrameIwasWaitingFor").Document
' then, after I had filled in the form and fired the submit event,
Debug.Print iframeDoc.readyState
Do Until iframeDoc.readyState = "complete"
Debug.Print iframeDoc.readyState
DoEvents
Loop
So this will show you line after line of "loading" in the Immediate window, eventually showing "complete" and ending the loop. It can be abridged to remove the Debug.Print calls, of course.
Another thing:
debug.print iframeDoc.readystate ' is the same as...
debug.print doc.frames("TheFrameIwasWaitingFor").Document.readystate
' however, you can't use...
IE.Document.frames("TheFrameIwasWaitingFor").Document.readystate ' for some reason...
Forgive me if all of this is common knowledge. I really only picked up VBA scripting a couple of days ago...