I was trying to gather/scrape data from the Web using this code:
Sub GetSP()
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
With appIE
.Navigate "http://uk.investing.com/currencies/streaming-forex-rates-majors"
.Visible = True
End With
Do While appIE.Busy
DoEvents
Loop
RunCodeEveryX:
Set allRowOfData = appIE.document.getElementById("pair_1")
Dim myValue As String: myValue = allRowOfData.Cells(2).innerHTML
Range("A1").Value = myValue
Application.Wait Now + TimeValue("00:00:01")
GoTo RunCodeEveryX
appIE.Quit
Set appIE = Nothing
End Sub
However, while the code is running, I can't edit the workbook because Excel seems to be busy fetching the data. What I was hoping for is that the scraping keeps running while I work elsewhere in the same sheet.
Is there an alternative to Application.Wait (which I think is what keeps Excel busy)?
Thanks!
@jeeped - I was able to gather and extract the appropriate data using your preferred method. I wonder if there is a good way to repeat this step indefinitely until I stop it (the data on the webpage keeps refreshing, so I'd like it to repeat as in my initial code) while still being able to edit the rest of the worksheet.
Thanks! Hope you don't mind me addressing you specifically though the question is open to everyone.
Sub GetSP()
Dim HTMLDoc As New HTMLDocument
Dim oHttp As MSXML2.xmlHTTP
On Error Resume Next
Set oHttp = New MSXML2.xmlHTTP
If Err.Number <> 0 Then
Set oHttp = CreateObject("MSXML.XMLHTTPRequest")
MsgBox "Error 0 has occured"
End If
On Error GoTo 0
If oHttp Is Nothing Then
MsgBox "Just cannot make"
Exit Sub
End If
oHttp.Open "GET", "http://uk.investing.com/currencies/streaming-forex-rates-majors", False
oHttp.send
HTMLDoc.body.innerHTML = oHttp.responseText
With HTMLDoc
PriceGetter = .getElementById("pair_1").innerText
PriceGetter2 = .getElementsByClassName("pid-1-bid")(0).innerText
Range("A1").Value = PriceGetter
Range("A2").Value = PriceGetter2
End With
End Sub
You can use a loop instead of Application.Wait. As long as you call DoEvents inside the loop, Excel stays responsive.
If you are on Windows, here is a Function that sleeps for a certain amount of time:
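' On 64-bit Office (VBA7) API declarations need the PtrSafe keyword.
#If VBA7 Then
Declare PtrSafe Function GetTickCount Lib "kernel32.dll" () As Long
#Else
Declare Function GetTickCount Lib "kernel32.dll" () As Long
#End If
' Loops for the given number of milliseconds, calling DoEvents so Excel stays responsive.
Function Sleep(milliseconds As Long)
Dim NowTick As Long
Dim EndTick As Long
EndTick = GetTickCount + milliseconds
Do
NowTick = GetTickCount
DoEvents
Loop Until NowTick >= EndTick
End Function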
Call it like this:
Sleep 1000 'Sleeps for 1 Second
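If the goal is to repeat the scrape until you stop it, another option is to let Excel schedule each run with Application.OnTime instead of looping. Below is a minimal sketch of that idea, not a tested drop-in: it assumes the XMLHTTP version of GetSP from the question lives in a standard module, and the names StartPolling, PollOnce, StopPolling, nextRun and isRunning are made up for illustration.

```vba
Option Explicit

' Hypothetical module-level state for this sketch.
Private nextRun As Date
Private isRunning As Boolean

' Run this once to start the repeating scrape.
Public Sub StartPolling()
    isRunning = True
    PollOnce
End Sub

' Does one scrape, then asks Excel to call this Sub again in one second.
Public Sub PollOnce()
    If Not isRunning Then Exit Sub
    GetSP   ' assumed: the XMLHTTP-based routine from the question
    nextRun = Now + TimeValue("00:00:01")
    Application.OnTime nextRun, "PollOnce"
End Sub

' Run this to stop; it also cancels the run that is already scheduled.
Public Sub StopPolling()
    isRunning = False
    On Error Resume Next   ' OnTime raises an error if nothing is scheduled
    Application.OnTime nextRun, "PollOnce", , False
    On Error GoTo 0
End Sub
```

Between runs no macro is executing, so the sheet can be edited normally. Each request is still synchronous, so Excel will pause briefly while it is in flight, and scheduled runs are held back while a cell is in edit mode.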
Related
The following VBA code returns
Run-time error '462' The remote server machine does not exist or is unavailable
citing the line .getElementById("txtName").Value = "Arun Banik".
This code is from here but I have a similar objective.
Option Explicit
Const sSiteName = "https://www.encodedna.com/css-tutorials/form/contact-form.htm"
Private Sub CommandButton1_Click()
Dim oIE As Object
Dim oHDoc As HTMLDocument
Set oIE = CreateObject("InternetExplorer.Application")
' Open Internet Explorer Browser and keep it visible.
With oIE
.Visible = True
.Navigate sSiteName
End With
While oIE.ReadyState <> 4
DoEvents
Wend
Set oHDoc = oIE.Document
With oHDoc
.getElementById("txtName").Value = "Arun Banik"
.getElementById("txtAge").Value = "35"
.getElementById("txtEmail").Value = "arun@hotmail.com"
.getElementById("selCountry").Value = "India" ' Assign value to the dropdown list in the web form.
.getElementById("msg").Value = "Hi, I am Arun Banik and this is a test message. :-)"
.getElementById("bt").Click
End With
End Sub
There isn't much wrong with the code you posted. What I think is happening is that the While loop is reached before the page has started to load, so the check doesn't catch the load. I added a small delay before the loop to try to resolve this. Also, I didn't see a need for the document variable, so I removed it. This code is working for me.
Option Explicit
Const sSiteName = "https://www.encodedna.com/css-tutorials/form/contact-form.htm"
Private Sub CommandButton1_Click()
Dim oIE As Object
Set oIE = CreateObject("InternetExplorer.Application")
' Open Internet Explorer Browser and keep it visible.
With oIE
.Visible = True
.Navigate sSiteName
End With
'Wait a few seconds for the page to load
Application.Wait (Now + TimeValue("0:00:02"))
While oIE.ReadyState <> 4
DoEvents
Wend
With oIE.Document
.getElementById("txtName").Value = "Arun Banik"
.getElementById("txtAge").Value = "35"
.getElementById("txtEmail").Value = "arun@hotmail.com"
.getElementById("selCountry").Value = "India"
.getElementById("msg").Value = "Hi, I am Arun Banik and this is a test message. :-)"
.getElementById("bt").Click
End With
End Sub
It could be that you do not have the proper references set.
Add "Microsoft Internet Controls" and "Microsoft HTML Object Library" to your list of references (Alt + F11 > Tools > References...)
I am trying to extract one figure from a gov website. I have done a lot of googling and I am out of ideas: my code below returns a figure, but it isn't the figure I want and I am not entirely sure why.
I want to extract the 'Southend on Sea' case number from the 'Upper tier LA' section of the 'Cases by Area (Whole Pandemic)' table.
https://coronavirus.data.gov.uk/details/cases
I took this code from somewhere online and tried to adapt it using the class name I found in the F12 developer tools on the site.
Sub ExtractLastValue()
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Top = 0
objIE.Left = 0
objIE.Width = 800
objIE.Height = 600
objIE.Visible = True
objIE.Navigate ("https://coronavirus.data.gov.uk/details/cases")
Do
DoEvents
Loop Until objIE.readystate = 4
MsgBox objIE.document.getElementsByClassName("sc-bYEvPH khGBIg govuk-table__cell govuk-table__cell--numeric ")(0).innerText
Set objIE = Nothing
End Sub
Data comes from the official API and returns a json response dynamically on that page when you click the Upper Tier panel.
Have a look and play with the API guidance here:
https://coronavirus.data.gov.uk/details/developers-guide
You can make a direct xhr request by following the guidance in the API documentation and then using a json parser to handle the response. For your request it would be something like the following:
https://coronavirus.data.gov.uk/api/v1/data?filters=areaName=Southend-on-Sea&areaType=utla&latestBy=cumCasesByPublishDate&structure=
{"date":"date", "areaName":"areaName","cumCasesByPublishDate":"cumCasesByPublishDate",
"cumCasesByPublishDateRate":"cumCasesByPublishDateRate"}
XHR:
A worked example using jsonconverter.bas as the json parser
Option Explicit
Public Sub GetCovidNumbers()
Dim http As Object, json As Object
Set http = CreateObject("MSXML2.XMLHTTP")
With http
.Open "GET", "https://coronavirus.data.gov.uk/api/v1/data?filters=areaName=Southend-on-Sea&areaType=utla&latestBy=cumCasesByPublishDate&structure={""date"":""date"",""areaName"":""areaName"",""cumCasesByPublishDate"":""cumCasesByPublishDate"",""cumCasesByPublishDateRate"":""cumCasesByPublishDateRate""}", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Set json = JsonConverter.ParseJson(.responseText)("data")(1)
End With
With ActiveSheet
Dim arr()
arr = json.Keys
.Cells(1, 1).Resize(1, UBound(arr) + 1) = arr
arr = json.Items
.Cells(2, 1).Resize(1, UBound(arr) + 1) = arr
End With
End Sub
Json library (Used in above solution):
I use jsonconverter.bas. Download the raw code from here and add it to a standard module called JsonConverter. You then need to go to VBE > Tools > References and add a reference to Microsoft Scripting Runtime. Remove the top Attribute line from the copied code.
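To see how the parsed result is indexed, here is a small sketch that parses a hard-coded string instead of a live response. The payload and its values are placeholders I made up in the same shape as the API response, and ParseSample, sample, parsed and firstRow are illustrative names. It assumes JsonConverter.bas and the Microsoft Scripting Runtime reference are set up as described above.

```vba
Option Explicit

Public Sub ParseSample()
    Dim sample As String, parsed As Object, firstRow As Object

    ' Placeholder payload; the date and case count are NOT real data.
    sample = "{""data"":[{""date"":""2021-01-01""," & _
             """areaName"":""Southend-on-Sea""," & _
             """cumCasesByPublishDate"":123}]}"

    ' JSON objects come back as Dictionaries, JSON arrays as Collections.
    Set parsed = JsonConverter.ParseJson(sample)

    ' Collections are 1-based, hence ("data")(1) for the first record.
    Set firstRow = parsed("data")(1)

    ' Individual values are then read by key.
    Debug.Print firstRow("areaName"), firstRow("cumCasesByPublishDate")
End Sub
```

Reading values by key like this is handy when you only want one figure rather than writing every key and item to the sheet.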
Internet Explorer:
You could do a slower, more complicated, internet explorer solution where you need to select the utla option when present, then select from the table the desired value:
Option Explicit
Public Sub GetCovidNumbers()
'Tools references Microsoft Internet Controls and Microsoft HTML Object Library
Dim ie As SHDocVw.InternetExplorer, t As Date, ele As Object
Const MAX_WAIT_SEC As Long = 10
Set ie = New SHDocVw.InternetExplorer
With ie
.Visible = True
.Navigate2 "https://coronavirus.data.gov.uk/details/cases"
While .Busy Or .ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
t = Timer 'timed loop for element to be present to click on (to get utla)
Do
On Error Resume Next
Set ele = .Document.querySelector("#card-cases_by_area_whole_pandemic [aria-label='Upper tier LA']")
On Error GoTo 0
If Timer - t > MAX_WAIT_SEC Then Exit Do
Loop While ele Is Nothing
If ele Is Nothing Then Exit Sub
ele.Click
While .Busy Or .ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
Dim table As MSHTML.HTMLTable, datetime As String, result()
Set table = .Document.querySelector("table[download='cumCasesByPublishDate,cumCasesByPublishDateRate']")
datetime = .Document.querySelector("time").getAttribute("datetime")
result = GetDataForUtla("Southend-on-Sea", datetime, table)
With ActiveSheet
.Cells(1, 1).Resize(1, 4) = Array("Datetime", "Area", "Cases", "Rate per 100,000 population")
.Cells(2, 1).Resize(1, UBound(result) + 1) = result
End With
.Quit
End With
End Sub
Public Function GetDataForUtla(ByVal utla As String, ByVal datetime As String, ByVal table As MSHTML.HTMLTable) As Variant
Dim row As MSHTML.HTMLTableRow, i As Long
For Each row In table.Rows
If InStr(row.outerHTML, utla) > 0 Then
Dim arr(3) 'four values: datetime, area, cases, rate
arr(0) = datetime
For i = 0 To 2
arr(i + 1) = row.Children(i).innerText
Next
GetDataForUtla = arr
Exit Function
End If
Next
GetDataForUtla = Array("Not found")
End Function
References:
https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors
https://developer.mozilla.org/en-US/docs/Web/API/Document/querySelector
I have written a VBA macro to scrape the status of a shipment from a cargo tracking site, with the help of you guys here. I am trying to convert it to a function. The code works as a Sub but does not work as a Function: it returns a #VALUE! error. Can someone please tell me what I am doing wrong?
Here is the code as a sub
Sub FlightStat_AFL()
Dim url As String
Dim ie As Object
Dim MAWBStatus As String
Dim MAWBNo As String
MAWBNo = Sheets("Sheet3").Range("H3").Value
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-" & MAWBNo
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fix manual break of a few seconds
'because the whole page will be "reload"
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MAWBStatus = ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innertext
Debug.Print MAWBStatus
'Clean up
ie.Quit
Set ie = Nothing
End Sub
Here is the code I am trying to make it work as a function.
Function FlightStat_AF(MAWBNo As Range) As String
Dim url As String
Dim ie As Object
Dim MAWBStatus As String
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-" & MAWBNo
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fix manual break of a few seconds
'because the whole page will be "reload"
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MAWBStatus = ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innertext
FlightStat_AF = MAWBStatus
'Clean up
ie.Quit
Set ie = Nothing
End Function
Please try the following code:
Function FlightStat_AF(cargoNo As Variant) As String
Dim url As String, ie As Object, result As String
url = "https://www.afklcargo.com/mycargo/shipment/detail/" & cargoNo
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = False
.navigate url
Do Until .readyState = 4: DoEvents: Loop
End With
'wait a little for dynamic content to be loaded
Application.Wait (Now + TimeSerial(0, 0, 1))
'Get the status from the table
Do While result = ""
DoEvents
On Error Resume Next
result = Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
On Error GoTo 0
Application.Wait (Now + TimeSerial(0, 0, 1))
Loop
ie.Quit: Set ie = Nothing
'Return value of the function
FlightStat_AF = result
End Function
IE Function
You can try this if you really want to pass a range. Usually the parameter would be a string, which is an easy change to make.
You can test the function (2nd procedure) with the first procedure. Just adjust the values in the constants section.
The Code
Option Explicit
Sub getFlightStat()
' Constants
Const wsName As String = "Sheet3"
Const FirstRow As Long = 3
Const CritCol As Variant = "H"
Const ResCol As Variant = "I"
Dim wb As Workbook: Set wb = ThisWorkbook
' Define worksheet.
Dim ws As Worksheet
Set ws = wb.Worksheets(wsName)
' Calculate the row of the last non-blank cell in column 'CritCol'.
Dim LastRow As Long
LastRow = ws.Cells(ws.Rows.Count, CritCol).End(xlUp).Row
' Loop through rows and for each value in cell of column 'CritCol',
' write the value retrieved via 'FlightStat_AF' to the cell
' in the same row, but in column 'ResCol'.
Dim i As Long
For i = FirstRow To LastRow
ws.Cells(i, ResCol).Value = FlightStat_AF(ws.Cells(i, CritCol))
Next i
' Inform user.
MsgBox "Data transferred", vbInformation, "Success"
End Sub
Function FlightStat_AF(MAWBNo As Range) As String
Dim url As String
Dim ie As Object
Dim MAWBStatus As String
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-" & MAWBNo.Value
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fix manual break of a few seconds
'because the whole page will be "reload"
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MAWBStatus = ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innertext
FlightStat_AF = MAWBStatus
'Clean up
ie.Quit
Set ie = Nothing
End Function
I'm trying to write a script to pull doctor reviews from vitals.com and put them into an Excel sheet.
It worked well when I just pulled the review, but when I added code to pull the date as well, it prints the first review and date, hangs for a while, and then crashes. I'm new to all of this, so I'm hoping there are some glaring mistakes I am not seeing. I just can't seem to find a way to fix it. Any help would be greatly appreciated.
Private Sub Worksheet_Change(ByVal Target As Range)
Dim DocCounter As Integer
DocCounter = 2
Dim Go As String
Go = "Go"
If IsEmpty(Cells(1, 4)) And Cells(1, 3).Value = Go Then
If IsEmpty(Cells(DocCounter, 1).Value) Then GoTo EmptySheet
Do
Dim Reviews As String
Reviews = "/reviews"
Dim IE As MSXML2.XMLHTTP60
Set IE = New MSXML2.XMLHTTP60
Application.Wait (Now + TimeValue("0:00:01"))
IE.Open "get", "http://vitals.com/doctors/" & Cells(DocCounter, 1).Value & Reviews, True
IE.send
While IE.readyState <> 4
DoEvents
Wend
Application.Wait (Now + TimeValue("0:00:01"))
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLBody As MSHTML.HTMLBody
Set HTMLDoc = New MSHTML.HTMLDocument
Set HTMLBody = HTMLDoc.body
HTMLBody.innerHTML = IE.responseText
Dim ReviewCounterString As String
Dim ReviewCounter As Integer
ReviewCounterString = HTMLDoc.getElementsByName("overall_total_reviews")(0).getElementsByTagName("h3")(0).innerText
ReviewCounter = CInt(ReviewCounterString)
'Pull info from website loop'
Dim RC As Integer
RC = 2
Dim sDD As String
Dim WebCounter As Integer
WebCounter = 0
Do
sDD = HTMLDoc.getElementsByClassName("date c_date dtreviewed")(WebCounter).innerText & "-" & HTMLDoc.getElementsByClassName("description")(WebCounter).innerText
Cells(DocCounter, RC).Value = sDD
WebCounter = WebCounter + 1
RC = RC + 1
Application.Wait (Now + TimeValue("0:00:01"))
Loop Until WebCounter = ReviewCounter
Application.Wait (Now + TimeValue("0:00:01"))
DocCounter = DocCounter + 1
If IsEmpty(Cells(DocCounter, 1).Value) Then GoTo Finished
Loop
Finished:
MsgBox ("Complete")
Exit Sub
EmptySheet:
MsgBox ("The Excel Sheet is Empty. Please add Doctors.")
End If
End Sub
When you do Cells(DocCounter, RC).Value = sDD, the Worksheet_Change event is triggered again and the macro starts over, again and again, until the call stack is full (I think).
Add
Application.EnableEvents = False
at the start of the macro and
Application.EnableEvents = True
at the end. That way the event will not be triggered during the macro.
Edit: You should probably also think about if it's really necessary to run the macro every time anything is changed anywhere on the sheet. You could check Target (the range that was changed) first to see if the change makes it necessary to reload the data.
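Putting both suggestions together, the handler could look roughly like the sketch below. It assumes the trigger is cell C1 holding "Go" (that is Cells(1, 3) in the question) and that the scraping loop has been moved into a separate Sub, here given the made-up name LoadReviews.

```vba
Private Sub Worksheet_Change(ByVal Target As Range)
    ' Ignore edits anywhere other than the trigger cell C1.
    If Intersect(Target, Me.Range("C1")) Is Nothing Then Exit Sub
    If Me.Range("C1").Value <> "Go" Then Exit Sub

    ' Make sure events are switched back on even if the scrape fails.
    On Error GoTo CleanUp
    Application.EnableEvents = False

    LoadReviews   ' hypothetical Sub containing the scraping loop

CleanUp:
    ' Reached on success and on error; an error is silently dropped here,
    ' so add logging or a MsgBox if you need to know about failures.
    Application.EnableEvents = True
End Sub
```

The error handler matters because if the macro stops with EnableEvents still set to False, no event handlers will fire again until it is switched back on.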
I am trying to download a table of proprietary investments/positions/pricing from Nationwide. The code seems to do what I want, EXCEPT that it produces an "object required" error when I attempt to select (click) a particular account.
I thought I had the proper code to tell my macro to wait until IE was ready to go on, but clearly I am missing something.
In the code, the relevant line is highlighted. If I enter a STOP above the error line, I can wait until I "see" the link appear, then "continue" the code and it runs as expected.
Because this goes to my financial accounts, I cannot provide the user name and password to allow someone to replicate the exact problem, but here is the code, and the error message and highlight. Suggestions appreciated.
Option Explicit
'set Reference to Microsoft Internet Controls
Sub DownLoadFunds()
Dim IE As InternetExplorer
Dim sHTML
Const sURL As String = "https://www.nationwide.com/access/web/login.htm"
Const sURL2 As String = "https://isc.nwservicecenter.com/iApp/isc/app/ia/balanceDetail.do?basho.menuNodeId=12245"
Dim wsTemp As Worksheet
Set wsTemp = Worksheets("Scratch")
Set IE = New InternetExplorer
With IE
.Navigate sURL
.Visible = True 'for debugging
Do While .ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Do While .Busy = True
DoEvents
Loop
'Login: User Name and Password "remembered" by IE
.Document.all("submitButton").Click
Do While .ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Do While .Busy = True
DoEvents
Loop
'Select this account to show
.Document.all("RothIRA_#########").Click '<--Error at this line
Do While .ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Do While .Busy = True
DoEvents
Loop
.Navigate sURL2
Do While .ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Do While .Busy = True
DoEvents
Loop
Set sHTML = .Document.GetElementByID("fundByFundOnly")
With wsTemp
.Cells.Clear
.Range("a2") = sHTML.innertext
End With
.Quit
End With
Set IE = Nothing
End Sub
This is the error message:
This shows the highlighted line:
EDIT:
At Tim Williams's suggestion, I added a loop to test for the presence of the desired element. This seems to work:
...
On Error Resume Next
Do
Err.Clear
DoEvents
Application.Wait (Time + TimeSerial(0, 0, 1))
.Document.getelementbyid("RothIRA_#########").Click
Loop Until Err.Number = 0
On Error GoTo 0
....
IE.Document.all("#RothIRA_....") is returning Nothing (null in more refined languages), so calling the Click method is causing the error.
Your code is the same as doing this:
Dim rothElement As Object
Set rothElement = IE.Document.all("#RothIRA_....")
rothElement.Click
...when you should do this:
Dim rothElement As Object
Set rothElement = IE.Document.all("#RothIRA_....")
If Not rothElement Is Nothing Then
rothElement.Click
End If
I suggest using the modern document.GetElementById method instead of the deprecated (if not obsolete) document.All API.
It's possible/likely that the page is using script to dynamically load some content or generate some layout after your "wait" loop has finished. That loop only waits until all linked content/resources have been loaded - it does not wait for scripts on the loaded page to finish, etc.
One approach is to loop your code waiting for the desired element to be rendered:
Const MAX_WAIT_SEC as Long = 5 'limit on how long to wait...
Dim t
t = Timer
Do While .Document.all("RothIRA_#########") Is Nothing
DoEvents
'or you can Sleep here
If Timer - t > MAX_WAIT_SEC Then Exit Do
Loop
'carry on...
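If you need this in more than one place, the same wait-with-timeout idea can be wrapped in a small helper. This is only a sketch under my own naming (WaitForElementById, elementId, maxWaitSec and ClickWhenReady are not from the original code), and it uses getElementById rather than document.all.

```vba
Option Explicit

' Returns the element with the given id, or Nothing if it has not
' appeared within maxWaitSec seconds.
Public Function WaitForElementById(ByVal ie As Object, _
                                   ByVal elementId As String, _
                                   Optional ByVal maxWaitSec As Single = 5) As Object
    Dim startTime As Single, found As Object
    startTime = Timer   ' note: Timer resets at midnight
    Do
        DoEvents
        On Error Resume Next   ' the document may not be available yet
        Set found = ie.Document.getElementById(elementId)
        On Error GoTo 0
        If Not found Is Nothing Then Exit Do
    Loop While Timer - startTime < maxWaitSec
    Set WaitForElementById = found
End Function

' Example use: click the link only if it actually showed up.
Public Sub ClickWhenReady(ByVal ie As Object)
    Dim link As Object
    Set link = WaitForElementById(ie, "RothIRA_#########")
    If Not link Is Nothing Then link.Click
End Sub
```

Returning Nothing on timeout lets the caller decide what to do, instead of failing with "object required" on the Click line.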