VBA web scraping - excel

I am trying to get a row of data from this table on this website: http://www.nasdaq.com/symbol/neog/financials?query=balance-sheet
Now I can manage to get the "total liabilities" row using the
doc.getelementsbyclassname("net")(3).innertext
but I cannot figure out how to get any other rows of data such as common stock.
Private Sub Worksheet_Change(ByVal Target As Range)
If Target.Row = Range("bscode").Row And _
Target.Column = Range("bscode").Column Then
Dim IE As New InternetExplorer
IE.Visible = True
IE.navigate "http://www.nasdaq.com/symbol/" & Range("bscode").Value & "/financials?query=balance-sheet&data=quarterly"
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = IE.document
Dim sD As String
sD = Doc.getElementsByTagName("tr")(8).innerText
MsgBox sD
Dim aD As Variant
aD = Split(sD, "$")
Range("bs").Value = aD(1)
Range("ba").Value = aD(2)
Range("bb").Value = aD(3)
Range("bc").Value = aD(4)
End If
End Sub
If it helps, I have the HTML source and the tr highlighted that I want to grab.
[Image: screenshot of HTML code]
The issue is how to locate the table row. Could someone please explain how to get other rows of data? It would be much appreciated!

I was able to do some trial and error and to get the correct reference this way:
Dim eTR As Object, cTR As Object, i As Long 'I used Object because I did late binding
Set cTR = Doc.getElementsByTagName("tr")
i = 0
For Each eTR In cTR
If Left(eTR.innerText, 3) = "Com" Then
Debug.Print "(" & i & "): " & eTR.innerText
End If
i = i + 1
Next
The immediate window then displayed
(308): Common Stocks ... (a bunch of space) ...
$5,941$5,877$5,773$3,779
I then tested this statement:
sd = Doc.getElementsByTagName("tr")(308).innerText
Debug.Print sd
And got the same result.
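A hard-coded index such as 308 will break as soon as the page layout changes, so it may be more robust to look the row up by its label each time. The following is only a sketch under assumptions: GetRowValues and SplitDollarValues are hypothetical helper names, Doc is the already loaded document from the question, and the row text is assumed to look like the output shown above (label first, then values each prefixed with "$").

```vba
Option Explicit

' Hypothetical helper: returns the "$" values of the first table row whose
' text starts with rowLabel, or an empty array if no row matches.
Public Function GetRowValues(ByVal Doc As Object, ByVal rowLabel As String) As Variant
    Dim eTR As Object
    For Each eTR In Doc.getElementsByTagName("tr")
        ' Trim$ removes leading/trailing spaces only, as in the Left() test above
        If Left$(Trim$(eTR.innerText), Len(rowLabel)) = rowLabel Then
            GetRowValues = SplitDollarValues(eTR.innerText)
            Exit Function
        End If
    Next eTR
    GetRowValues = Array()
End Function

' Hypothetical helper: splits "Label ... $5,941$5,877" on "$" and drops
' element 0, which holds the label text before the first "$".
Public Function SplitDollarValues(ByVal rowText As String) As Variant
    Dim parts As Variant, result() As String, i As Long
    parts = Split(rowText, "$")
    If UBound(parts) < 1 Then           ' no "$" found (or empty text)
        SplitDollarValues = Array()
        Exit Function
    End If
    ReDim result(1 To UBound(parts))
    For i = 1 To UBound(parts)
        ' Strip line breaks and surrounding spaces from each value
        result(i) = Trim$(Replace(Replace(parts(i), vbCr, ""), vbLf, ""))
    Next i
    SplitDollarValues = result
End Function
```

With this in place, something like vals = GetRowValues(Doc, "Common Stocks") followed by Range("bs").Value = vals(1) would replace the fixed index. The result is 1-based so that it lines up with aD(1) to aD(4) in the original code.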

Related

Using VBA to search web site and return information

I have not used Excel VBA very much, but I am trying to enter a part number into a cell and have it return the interchange information. I did some looking around and came up with a way to return the info on a separate Excel sheet, from which I can then copy and paste. However, I was wondering if there is a way to have it move down column A, gathering the info for each part number and putting each interchange into that part number's row.
This is what i have currently:
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
If Target.Row = Range("ZF_number").Row And _
Target.Column = Range("ZF_number").Column Then
Dim IE As New InternetExplorer
'IE.Visible = True
IE.navigate "https://www.liftsupportsdepot.com/search/search-results/?search_query=" & Range("ZF_number").Value
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = IE.document
Dim sDD As String
sDD = Trim(Doc.getElementsByTagName("span")(32).innerText)
IE.Quit
Dim aDD As Variant
aDD = Split(sDD, ",")
Range("Interchange_1").Value = aDD(0)
Range("Interchange_2").Value = aDD(1)
Range("Interchange_3").Value = aDD(2)
Range("Interchange_4").Value = aDD(3)
Range("Interchange_5").Value = aDD(4)
Range("Interchange_6").Value = aDD(5)
Range("Interchange_7").Value = aDD(6)
Range("Interchange_8").Value = aDD(7)
Range("Interchange_9").Value = aDD(8)
Range("Interchange_10").Value = aDD(9)
Range("Interchange_11").Value = aDD(10)
Range("Interchange_12").Value = aDD(11)
Range("Interchange_13").Value = aDD(12)
Range("Interchange_14").Value = aDD(13)
Range("Interchange_15").Value = aDD(14)
Range("Interchange_16").Value = aDD(15)
End If
End Sub
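One way to approach the column A loop is sketched below. It is not a tested solution for this site: it assumes part numbers sit in column A from row 2 of the first worksheet, that results should go into column B onwards, and that the span index 32 from the code above still holds on every results page. WriteListAcrossRow and FillInterchanges are hypothetical names.

```vba
Option Explicit

' Hypothetical helper: writes a comma-separated list across one row,
' one item per cell, starting at startCol.
Public Sub WriteListAcrossRow(ByVal ws As Worksheet, ByVal rowNum As Long, _
                              ByVal startCol As Long, ByVal csv As String)
    Dim parts As Variant, i As Long
    parts = Split(csv, ",")              ' zero-based; empty input gives no items
    For i = LBound(parts) To UBound(parts)
        ws.Cells(rowNum, startCol + i).Value = Trim$(parts(i))
    Next i
End Sub

' Walks down column A and fills each row with that part's interchanges.
Public Sub FillInterchanges()
    Dim ws As Worksheet, IE As Object, r As Long, lastRow As Long
    Set ws = ThisWorkbook.Worksheets(1)                      ' assumed sheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    Set IE = CreateObject("InternetExplorer.Application")
    For r = 2 To lastRow                                     ' assumed header in row 1
        IE.navigate "https://www.liftsupportsdepot.com/search/search-results/?search_query=" & _
                    ws.Cells(r, "A").Value
        Do While IE.Busy Or IE.readyState <> 4: DoEvents: Loop
        ' Index 32 is taken from the question and may differ per page
        WriteListAcrossRow ws, r, 2, IE.document.getElementsByTagName("span")(32).innerText
    Next r
    IE.Quit
End Sub
```

Looping over the Split result instead of naming sixteen ranges also avoids a "subscript out of range" error when a part has fewer than sixteen interchanges.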

How to scrape data from Bloomberg's website with VBA

Background
Disclaimer: I am a beginner, so please bear with my (most likely wrong) code.
I want to update currency pairs' value (PREV CLOSE) with a button-enabled-VBA macro. My Excel worksheet contains FX pairs (e.g. USDGBP) on column G:G which are then used to run a FOR loop for every pair in the column.
The value would then be stored in column I:I
Right now, the problem according to the Debugger lies in one line of code that I will highlight below
Sources
I got some inspiration from https://www.youtube.com/watch?v=JxmRjh-S2Ms&t=1050s - notably 17:34 onwards - but I want my code to work for multiple websites at the press of a button.
I have tried the following code
Public Sub Auto_FX_update_BMG()
Application.ScreenUpdating = False 'My computer is not very fast, thus I use this line of
'code to save some computing power and time
Dim internet_object As InternetExplorer
Dim i As Integer
For i = 3 To Sheets(1).Cells(3, 7).End(xlDown).Row
FX_Pair = Sheets(1).Cells(i, 7)
Set internet_object = New InternetExplorer
internet_object.Visible = True
internet_object.navigate "https://www.bloomberg.com/quote/" & FX_Pair & ":CUR"
Application.Wait Now + TimeValue("00:00:05")
internet_object.document.getElementsByClassName("class")(0).getElementsByTagName ("value__b93f12ea") '--> DEBUGGER PROBLEM
'My goal here is to "grab" the PREV CLOSE
'value from the website
With ActiveSheet
.Range(Cells(i, 9)).Value = HTML_element.Children(0).textContent
End With
Sheets(1).Range(Cells(i, 9)).Copy 'Not sure if these 2 lines are unnecesary
ActiveSheet.Paste
Next i
Application.ScreenUpdating = True
End Sub
Expected Result
When I enter "EURGBP" in a cell in column G:G, the macro should go to https://www.bloomberg.com/quote/EURGBP:CUR, "grab" the PREV CLOSE value of 0.8732 (using today's value) and insert it in the respective row of column I:I.
As of now, I am just facing the debugger without much idea of how to solve the problem.
You can use class selectors in a loop. The pattern
.previousclosingpriceonetradingdayago .value__b93f12ea
selects elements with class value__b93f12ea that are descendants of an element with class previousclosingpriceonetradingdayago. The "." prefix is a CSS class selector; selecting this way is fast because modern browsers are optimized for CSS. The space between the two classes is a descendant combinator. querySelector returns the first match for this pattern in the page's HTML document.
You can see the parent-child relationship and the classes here:
<section class="dataBox previousclosingpriceonetradingdayago numeric">
<header class="title__49417cb9"><span>Prev Close</span></header>
<div class="value__b93f12ea">0.8732</div>
</section>
N.B. If you are a Bloomberg customer, look into their APIs. Additionally, it is very likely you can get this same info from other dedicated APIs, which will allow for much faster and more reliable XHR requests.
VBA (Internet Explorer):
Option Explicit
Public Sub test()
Dim pairs(), ws As Worksheet, i As Long, ie As Object
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set ie = CreateObject("InternetExplorer.Application")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With ie
.Visible = True
For i = LBound(pairs) To UBound(pairs)
.Navigate2 "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
While .Busy Or .readyState < 4: DoEvents: Wend
results(i) = .document.querySelector(".previousclosingpriceonetradingdayago .value__b93f12ea").innerText
Next
.Quit
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
For a very limited number of requests (more will lead to blocking) you could use an XHR request and regex out the value. I assume the pairs are in sheet one and start from G2. I also assume there are no empty cells or invalid pairs in column G up to and including the last pair to search for. Otherwise, you will need to extend the code to handle this.
Option Explicit
Public Sub test()
Dim re As Object, pairs(), ws As Worksheet, i As Long, s As String
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set re = CreateObject("VBScript.RegExp")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With CreateObject("MSXML2.XMLHTTP")
For i = LBound(pairs) To UBound(pairs)
.Open "GET", "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
.send
s = .responseText
results(i) = GetCloseValue(re, s, "previousClosingPriceOneTradingDayAgo%22%3A(.*?)%2")
Next
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
Public Function GetCloseValue(ByVal re As Object, inputString As String, ByVal pattern As String) As String 'https://regex101.com/r/OAyq30/1
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = pattern
If .test(inputString) Then
GetCloseValue = .Execute(inputString)(0).SubMatches(0)
Else
GetCloseValue = "Not found"
End If
End With
End Function
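To make the regex above easier to follow, here is a small sketch that runs the same pattern against a made-up fragment. The sample string is invented for illustration and is not real Bloomberg output; it only assumes, as the pattern implies, that the value sits in URL-encoded text where %22 is a double quote, %3A a colon and %2C a comma. ExtractPrevClose and DemoExtractPrevClose are hypothetical names.

```vba
Option Explicit

' Applies the pattern from the answer above to a piece of response text.
Public Function ExtractPrevClose(ByVal responseText As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    ' (.*?) is lazy: it captures as little as possible up to the next "%2"
    re.Pattern = "previousClosingPriceOneTradingDayAgo%22%3A(.*?)%2"
    If re.Test(responseText) Then
        ExtractPrevClose = re.Execute(responseText)(0).SubMatches(0)
    Else
        ExtractPrevClose = "Not found"
    End If
End Function

Public Sub DemoExtractPrevClose()
    Dim sample As String
    ' Invented sample: encoded form of "previousClosingPriceOneTradingDayAgo":0.8732,"next"
    sample = "%22previousClosingPriceOneTradingDayAgo%22%3A0.8732%2C%22next%22"
    Debug.Print ExtractPrevClose(sample)        ' prints 0.8732
    Debug.Print ExtractPrevClose("no match")    ' prints Not found
End Sub
```

Because the capture is lazy, it stops at the first "%2" after the colon, which is why only the number is returned and not the rest of the payload.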
Try the code below.
Before running it, make sure to add two references by going to Tools > References and ticking Microsoft HTML Object Library and Microsoft Internet Controls.
This code works with your example.
Sub getPrevCloseValue()
Dim ie As Object
Dim mySh As Worksheet
Set mySh = ThisWorkbook.Sheets("Sheet1")
Dim colG_Value As String
Dim prev_value As String
For a = 3 To mySh.Range("G" & Rows.Count).End(xlUp).Row
colG_Value = mySh.Range("G" & a).Value
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate "https://www.bloomberg.com/quote/" & colG_Value & ":CUR"
Do While ie.Busy: DoEvents: Loop
Do Until ie.readyState = 4: DoEvents: Loop
'Application.Wait (Now + TimeValue("00:00:03")) 'activate if having problem with delay
For Each sect In ie.document.getElementsByTagName("section")
If sect.className = "dataBox previousclosingpriceonetradingdayago numeric" Then
prev_value = sect.getElementsByTagName("div")(0).innerText
mySh.Range("I" & a).Value = prev_value
Exit For
End If
Next sect
ie.Quit 'close the browser window opened for this pair
Next a
End Sub
I have a video tutorial on basic web automation using VBA, which includes web data scraping and other commands. Please check the link below:
https://www.youtube.com/watch?v=jejwXID4OH4&t=700s

web scraping using excel and VBA

I wrote the VBA code below in an Excel sheet, but it does not scrape the data and I don't know why. Can anyone please help? The only result it gives me is "click here to read more". I want to scrape the entire data set, such as first name, last name, state, zip code and so on.
Sub extractTablesData()
Dim IE As Object, obj As Object
Dim myState As String
Dim r As Integer, c As Integer, t As Integer
Dim elemCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
myState = InputBox("Enter the city where you wish to work")
With IE
.Visible = True
.navigate ("http://www.funeralhomes.com/go/listing/Search? name=&city=&state=&country=USA&zip=&radius=")
While IE.readyState <> 4
DoEvents
Wend
For Each obj In IE.document.all.item("state").Options
If obj.innerText = myState Then
obj.Selected = True
End If
Next obj
IE.document.getElementsByValue("Search").item.Click
Do While IE.Busy: DoEvents: Loop
ThisWorkbook.Sheets("Sheet1").Range("A1:K1500").ClearContents
Set elemCollection = IE.document.getElementsByTagName("TABLE")
For t = 0 To (elemCollection.Length - 1)
For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
End With
Set IE = Nothing
End Sub
Using the same URL as the answer already given you could alternatively select with CSS selectors to get the elements of interest, and use split to get just the names and address parts from the text. We can also do away with the browser altogether to get faster results from first results page.
Business name:
You can get the name with the following selector (using paid listing example):
div.paid-listing .listing-title
Address info:
The associated descriptive information can be retrieved with the selector:
div.paid-listing .address-summary
And then using split we can parse this into just the address information.
Code:
Option Explicit
Public Sub GetTitleAndAddress()
Dim oHtml As HTMLDocument, nodeList1 As Object, nodeList2 As Object, i As Long
Const URL As String = "http://www.funeralhomes.com/go/listing/ShowListing/USA/New%20York/New%20York"
Set oHtml = New HTMLDocument
With CreateObject("WINHTTP.WinHTTPRequest.5.1")
.Open "GET", URL, False
.send
oHtml.body.innerHTML = .responseText
End With
Set nodeList1 = oHtml.querySelectorAll("div.paid-listing .listing-title")
Set nodeList2 = oHtml.querySelectorAll("div.paid-listing .address-summary")
With Worksheets("Sheet3")
.UsedRange.ClearContents
For i = 0 To nodeList1.Length - 1
.Range("A" & i + 1) = nodeList1.Item(i).innerText
.Range("B" & i + 1) = Split(nodeList2.Item(i).innerText, Chr$(10))(0)
Next i
End With
End Sub
Yeah, without an API, this can be very tricky at best, and very inconsistent at worst. For now, you can try the script below.
Sub DumpData()
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
URL = "http://www.funeralhomes.com/go/listing/ShowListing/USA/New%20York/New%20York"
'Wait for site to fully load
IE.Navigate2 URL
Do While IE.Busy = True
DoEvents
Loop
RowCount = 1
With Sheets("Sheet1")
.Cells.ClearContents
RowCount = 1
For Each itm In IE.document.all
If itm.classname Like "*free-listing*" Or itm.classname Like "*paid-listing*" Then
.Range("A" & RowCount) = itm.classname
.Range("B" & RowCount) = Left(itm.innertext, 1024)
RowCount = RowCount + 1
End If
Next itm
End With
End Sub
You probably want some kind of input box to capture the city, state and radius from the user, or to capture those variables in cells in your worksheet.
Note that '%20' is a URL-encoded space character.
I got this idea from a friend of mine, Joel, a long time ago. That guy is great!
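To illustrate the '%20' point, here is a minimal sketch that builds the listing URL from user-supplied values. It assumes the path segments after ShowListing are country, state and city in that order, which is only a guess from the single example URL above (both segments there are "New York"). BuildListingUrl and EncodeSpaces are hypothetical names, and only spaces are encoded, not other special characters.

```vba
Option Explicit

' Assumed URL layout: .../ShowListing/<country>/<state>/<city>
Public Function BuildListingUrl(ByVal country As String, ByVal state As String, _
                                ByVal city As String) As String
    Const BASE_URL As String = "http://www.funeralhomes.com/go/listing/ShowListing/"
    BuildListingUrl = BASE_URL & EncodeSpaces(country) & "/" & _
                      EncodeSpaces(state) & "/" & EncodeSpaces(city)
End Function

' Replaces each space with its URL-encoded form, %20.
Private Function EncodeSpaces(ByVal s As String) As String
    EncodeSpaces = Replace(Trim$(s), " ", "%20")
End Function
```

For example, BuildListingUrl("USA", InputBox("State?"), InputBox("City?")) could replace the hard-coded URL in DumpData.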

Error Message as Object Required while using IE.Document.getElementById("info_window").innerText

I am very new to VBA programming and have been trying to capture the text that pops up on the Google map when I search using GPS coordinates (latitude and longitude). I have been using the website www.gps-coordinates.net.
So far I have been able to pass the GPS coordinates to the website's input boxes and click the button, and the address pops up on the Google map in a box. When I try to get the address from the box using IE.Document.getElementById("info_window").innerText, I get an error message:
Runtime Error '424' Object Required
I do not understand which object the code is referring to. I have searched the internet for a solution but couldn't find anyone facing a similar problem, so I couldn't find a ready-made solution. The code snippet is below:
Sub GetPlaceName()
Dim btn As Variant
Dim IE As Object, Doc As Object
Dim sDD1 As String, sDD2 As String
Dim vAdd As Variant
Dim lat As Integer, longt As Integer, latrow As Integer, longrow As Integer
latrow = ActiveSheet.Cells(Rows.Count, "B").End(xlUp).Row - 1
longrow = ActiveSheet.Cells(Rows.Count, "C").End(xlUp).Row - 1
If latrow <> longrow Then
MsgBox "The number of Latitudes & Longitudes are not equal. " & vbNewLine & " PleaseCheck...."
Else
Range("B2").Select
For i = 2 To latrow
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.navigate "http://www.gps-coordinates.net/"
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
Set Doc = IE.Document
Doc.getElementById("latitude").Value = Range("B" & i).Cells.Value
Doc.getElementById("longitude").Value = Range("C" & i).Cells.Value
With Doc
Set elems = .getElementsByTagName("button")
For Each e In elems
If e.getAttribute("onclick") = "codeLatLng(1)" Then
e.Click
Exit For
End If
Next e
End With
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
sDD1 = IE.Document.getElementById("info_window").innerText
vAdd = Split(sDD1, vbNewLine)
Cells(i, "E").Value = vAdd
IE.Quit
Set IE = Nothing
Set Doc = Nothing
Next i
End If
End Sub
I believe that there must be a very small error which I am not able to figure out. It must also appear to be very silly to some of you but I really need help to fix this. Hope to receive reply soon... Thanks in advance :-)
Try using
Set IE = New SHDocVw.InternetExplorer
instead of
Set IE = CreateObject("InternetExplorer.Application")
This has never failed to work for me. It requires a reference to Microsoft Internet Controls.
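Another thing that may be worth checking is timing. This is an assumption and not confirmed in this thread. Error 424 on that line can occur when getElementById("info_window") returns Nothing, for example because the info box has not been rendered yet: readyState is probably already complete from the initial page load, so the second wait loop would exit immediately. The sketch below polls for the element with a timeout. WaitForElement is a hypothetical helper name.

```vba
Option Explicit

' Hypothetical helper: polls until an element with the given id exists or
' the timeout expires. Returns Nothing if the element never appears.
Public Function WaitForElement(ByVal Doc As Object, ByVal elementId As String, _
                               Optional ByVal timeoutSeconds As Long = 10) As Object
    Dim deadline As Date, el As Object
    deadline = DateAdd("s", timeoutSeconds, Now)
    Do
        Set el = Doc.getElementById(elementId)
        If Not el Is Nothing Then Exit Do
        DoEvents                          ' let the browser keep working
    Loop While Now < deadline
    Set WaitForElement = el
End Function
```

In the loop from the question this might be used as Set box = WaitForElement(IE.Document, "info_window"), followed by a check such as If Not box Is Nothing Then sDD1 = box.innerText, so that a missing element is skipped instead of raising an error.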

Permission denied when trying to draw data from a table in IE

I have just recently started looking at applications of VBA in Excel for accessing web pages through IE, and have no experience with HTML coding, so the solution to this might be really simple...
I have a section of code (below) that is supposed to navigate to a website, access a table and pull out the data into excel. However, at seemingly random times, for no reason that I can determine, the Object Variable 'TDelement' becomes locked somehow, and Excel throws up an Error 70: Permission Denied when I try to access the next cell through the loop. It doesn't happen all the time, and it doesn't happen on the same table cell.
Dim IE As Object
Dim TDElements As Object
Dim TDelement As Object
Dim Web_Address As String
Dim DteTm As Date
Web_Address = "http://www.bom.gov.au/fwo/IDQ65388/IDQ65388.040762.tbl.shtml"
' Access the Webpage
IE.Navigate Web_Address
' Wait while IE loading...
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
' Find and Set Data Table Cells/object within webpage
Set TDElements = IE.document.GetElementsByTagName("td")
' Pull each TDElement (table cell) from TDElements
Rw = 1
Col = 2
For Each TDelement In TDElements
If Col = 1 Then
Col = 2
ElseIf Col = 2 Then
Col = 1
End If
If Col = 1 Then
DteTm = TDelement.innerText
Worksheets(1).Cells(Rw, Col).Value = DteTm
ElseIf Col = 2 Then
Worksheets(1).Cells(Rw, Col).Value = TDelement.innerText
End If
If Col = 2 Then
Rw = Rw + 1
End If
Next
If the error is going to occur within a cycle of the loop, it occurs on either
DteTm = TDelement.innerText or
Worksheets(1).Cells(Rw, Col).Value = TDelement.innerText,
dependent on the outcome of the If...Then statement, obviously.
After a bit of googling, the general consensus seemed to be that error 70 is related to naming conflicts with variables (i.e. trying to use the same variable name twice). Because of this I tried adding Set TDelement = Nothing before Next to clear the variable at the end of each loop, but it didn't resolve the issue (not all that surprising; I have never had an issue with variables in loops like this before).
Could it have something to do with .innerText? Even though it is mentioned in just about every forum post I have seen about pulling data from IE, it isn't mentioned in the Excel help files at all...
Any help on this would be greatly appreciated.
Try the code below:
Sub sample()
Dim IE As Object
Dim Web_Address As String
Dim tblTR As Object
Dim tblTD As Object
Set IE = CreateObject("internetexplorer.application")
Web_Address = "http://www.bom.gov.au/fwo/IDQ65388/IDQ65388.040762.tbl.shtml"
' Access the Webpage
IE.Navigate Web_Address
IE.Visible = True
Start:
' Wait while IE loading...
Do While IE.Busy
Application.Wait DateAdd("s", 5, Now)
Loop
' Find and Set Data Table Cells/object within webpage
Set tblTR = IE.document.GetElementsByTagName("tr")
If tblTR Is Nothing Then GoTo Start
Dim i As Integer
i = 1
For Each tblTD In tblTR
If Not tblTD Is Nothing Then
Worksheets(1).Cells(i, 1).Value = tblTD.all(0).innerText
Worksheets(1).Cells(i, 2).Value = tblTD.all(1).innerText
End If
i = i + 1
Next
End Sub
