Morningstar expected return - excel

I have tweaked the code below. I use it to scrape other Morningstar data, but I can't get it to work for "expected return" for ETFs (Exchange Traded Funds). The code is set up to get the data I need, but I am having trouble getting it onto the Excel spreadsheet. When I add MsgBox tblTR after the line:
Set tblTR = Doc.getElementsByClassName("pr_text3")(4).innerText
I get the expected value in the message box.
However, when I take the MsgBox line out, the value doesn't appear in the Excel spreadsheet. I have been trying to work it out for hours now and need help!
The tab "Tickers2" holds all the tickers I would like to pull data for (for example JKE, JKF, JKD; about 1000 in total). The tab "ExpectedReturn" is where I want the data displayed. I think the problem has to do with pulling elements by class name, whereas I used to pull them by tag name. There weren't any tag names in the information I needed, so I switched to class name. Below is the entire code.
I will also mention that you have to be signed in to morningstar.com to get the actual data, but I am assuming the forum can point me in the right direction without needing to be signed in.
The website is www.morningstar.com
Sub ExpectedReturn()
Dim IE As Object, Doc As Object, lastRow As Long, tblTR As Object, tblTD As Object, strCode As String
lastRow = Range("A65000").End(xlUp).Row
Set IE = CreateObject("internetexplorer.application")
IE.Visible = True
last_row = Sheets("Tickers2").Range("A1").End(xlDown).Row
ini_row_dest = 1
Sheets("ExpectedReturn").Select
Sheets("ExpectedReturn").Range("A1:H10000").ClearContents
Application.ScreenUpdating = True
For i = 1 To lastRow
Application.StatusBar = "Updating upDown" & i & "/" & last_row
row_dest = ini_row_dest + (i - 1)
strCode = "Tickers2" ' Range("A" & i).value
list_symbol = Sheets("Tickers2").Range("A" & i)
IE.navigate "http://etfs.morningstar.com/quote?t=" & list_symbol
Do While IE.readyState <> 4: DoEvents: Loop
Set Doc = CreateObject("htmlfile")
Set Doc = IE.document
tryAgain:
Set tblTR = Doc.getElementsByClassName("pr_text3")(4).innerText
MsgBox tblTR
If tblTR Is Nothing Then GoTo tryAgain
On Error Resume Next
j = 2
For Each tblTD In tblTR.getElementsByTagName("td")
tdVal = Split(tblTD.innerText, vbCrLf)
Cells(i, j) = tdVal(0)
Cells(i, j + 1) = tdVal(1)
j = j + 2
Next
Sheets("ExpectedReturn").Range("A" & row_dest).Value = list_symbol
Next i
Range("A3").Select
Application.StatusBar = False
Application.Calculation = xlAutomatic
End Sub
Thank you in advance.
-Eddie

With the line
Set tblTR = Doc.getElementsByClassName("pr_text3")(4).innerText
you are asking for a string (.innerText), not an object. You want a DOM element, so remove the .innerText.
Only then can you loop over its TD children further down.
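A minimal sketch of the distinction, runnable without a browser: Set binds an object reference, while a plain assignment stores a value such as a String. The names DescribeValue and DemoSetVersusValue are made up for illustration, and a Collection stands in for the DOM element.

```vba
Option Explicit

' Hypothetical helper: reports whether a Variant holds an object or a plain value.
Public Function DescribeValue(ByVal v As Variant) As String
    If IsObject(v) Then
        DescribeValue = "object: " & TypeName(v)
    Else
        DescribeValue = "value: " & TypeName(v)
    End If
End Function

Public Sub DemoSetVersusValue()
    Dim node As Object        ' stand-in for a DOM element
    Dim cellText As String    ' stand-in for the element's .innerText

    Set node = New Collection ' Set is required for object references
    cellText = "12.5%"        ' plain assignment for strings, no Set

    Debug.Print DescribeValue(node)      ' prints "object: Collection"
    Debug.Print DescribeValue(cellText)  ' prints "value: String"
End Sub
```

In the question's terms: keep Set when tblTR should be the element you loop over, and drop Set (and declare the variable As String) when all you want is the text.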

This was my fix:
tblTR = Doc.getElementsByClassName("pr_text3")(4).innerText
Sheets("ExpectedReturn").Range("B" & row_dest).Value = tblTR

Related

Scrape product specification value [closed]

I am trying to scrape the product specifications from Amazon using VBA.
HTML Page to scrape: https://www.amazon.in/dp/B01FXJI1OY
My two broad requirements are:
1) Break the product title up to get certain specs
2) Get the rest of the specs from the bullet points (BP) on the page
Solution I have thought of (please suggest a better way if you know one):
Use text identifiers (the spec values themselves, or text that comes after the spec value).
My current code fetches the product title. It also fetches the bullet points matching a value stored in cell (2,2). How can I fetch the spec values using the identifiers (there are several for some specs, such as month/year for warranty)?
Sub GetchDetails()
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Application.EnableEvents = False
Dim IE As Object ' InternetExplorer.Application
Dim url As String
Dim sh As Worksheet
Dim rw As Range
ThisWorkbook.Sheets("Crawler").Activate
Set sh = ActiveSheet
Set IE = CreateObject("InternetExplorer.Application")
' IE.Visible = True
url = "https://amazon.in/dp/B01FXJI1OY"
On Error Resume Next
IE.Navigate2 url
Do While IE.Busy = True Or IE.readystate <> 4
DoEvents
Loop
Set HTMLDoc = IE.document
Application.Wait (Now + TimeValue("0:00:01"))
Option Compare Text
Set itm = HTMLDoc.getElementById("productTitle")
Cells(rw.Row, 3).Value = itm.innertext
Set itm = HTMLDoc.getElementsByClassName("a-unordered-list a-vertical a-spacing-none")(0)
i = 0
For Each Item In itm.getElementsByTagName("li")
If LCase(Item.innertext) Like "*" & LCase(Cells(2, 2)) & "*" Then
Cells(rw.Row, 5 + i).Value = Item.innertext
i = i + 1
End If
Next Item
I'm thinking...something like this...to start. Of course, you can modify it to suit your needs.
Sub WebImport()
Dim IE As Object 'the variable used below is IE, not objIE
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
URL = "https://amazon.in/dp/B01FXJI1OY"
'Wait for site to fully load
IE.Navigate2 URL
Do While IE.Busy = True
DoEvents
Loop
RowCount = 1
With Sheets("Sheet1")
.Cells.ClearContents
RowCount = 1
For Each itm In IE.document.all
If itm.classname = "a-unordered-list a-vertical a-spacing-none" Then
.Range("A" & RowCount) = itm.classname
.Range("B" & RowCount) = itm.innerText
RowCount = RowCount + 1
End If
Next itm
End With
End Sub

Web Scraping: Button clicking and help navigating through paths

I am trying to scrape some doctor names and addresses from the website: https://albertafindadoctor.ca/find-a-doc/directory
I am trying to solve the following issue:
Once on the Doctors toggle, I want to pull 4 pieces of data for every listing, not just the first 25 displayed.
The code works for the initial page, but it only pulls the first 25 listings. There are many more pages that I still need to pull (3822 different doctors).
Unfortunately, I'm at a loss on how to navigate and pull from these different pages. When I inspect elements to see how to navigate between pages, I see a matrix changing, so I'm not sure if that has something to do with it.
Option Explicit
Sub GetAlbertaDoctors()
Dim objIE As InternetExplorer
Dim clinicEle As Object
Dim clinicName As String
Dim clinicAddress As String
Dim clinicCategory As String
Dim doctorName As String
Dim y As Integer
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.Navigate "https://albertafindadoctor.ca/find-a-doc/directory"
While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Wend
Application.Wait (Now + TimeValue("0:00:5"))
objIE.Document.getElementsByClassName("physician-toggle")(0).Click
Application.Wait (Now + TimeValue("0:00:5"))
y = 2
For Each clinicEle In objIE.Document.getElementsByClassName("clinic")
clinicCategory = clinicEle.getElementsByClassName("pcn")(0).innerText
clinicName = clinicEle.getElementsByClassName("clinic-name")(0).innerText
doctorName = clinicEle.getElementsByTagName("h3")(0).innerText
clinicAddress = clinicEle.getElementsByClassName("address")(0).innerText
Sheets("Sheet2").Range("A" & y).Value = clinicCategory
Sheets("Sheet2").Range("B" & y).Value = clinicName
Sheets("Sheet2").Range("C" & y).Value = doctorName
Sheets("Sheet2").Range("D" & y).Value = clinicAddress
y = y + 1
Next
objIE.Quit
End Sub
When I run this, I get the error 91 "Object variable or With block variable not set" on the clicking line:
objIE.Document.getElementsByClassName("physician-toggle active")(0).Click
You don't need to loop over all the pages. You can use the browser to get to that page and click on Doctors if required. After that, grab the number of results and then mimic the XHR request the page makes for listings, which is returned as JSON. Alter the query string the page uses, i.e. set the limit parameter to the number of results, to get all listings in one response. Use a JSON parser (I use JsonConverter; installation instructions are in the code) to parse out your info.
There is a proper page load wait and a couple of loops to ensure elements are present. These should really be timed loops.
I add an additional test to ensure you do not attempt to click Doctors when it is not required to do so.
Not all listings have all the info, hence the On Error Resume Next paired with On Error GoTo 0. It looks like you may be able to build a dictionary to fill in some of the blank values based on existing paired values (or using the ids present in the JSON object).
I store all results in an array and write out in one go.
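The "grab the number of results, then raise the limit" step can be tried on its own. This is only a sketch: the helper names TotalFromPaginator and ListingsUrl are hypothetical, and the paginator wording ("1 - 25 of 3822") is an assumption about what the page shows. The URL is the one used in the code below.

```vba
Option Explicit

' Hypothetical helper: pulls the total from paginator text such as "1 - 25 of 3822".
' The exact wording of the paginator is an assumption.
Public Function TotalFromPaginator(ByVal paginatorText As String) As Long
    Dim parts() As String, lastPart As String
    paginatorText = Trim$(paginatorText)
    If Len(paginatorText) = 0 Then Exit Function       ' returns 0 when there is no text yet
    parts = Split(paginatorText, "of ")
    lastPart = Replace(parts(UBound(parts)), ",", "")  ' tolerate "3,822"
    TotalFromPaginator = CLng(Val(lastPart))
End Function

' Builds the listings URL with the limit parameter set to the total.
Public Function ListingsUrl(ByVal total As Long) As String
    ListingsUrl = "https://albertafindadoctor.ca/search/directory/physicians?page=1&limit=" & total & _
                  "&with[]=pcn&with[]=clinics&with[]=languages&with[]=specialties"
End Function
```

Returning 0 for empty text lets the caller keep polling until the paginator has been filled in.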
Option Explicit
'VBE > Tools > References:
' Microsoft Internet Controls
' Microsoft Scripting Runtime
'Download and add to standard module called jsonconverter from https://github.com/VBA-tools/VBA-JSON/blob/master/JsonConverter.bas
Public Sub GetListings()
Dim ie As InternetExplorer, s As String, json As Object, newUrl As String
Set ie = New InternetExplorer
With ie
.Visible = True
.Navigate2 "https://albertafindadoctor.ca/find-a-doc/directory"
While .Busy Or .readyState < 4: DoEvents: Wend
With .document.querySelector("[data-cp-option=physician]")
If Not .className = "physician-toggle active" Then .Click
End With
Dim resultsInfo() As String, numResults As Long, ele As Object
Do
On Error Resume Next
Set ele = .document.querySelector(".paginator")
On Error GoTo 0
Loop While ele Is Nothing
Do
Loop While .document.querySelector(".paginator").innerText = vbNullString
resultsInfo = Split(Trim$(.document.querySelector(".paginator").innerText), "of ")
.Quit
End With
numResults = resultsInfo(UBound(resultsInfo))
newUrl = "https://albertafindadoctor.ca/search/directory/physicians?page=1&limit=" & numResults & "&with[]=pcn&with[]=clinics&with[]=languages&with[]=specialties"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", newUrl, False
.send
Set json = JsonConverter.ParseJson(.responseText)("items")
End With
Dim row As Object, results(), r As Long, headers(), ws As Worksheet, key As Variant
headers = Array("clinicCategory", "clinicName", "doctorName", "clinicAddress")
Set ws = ThisWorkbook.Worksheets("Sheet1")
ReDim results(1 To json.Count, 1 To UBound(headers) + 1)
For Each row In json
r = r + 1
On Error Resume Next
For Each key In row.keys
Select Case key
Case "clinical_name"
results(r, 3) = row(key)
Case "pcn"
results(r, 1) = row(key)("name")
Case "clinics"
results(r, 2) = row(key)(1)("name")
results(r, 4) = Join$(Array(row(key)(1)("street_address"), row(key)(1)("city"), row(key)(1)("province"), row(key)(1)("postal_code")), ", ")
End Select
Next
On Error GoTo 0
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
End Sub
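The answer says its wait loops should really be timed. One possible shape for that is sketched below; SecondsSince and WaitForSelector are hypothetical names, the 10 second default is arbitrary, and doc is assumed to be the .document of a loaded Internet Explorer page.

```vba
Option Explicit

' Seconds elapsed since startTime (a value previously read from Timer).
' Timer resets to 0 at midnight, so add a day when it has wrapped.
Public Function SecondsSince(ByVal startTime As Double) As Double
    Dim elapsed As Double
    elapsed = Timer - startTime
    If elapsed < 0 Then elapsed = elapsed + 86400
    SecondsSince = elapsed
End Function

' Hypothetical helper: polls doc.querySelector(selector) until it returns an
' element or maxSeconds have passed. Returns Nothing on timeout.
Public Function WaitForSelector(ByVal doc As Object, ByVal selector As String, _
                                Optional ByVal maxSeconds As Double = 10) As Object
    Dim startTime As Double, found As Object
    startTime = Timer
    Do
        DoEvents
        On Error Resume Next        ' the document may not be ready for queries yet
        Set found = doc.querySelector(selector)
        On Error GoTo 0
        If Not found Is Nothing Then Exit Do
    Loop While SecondsSince(startTime) < maxSeconds
    Set WaitForSelector = found
End Function
```

Inside the With ie block this could be called as Set ele = WaitForSelector(.document, ".paginator"), followed by a check for Nothing instead of looping forever.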
Reading:
querySelector
json
css selectors
arrays and arrays2

How to scrape data from Bloomberg's website with VBA

Background
Disclaimer: I am a beginner, so please bear with my (most plausibly wrong) code.
I want to update currency pairs' value (PREV CLOSE) with a button-enabled-VBA macro. My Excel worksheet contains FX pairs (e.g. USDGBP) on column G:G which are then used to run a FOR loop for every pair in the column.
The value would then be stored in column I:I
Right now, the problem according to the Debugger lies in one line of code that I will highlight below
Sources
I got some inspiration from https://www.youtube.com/watch?v=JxmRjh-S2Ms&t=1050s - notably 17:34 onwards - but I want my code to work for multiple websites at the press of a button.
I have tried the following code
Public Sub Auto_FX_update_BMG()
Application.ScreenUpdating = False 'My computer is not very fast, thus I use this line of
'code to save some computing power and time
Dim internet_object As InternetExplorer
Dim i As Integer
For i = 3 To Sheets(1).Cells(3, 7).End(xlDown).Row
FX_Pair = Sheets(1).Cells(i, 7)
Set internet_object = New InternetExplorer
internet_object.Visible = True
internet_object.navigate "https://www.bloomberg.com/quote/" & FX_Pair & ":CUR"
Application.Wait Now + TimeValue("00:00:05")
internet_object.document.getElementsByClassName("class")(0).getElementsByTagName ("value__b93f12ea") '--> DEBUGGER PROBLEM
'My goal here is to "grab" the PREV CLOSE
'value from the website
With ActiveSheet
.Range(Cells(i, 9)).Value = HTML_element.Children(0).textContent
End With
Sheets(1).Range(Cells(i, 9)).Copy 'Not sure if these 2 lines are unnecesary
ActiveSheet.Paste
Next i
Application.ScreenUpdating = True
End Sub
Expected Result
WHEN I enter "USDGBP" on a cell on column G:G, the macro would go to https://www.bloomberg.com/quote/EURGBP:CUR and "grab" the PREV CLOSE value of 0.8732 (using today's value) and insert it in the respective row of column I:I
As of now, I am just facing the debugger without much idea on how to solve the problem.
You can use class selectors in a loop. The pattern
.previousclosingpriceonetradingdayago .value__b93f12ea
specifies to get child elements with class value__b93f12ea having parent with class previousclosingpriceonetradingdayago. The "." in front is a css class selector and is a faster way of selecting as modern browsers are optimized for css. The space between the two classes is a descendant combinator. querySelector returns the first match for this pattern from the webpage html document.
You can see the parent child relationship and classes again here:
<section class="dataBox previousclosingpriceonetradingdayago numeric">
<header class="title__49417cb9"><span>Prev Close</span></header>
<div class="value__b93f12ea">0.8732</div>
</section>
N.B. If you are a Bloomberg customer look into their APIs. Additionally, it is very likely you can get this same info from other dedicated APIs which will allow for much faster and more reliable xhr requests.
VBA (Internet Explorer):
Option Explicit
Public Sub test()
Dim pairs(), ws As Worksheet, i As Long, ie As Object
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set ie = CreateObject("InternetExplorer.Application")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With ie
.Visible = True
For i = LBound(pairs) To UBound(pairs)
.Navigate2 "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
While .Busy Or .readyState < 4: DoEvents: Wend
results(i) = .document.querySelector(".previousclosingpriceonetradingdayago .value__b93f12ea").innerText
Next
.Quit
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
For a very limited number of requests (more will lead to blocking) you could use an XHR request and regex out the value. I assume the pairs are in Sheet1 and start from G2. I also assume there are no empty cells or invalid pairs in column G up to and including the last pair to search for. Otherwise, you will need to extend the code to handle this.
Option Explicit
Public Sub test()
Dim re As Object, pairs(), ws As Worksheet, i As Long, s As String
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set re = CreateObject("VBScript.RegExp")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With CreateObject("MSXML2.XMLHTTP")
For i = LBound(pairs) To UBound(pairs)
.Open "GET", "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
.send
s = .responseText
results(i) = GetCloseValue(re, s, "previousClosingPriceOneTradingDayAgo%22%3A(.*?)%2")
Next
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
Public Function GetCloseValue(ByVal re As Object, inputString As String, ByVal pattern As String) As String 'https://regex101.com/r/OAyq30/1
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = pattern
If .test(inputString) Then
GetCloseValue = .Execute(inputString)(0).SubMatches(0)
Else
GetCloseValue = "Not found"
End If
End With
End Function
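To see what the pattern captures, here is a small self-contained check against a made-up fragment. The fragment is an assumption about what the URL-encoded JSON in the response looks like; the real page may differ or change. FirstGroup is a hypothetical helper that mirrors GetCloseValue but creates its own RegExp object.

```vba
Option Explicit

' Returns the first capture group of the first match, or "Not found".
Public Function FirstGroup(ByVal inputString As String, ByVal pattern As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    re.IgnoreCase = False
    re.Pattern = pattern
    If re.Test(inputString) Then
        FirstGroup = re.Execute(inputString)(0).SubMatches(0)
    Else
        FirstGroup = "Not found"
    End If
End Function

Public Sub DemoFirstGroup()
    ' Made-up fragment: %22 is a double quote, %3A a colon, %2C a comma.
    Const SAMPLE As String = "%22previousClosingPriceOneTradingDayAgo%22%3A0.8732%2C%22openPrice%22"
    ' The lazy group (.*?) stops at the first "%2" after the colon.
    Debug.Print FirstGroup(SAMPLE, "previousClosingPriceOneTradingDayAgo%22%3A(.*?)%2")  ' prints 0.8732
End Sub
```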
Try the code below.
Before running it, add two references via Tools > References: Microsoft HTML Object Library and Microsoft Internet Controls.
This code works with your example.
Sub getPrevCloseValue()
Dim ie As Object
Dim mySh As Worksheet
Set mySh = ThisWorkbook.Sheets("Sheet1")
Dim colG_Value As String
Dim prev_value As String
For a = 3 To mySh.Range("G" & Rows.Count).End(xlUp).Row
colG_Value = mySh.Range("G" & a).Value
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate "https://www.bloomberg.com/quote/" & colG_Value & ":CUR"
Do While ie.Busy: DoEvents: Loop
Do Until ie.readyState = 4: DoEvents: Loop
'Application.Wait (Now + TimeValue("00:00:03")) 'activate if having problem with delay
For Each sect In ie.document.getElementsByTagName("section")
If sect.className = "dataBox previousclosingpriceonetradingdayago numeric" Then
prev_value = sect.getElementsByTagName("div")(0).innerText
mySh.Range("I" & a).Value = prev_value
Exit For
End If
Next sect
ie.Quit 'close this IE instance before opening the next one
Next a
End Sub
I have a video tutorial for basic web automation using vba which include web data scraping and other commands, please check the link below:
https://www.youtube.com/watch?v=jejwXID4OH4&t=700s

web scraping using excel and VBA

I wrote the VBA code below in an Excel sheet, but it does not scrape the data and I don't know why. The only result it gives me is "click here to read more". I want to scrape the entire data, such as first name, last name, state, zip code and so on. Can anyone help?
Sub extractTablesData()
Dim IE As Object, obj As Object
Dim myState As String
Dim r As Integer, c As Integer, t As Integer
Dim elemCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
myState = InputBox("Enter the city where you wish to work")
With IE
.Visible = True
.navigate ("http://www.funeralhomes.com/go/listing/Search? name=&city=&state=&country=USA&zip=&radius=")
While IE.readyState <> 4
DoEvents
Wend
For Each obj In IE.document.all.item("state").Options
If obj.innerText = myState Then
obj.Selected = True
End If
Next obj
IE.document.getElementsByValue("Search").item.Click
Do While IE.Busy: DoEvents: Loop
ThisWorkbook.Sheets("Sheet1").Range("A1:K1500").ClearContents
Set elemCollection = IE.document.getElementsByTagName("TABLE")
For t = 0 To (elemCollection.Length - 1)
For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
End With
Set IE = Nothing
End Sub
Using the same URL as the answer already given you could alternatively select with CSS selectors to get the elements of interest, and use split to get just the names and address parts from the text. We can also do away with the browser altogether to get faster results from first results page.
Business name:
You can get the name with the following selector (using paid listing example):
div.paid-listing .listing-title
Address info:
The associated descriptive information can be retrieved with the selector:
div.paid-listing .address-summary
And then using split we can parse this into just the address information.
Code:
Option Explicit
Public Sub GetTitleAndAddress()
Dim oHtml As HTMLDocument, nodeList1 As Object, nodeList2 As Object, i As Long
Const URL As String = "http://www.funeralhomes.com/go/listing/ShowListing/USA/New%20York/New%20York"
Set oHtml = New HTMLDocument
With CreateObject("WINHTTP.WinHTTPRequest.5.1")
.Open "GET", URL, False
.send
oHtml.body.innerHTML = .responseText
End With
Set nodeList1 = oHtml.querySelectorAll("div.paid-listing .listing-title")
Set nodeList2 = oHtml.querySelectorAll("div.paid-listing .address-summary")
With Worksheets("Sheet3")
.UsedRange.ClearContents
For i = 0 To nodeList1.Length - 1
.Range("A" & i + 1) = nodeList1.Item(i).innerText
.Range("B" & i + 1) = Split(nodeList2.Item(i).innerText, Chr$(10))(0)
Next i
End With
End Sub
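The Split step can be tried in isolation. A minimal sketch, assuming the address is the first line of the element's text (as in the code above); FirstLine is a hypothetical helper name and the sample text in the usage note is made up.

```vba
Option Explicit

' Returns the text before the first line feed, or the whole text if there is none.
' A trailing carriage return (from a CR+LF line break) is stripped.
Public Function FirstLine(ByVal blockText As String) As String
    If Len(blockText) = 0 Then Exit Function   ' Split("") yields an empty array
    FirstLine = Replace(Split(blockText, vbLf)(0), vbCr, "")
End Function
```

For example, FirstLine("12 Main St, New York" & vbLf & "Click here to read more") returns "12 Main St, New York". The empty-string guard matters because indexing element 0 of an empty array raises a subscript error.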
Yeah, without an API, this can be very tricky at best, and very inconsistent at worst. For now, you can try the script below.
Sub DumpData()
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
URL = "http://www.funeralhomes.com/go/listing/ShowListing/USA/New%20York/New%20York"
'Wait for site to fully load
IE.Navigate2 URL
Do While IE.Busy = True
DoEvents
Loop
RowCount = 1
With Sheets("Sheet1")
.Cells.ClearContents
RowCount = 1
For Each itm In IE.document.all
If itm.classname Like "*free-listing*" Or itm.classname Like "*paid-listing*" Then
.Range("A" & RowCount) = itm.classname
.Range("B" & RowCount) = Left(itm.innertext, 1024)
RowCount = RowCount + 1
End If
Next itm
End With
End Sub
You probably want some kind of input box to capture the city, state and radius from the user, or capture those variables in cells in your worksheet.
Note that '%20' is a URL-encoded space character.
I got this idea from a friend of mine, Joel, a long time ago. That guy is great!
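Building the URL from user input could look roughly like this. It is a sketch under stated assumptions: ListingUrl and EncodeSpaces are hypothetical names, and the /USA/state/city order is only read off the example URL above (where both parts happen to be "New York"), so it may be the other way round.

```vba
Option Explicit

' Hypothetical helper: builds a listing URL from a state and a city.
' Assumption: the path order is /USA/<state>/<city>.
Public Function ListingUrl(ByVal stateName As String, ByVal cityName As String) As String
    Const BASE_URL As String = "http://www.funeralhomes.com/go/listing/ShowListing/USA/"
    ListingUrl = BASE_URL & EncodeSpaces(stateName) & "/" & EncodeSpaces(cityName)
End Function

' Replaces spaces with %20; other reserved characters are left untouched.
Private Function EncodeSpaces(ByVal s As String) As String
    EncodeSpaces = Replace(Trim$(s), " ", "%20")
End Function
```

In DumpData the hard-coded URL could then become URL = ListingUrl(InputBox("State"), InputBox("City")).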

Error Message as Object Required while using IE.Document.getElementById("info_window").innerText

I am very new to VBA programming and have been trying to capture the text that pops up on the Google map when I search using GPS coordinates (latitude and longitude). I have been using the website www.gps-coordinates.net.
So far I have been able to pass the GPS coordinates to the website's input boxes and click the button, and the address pops up on the Google map in a box. When I try to get the address from the box using IE.Document.getElementById("info_window").innerText, I get an error message:
Runtime Error '424' Object Required
I don't understand which object the error is referring to. I have searched the internet but couldn't find anyone facing a similar problem, so I couldn't find a ready solution. The code snippet is below:
Sub GetPlaceName()
Dim btn As Variant
Dim IE As Object, Doc As Object
Dim sDD1 As String, sDD2 As String
Dim vAdd As Variant
Dim lat As Integer, longt As Integer, latrow As Integer, longrow As Integer
latrow = ActiveSheet.Cells(Rows.Count, "B").End(xlUp).Row - 1
longrow = ActiveSheet.Cells(Rows.Count, "C").End(xlUp).Row - 1
If latrow <> longrow Then
MsgBox "The number of Latitudes & Longitudes are not equal. " & vbNewLine & " PleaseCheck...."
Else
Range("B2").Select
For i = 2 To latrow
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.navigate "http://www.gps-coordinates.net/"
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
Set Doc = IE.Document
Doc.getElementById("latitude").Value = Range("B" & i).Cells.Value
Doc.getElementById("longitude").Value = Range("C" & i).Cells.Value
With Doc
Set elems = .getElementsByTagName("button")
For Each e In elems
If e.getAttribute("onclick") = "codeLatLng(1)" Then
e.Click
Exit For
End If
Next e
End With
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
sDD1 = IE.Document.getElementById("info_window").innerText
vAdd = Split(sDD1, vbNewLine)
Cells(i, "E").Value = vAdd
IE.Quit
Set IE = Nothing
Set Doc = Nothing
Next i
End If
End Sub
I believe that there must be a very small error which I am not able to figure out. It must also appear to be very silly to some of you but I really need help to fix this. Hope to receive reply soon... Thanks in advance :-)
Try using
Set IE = new SHDocVw.InternetExplorer
instead of
Set IE = CreateObject("InternetExplorer.Application")
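This has never failed to work for me. (Early binding like this needs a reference to Microsoft Internet Controls, set under Tools > References.)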
