Excel Macro To Pull Google Image Links - excel

The goal is to get images from Google Images that match the part numbers in my database. My code runs, and it pulls up the correct Google pages but refuses to put the links into the spreadsheet. I have tried everything I can think of, but as of now, I keep on getting Error 1004 (Application-defined or Object-defined error).`
Sub SearchBotGoogleImgLink()
Dim objIE As Object
Set IE = CreateObject("InternetExplorer.Application")
Dim HTMLdoc As HTMLDocument
Dim imgElements As IHTMLElementCollection
Dim imgElement As HTMLImg
Dim aElement As HTMLAnchorElement
Dim n As Integer
Dim i As Integer
Dim url As String
Dim url2 As String
Dim m As Long
Dim lastRow As Long
Dim url3 As String
Dim SearchRow As Long
Dim aEle As HTMLLinkElement
Worksheets("Sheet1").Select
SearchRow = 1
Do Until IsEmpty(ActiveSheet.Cells(SearchRow, 1))
Sheets("Sheet1").Select
Application.StatusBar = SearchRow - 1 & " of " & "4368" & " Items Done"
Item = Trim(ActiveSheet.Cells(SearchRow, 1))
url = "https://www.google.com/search?hl=en&biw=1600&bih=796&tbm=isch&sa=1&ei=CTOpW_2jO6nAjwT67rqACw&q=A2N0015C3KUU&oq=" & Cells(SearchRow, 1) & "&oq=A2N0015C3KUU&gs_l=img.12...0.0..1704...0.0..0.0.0.......1......gws-wiz-img.9wB6WwQJhwA"
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate url
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
For Each aEle In objIE.document.getElementsByTagName("IMG")
result = aEle
Sheets("Sheet1").Range(SearchRow & "C").Value = result
Sheets("Sheet1").Range(SearchRow & "D") = aEle.innerHTML
Sheets("Sheet1").Range(SearchRow & "F").Value = aEle.innerText
Debug.Print aEle.innerText
Next
Loop
'For i = 1 To lastRow
'url = "https://www.google.com/search?hl=en&biw=1600&bih=796&tbm=isch&sa=1&ei=CTOpW_2jO6nAjwT67rqACw&q=A2N0015C3KUU&oq=" & Cells(SearchRow, 1) & "&oq=A2N0015C3KUU&gs_l=img.12...0.0..1704...0.0..0.0.0.......1......gws-wiz-img.9wB6WwQJhwA"
Set HTMLdoc = objIE.document
Set imgElements = HTMLdoc.getElementsByTagName("IMG")
n = 1
For Each imgElement In imgElements
If InStr(ingElement.src, sImageSearchString) Then
If imgElement.ParentNode.nodeName = "A" Then
Set aElement = imgElement.ParentNode
If n = 2 Then
url2 = aElement.href 'imgElement.src
url3 = imgElement.src 'aElement.href
n = n + 1
End If
End If
End If
Next
Cells(SearchRow, 5) = url2
IE.Quit
Set IE = Nothing
End Sub

Notes on your code:
You need Option Explicit at the top of your code to check on variable declarations and typos amongst other advantages. There are a number of missing declarations e.g. result, and used ones later e.g. Set IE = CreateObject("InternetExplorer.Application"). You have two different variables (one late bound and one early) both creating IE instances. You only in fact use one.
Your current error may be down to you trying to work with an object here:
result = aEle which won't work without the Set keyword to provide the required reference.
Without example URLs and expected output it is difficult to advise on the later loops in your code. You appear to have a duplicate loop over IMG elements but this time with some restrictions. It is likely these loops can be merged.
An example:
The following uses an arbitrary concatenation in to pull the img src links in from search results based on A2N0015C3KUU.
It uses a CSS selector combination of #ires img[src] to target elements with img tags and src attributes within the parent element with id ires (search results).
It is to demonstrate the principle of gathering aNodeList of matching elements and writing out to a sheet. The querySelectorAll method applied the CSS selector combination to the HTMLDocument and returns the nodeList. The nodeList is looped along its .Length, with items accessed by index starting at 0.
Option Explicit
Public Sub GetInfo()
Dim IE As New InternetExplorer
With IE
.Visible = True
.navigate "https://www.google.com/search?hl=en&biw=1600&bih=796&tbm=isch&sa=1&ei=CTOpW_2jO6nAjwT67rqACw&q=A2N0015C3KUU&oq=1&%20%22&oq=A2N0015C3KUU&gs_l=img.12...0.0..1704...0.0..0.0.0.......1......gws-wiz-img.9wB6WwQJhwA"
While .Busy Or .readyState < 4: DoEvents: Wend
Dim aNodeList As Object, i As Long
Set aNodeList = IE.document.querySelectorAll("#ires img[src]")
For i = 0 To aNodeList.Length - 1
ActiveSheet.Cells(i + 2, 4) = aNodeList.item(i).src
Next
'Quit '<== Remember to quit application
End With
End Sub

Related

When the search button is clicked using vba the text entered in search box is not seen by web page

I have written vba code for entering manufacturer part number in search box of below website and clicking on search icon. It is able enter manufacturer part number in search box and click on search icon, but when "search icon is clicked the text entered in the text box is not picked up". It searches empty data.
'HTML Part for search icon
<em class="fa fa-search" aria-hidden="true" style="color: gray;"></em>
It being almost a month I have tried various different way which was also mentioned on stack overflow, like using "createEvent("keyboardevent")" but nothing worked.
' VBA code
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "https://ecat.aptiv.com"
Do While IE.readyState < READYSTATE_COMPLETE
Loop
Dim idoc As MSHTML.HTMLDocument
Set idoc = IE.document
idoc.getElementById("searchUserInput").Value = "33188785"
Dim doc_ele As MSHTML.IHTMLElement
Dim doc_eles As MSHTML.IHTMLElementCollection
Set doc_eles = idoc.getElementsByTagName("a")
For Each doc_ele In doc_eles
If doc_ele.getAttribute("ng-click") = "SearchButtonClick(1)" Then
doc_ele.Click
Exit Sub
Else
End If
Next doc_ele
End Sub
The page does an xhr request to retrieve the search results. You can find it in the network tab after clicking submit. This means you can avoid, in this case, the expense of a browser and issue an xhr request. The response is json so you do need a json parser to handle the results.
I would use jsonconverter.bas to parse the json. After installing the code from that link in a standard module called JsonConverter, go to VBE > Tools > References > Add a reference to Microsoft Scripting Runtime
I dimension an array to hold the results. I determine rows from the number of items in the json collection returned and the number of columns from the size of the first item dictionary. I loop the json object, and inner loop the dictionary keys of each dictionary in collection, and populate the array. I write the array out in one go at end which is less i/o expensive.
Option Explicit
Public Sub GetInfo()
Dim json As Object, ws As Worksheet, headers()
Dim item As Object, key As Variant, results(), r As Long, c As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://ecat.aptiv.com/json/eCatalogSearch/SearchProducts?filter=All&options=&pageSize=10&search=33188785", False
.send
Set json = JsonConverter.ParseJson(.responseText)("Products")
End With
headers = json.item(1).keys
ReDim results(1 To json.Count, 1 To UBound(headers) + 1)
For Each item In json
r = r + 1: c = 1
For Each key In item.keys
results(r, c) = item(key)
c = c + 1
Next
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
You can do this instead:
txt = "33188785"
IE.navigate "https://ecat.aptiv.com/feature?search=" & txt
This will take you straight to the Search Result.
Code:
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Dim txt As String
Set IE = New InternetExplorer
txt = "33188785"
IE.Visible = True
IE.navigate "https://ecat.aptiv.com/feature?search=" & txt
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
End Sub
This will be faster as You will only have to load one page.
Why that's happening, i am not sure, but seems like the TextBox that is used to input text is not being Activated when adding text automatically to it. It is being activated when we click inside it.
I got the solution for above problem from Mrxel.com below is the link for that post.
https://www.mrexcel.com/forum/excel-questions/1105434-vba-ie-automation-issue-angularjs-input-text-post5317832.html#post5317832
In this case I need to enter the search string character by character and sendKeys and input events inside the loop. Below is the working vba code.
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "https://ecat.aptiv.com"
Do While IE.readyState < READYSTATE_COMPLETE
Loop
Dim idoc As MSHTML.HTMLDocument
Set idoc = IE.document
IE.document.getElementById("searchUserInput").Focus = True
IE.document.getElementById("searchUserInput").Select
sFieldInput = "33188785"
For s = 1 To Len(sFieldInput)
Application.SendKeys Mid(sFieldInput, s, 1)
While IE.readyState < 4 Or IE.Busy
Application.Wait DateAdd("s", LoopSeconds, Now)
Wend
Next s
IE.document.getElementById("searchUserInput").Focus = False
Dim doc_ele As MSHTML.IHTMLElement
Dim doc_eles As MSHTML.IHTMLElementCollection
Set doc_eles = idoc.getElementsByTagName("a")
For Each doc_ele In doc_eles
If doc_ele.getAttribute("ng-click") = "SearchButtonClick(1)" Then
doc_ele.Click
Exit Sub
Else
End If
Next doc_ele
End Sub

Web Scraping: Button clicking and help navigating through paths

I am trying to scrape some doctor names and addresses from the website: https://albertafindadoctor.ca/find-a-doc/directory
I am trying to solve the following issue:
Once on the doctor's toggle, I want to pull 4 pieces of data from the entire page, not just the first 25 displayed.
While the code works for the initial webpage, it only pulls the first 25 pieces of data. There are a significant number of other pages that I still need to pull (3822 different doctors).
Unfortunately, I'm at a loss on how to navigate and pull from these different pages. When I inspect elements to see how to navigate between pages a see matrix changing so I'm not sure if that has something to do with it?
Option Explicit
Sub GetAlbertaDoctors()
Dim objIE As InternetExplorer
Dim clinicEle As Object
Dim clinicName As String
Dim clinicAddress As String
Dim clinicCategory As String
Dim doctorName As String
Dim y As Integer
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.Navigate "https://albertafindadoctor.ca/find-a-doc/directory"
While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Wend
Application.Wait (Now + TimeValue("0:00:5"))
objIE.Document.getElementsByClassName("physician-toggle")(0).Click
Application.Wait (Now + TimeValue("0:00:5"))
y = 2
For Each clinicEle In objIE.Document.getElementsByClassName("clinic")
clinicCategory = clinicEle.getElementsByClassName("pcn")(0).innerText
clinicName = clinicEle.getElementsByClassName("clinic-name")(0).innerText
doctorName = clinicEle.getElementsByTagName("h3")(0).innerText
clinicAddress = clinicEle.getElementsByClassName("address")(0).innerText
Sheets("Sheet2").Range("A" & y).Value = clinicCategory
Sheets("Sheet2").Range("B" & y).Value = clinicName
Sheets("Sheet2").Range("C" & y).Value = doctorName
Sheets("Sheet2").Range("D" & y).Value = clinicAddress
y = y + 1
Next
objIE.Quit
End Sub
When I run this, I get the error 91 "Object variable or With block variable not set" on the clicking line:
objIE.Document.getElementsByClassName("physician-toggle active")(0).Click
You don't need to loop all pages. You can use the browser to get to that page and click on Doctors if required. After that, grab the number of results and then mimic the xhr request the page makes for listings - which is returned as json. Alter the query string the page makes i.e. the parameter for limit to get all listings. Use a json parser (I use jsonconverter - instructions in the code for installation) to parse out your info.
There is a proper page load wait and a couple of loops to ensure elements are present. These should really be timed loops. See loop format here.
I add an additional test to ensure you do not attempt to click Doctors when it is not required to do so.
Not all listings has all info hence the On Error Resume Next paired with On Error GoTo 0. Looks like you may be able to build a dictionary to fill in some of the blank values based on existing paired values (or using ids present in json object).
I store all results in an array and write out in one go.
Option Explicit
'VBE > Tools > References:
' Microsoft Internet Controls
' Microsoft Scripting Runtime
'Download and add to standard module called jsonconverter from https://github.com/VBA-tools/VBA-JSON/blob/master/JsonConverter.bas
Public Sub GetListings()
Dim ie As InternetExplorer, s As String, json As Object, newUrl As String
Set ie = New InternetExplorer
With ie
.Visible = True
.Navigate2 "https://albertafindadoctor.ca/find-a-doc/directory"
While .Busy Or .readyState < 4: DoEvents: Wend
With .document.querySelector("[data-cp-option=physician]")
If Not .className = "physician-toggle active" Then .Click
End With
Dim resultsInfo() As String, numResults As Long, ele As Object
Do
On Error Resume Next
Set ele = .document.querySelector(".paginator")
On Error GoTo 0
Loop While ele Is Nothing
Do
Loop While .document.querySelector(".paginator").innerText = vbNullString
resultsInfo = Split(Trim$(.document.querySelector(".paginator").innerText), "of ")
.Quit
End With
numResults = resultsInfo(UBound(resultsInfo))
newUrl = "https://albertafindadoctor.ca/search/directory/physicians?page=1&limit=" & numResults & "&with[]=pcn&with[]=clinics&with[]=languages&with[]=specialties"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", newUrl, False
.send
Set json = JsonConverter.ParseJson(.responseText)("items")
End With
Dim row As Object, results(), r As Long, headers(), ws As Worksheet, key As Variant
headers = Array("clinicCategory", "clinicName", "doctorName", "clinicAddress")
Set ws = ThisWorkbook.Worksheets("Sheet1")
ReDim results(1 To json.Count, 1 To UBound(headers) + 1)
For Each row In json
r = r + 1
On Error Resume Next
For Each key In row.keys
Select Case key
Case "clinical_name"
results(r, 3) = row(key)
Case "pcn"
results(r, 1) = row(key)("name")
Case "clinics"
results(r, 2) = row(key)(1)("name")
results(r, 4) = Join$(Array(row(key)(1)("street_address"), row(key)(1)("city"), row(key)(1)("province"), row(key)(1)("postal_code")), ", ")
End Select
Next
On Error GoTo 0
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
End Sub
Sample output:
Reading:
querySelector
json
css selectors
arrays and arrays2

eBay Product scraper

I am very limited on VBA,
The Code is in a Module, the code has a sub process as well, so sorry if I post the code wrong
A) open IE
B) Subprocess gets the data.
The code works fine on ebay.com but NOT for ebay.co.uk - can't work out why, also it converts urls to hyperlinks
It only does the first page, I need it to go through an X amount of pages - have a code but can't get it to work so have removed it.
Can the search query be run AFTER Ebay opens, so it opens, then search item is input to ebay and then code runs, or to run from a cell, IF its Cell A1 the data extracted needs to be pasted in A2 and below.
I have looked at elements for ebay.com and ebay.co.uk and they look the same to me, so can't work out why its not working as it works for 1 and not the other.
I did input the code for getting data from several pages it did not work. I know this code works as I have it for when I fetch urls from google
Public IE As New SHDocVw.InternetExplorer
Sub GetData()
Dim HTMLdoc As MSHTml.HTMLDocument
Dim othwb As Variant
Dim objShellWindows As New SHDocVw.ShellWindows
Set IE = CreateObject("internetexplorer.application")
With IE
.Visible = True
'.Navigate "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=jackets&_sacat=0"
.Navigate "https://www.ebay.com/sch/i.html_from=R40&_nkw=ralph+lauren&_sacat=1059&LH_TitleDesc=0&_dmd=1&rt=nc"
While .Busy Or .readyState <> 4: DoEvents: Wend
Set HTMLdoc = IE.document
ProcessHTMLPage HTMLdoc
.Quit
End With
End Sub
code here
enter
'''''' THIS IS THE SUB PROCESS '''''
Sub ProcessHTMLPage(HTMLPage As MSHTml.HTMLDocument)
Dim HTMLItem As MSHTml.IHTMLElement
Dim HTMLItems As MSHTml.IHTMLElementCollection
Dim HTMLInput As MSHTml.IHTMLElement
Dim rownum As Long
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__title")
For Each HTMLItem In HTMLItems
Cells(rownum, 1).Value = HTMLItem.innerText
rownum = rownum + 1
Next HTMLItem
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__price")
For Each HTMLItem In HTMLItems
Cells(rownum, 2).Value = HTMLItem.innerText
rownum = rownum + 1
Next HTMLItem
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__link")
For Each HTMLItem In HTMLItems
Cells(rownum, 3).Value = HTMLItem.href
rownum = rownum + 1
Next HTMLItem
'Converts each text hyperlink selected into a working hyperlink from C1 to 25000 rows
Range("C1:C25000").Select
For Each xCell In Selection
ActiveSheet.Hyperlinks.Add Anchor:=xCell, Address:=xCell.Formula
Next xCell
Range("C1").Select
End Sub
Code for going to next page
pageNumber = 1
'i = 2
If pageNumber >= 6 Then Exit Do 'the first 6 pages
internetdata.getElementById("pnnext").click 'next web page
Do While internet.Busy Or internet.readyState <> 4
DoEvents
Loop
Set internetdata = internet.document
pageNumber = pageNumber + 1
Loop
Does not work on Ebay.co.uk - NO RESULTS ARE EXTRACTED - Works fine in ebay.com
Need it to get data from X amount of pages and NOT just 1 page
Can the search query be run AFTER Ebay opens, so it opens, then search item is input to ebay and then code runs, or to run from a cell, IF its Cell A1 the data extracted needs to be pasted in A2 and below.
This is my code for google search, I have got it working so the search comes from cell A1, I am look for something like this, I am going to see if I can use the ebay code with this. As this also does the first 25 pages in google search
enter Sub webpage()
Dim ie As Object
Dim htmlDoc As Object
Dim nextPageElement As Object
Dim div As Object
Dim link As Object
Dim url As String
Dim pageNumber As Long
Dim i As Long
' Takes seach from A1 and places it into google
url = "https://www.google.co.uk/search?q=" & Replace(Worksheets("Sheet1").Range("A1").Value, " ", "+")
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True
.navigate url
Do While .Busy Or .readyState <> 4
DoEvents
Loop
End With
Application.Wait Now + TimeSerial(0, 0, 5)
Set htmlDoc = ie.document
pageNumber = 1
i = 2
Do
For Each div In htmlDoc.getElementsByTagName("div")
If div.getAttribute("class") = "r" Then
Set link = div.getElementsByTagName("a")(0)
Cells(i, 2).Value = link.getAttribute("href")
i = i + 1
End If
Next div
If pageNumber >= 25 Then Exit Do 'the first 25 pages
Set nextPageElement = htmlDoc.getElementById("pnnext")
If nextPageElement Is Nothing Then Exit Do
' Clicks web next page
nextPageElement.Click 'next web page
Do While ie.Busy Or ie.readyState <> 4
DoEvents
Loop
Application.Wait Now + TimeSerial(0, 0, 5)
Set htmlDoc = ie.document
pageNumber = pageNumber + 1
Loop
MsgBox "All Done"
Set ie = Nothing
Set htmlDoc = Nothing
Set nextPageElement = Nothing
Set div = Nothing
Set link = Nothing
End Sub
code here
Question 1: Why does it work for one domain but not the other?
To answer question 1 (the other questions should be new posts) - the html is not the same at all. The classes which work for ebay.com are not found in ebay.co.uk; So, your loop over collections doesn't do anything because they are count 0 (or length 0 with nodeLists if using querySelectorAll). Instead, you need branched code. Set your selectors based on the url domain.
I have used css selectors as this is the easiest, and fastest way, to select the required elements whilst maintaining the flexibility of a code re-factor to reduce the lines of repeated code.
Side note:
If you are unsure about whether your selection method will work across different pages you can do at least two things:
Right click > inspect element > visually check the class names are the same for the elements you are attempting to compare. So, if you are looking at product names, are the class names in the html the same on both pages?
You can use the search facility of the browser > open element tab via F12 then press Ctrl+F to pull up search box > enter your class name from the first page into this box in the second page and hit enter. You can also enter css selectors here and some cases regex. You will get a hit count telling you how many matches found. You can keep pressing enter to cycle through matches and each match will be highlighted in the html above, so you can easily compare if matched results are what you expected.
click image to enlarge
img url: https://i.stack.imgur.com/MWkEx.png
VBA:
Option Explicit
Public Sub GetData()
Dim htmlDoc As MSHTML.HTMLDocument, ie As SHDocVw.InternetExplorer, ws As Worksheet
Set ie = New SHDocVw.InternetExplorer
Set htmlDoc = New MSHTML.HTMLDocument
Set ws = ThisWorkbook.Worksheets("Sheet1")
With ie
.Visible = True
'.Navigate2 "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=jackets&_sacat=0"
.Navigate2 "https://www.ebay.com/sch/i.html_from=R40&_nkw=ralph+lauren&_sacat=1059&LH_TitleDesc=0&_dmd=1&rt=nc"
While .Busy Or .readyState <> 4: DoEvents: Wend
Dim index As Long, HTMLItems As Object, rowNum As Long, xCell As Range
Dim cssSelectors(), i As Long
Select Case True
Case InStr(.document.URL, "ebay.co.uk") > 0
cssSelectors = Array(".gvtitle a", ".amt", ".gvtitle a")
Case InStr(.document.URL, "ebay.com") > 0
cssSelectors = Array(".s-item__title", ".s-item__price", ".s-item__link")
End Select
With ws
For i = LBound(cssSelectors) To UBound(cssSelectors)
rowNum = 1
Set HTMLItems = ie.document.querySelectorAll(cssSelectors(i))
For index = 0 To HTMLItems.length - 1
.Cells(rowNum, i + 1).Value = IIf(i = 2, HTMLItems.item(index).getAttribute("href"), HTMLItems.item(index).innerText)
rowNum = rowNum + 1
Next
Next
For Each xCell In .Range("C1:C25000") '<= all these really?
.Hyperlinks.Add Anchor:=xCell, Address:=xCell.Formula
Next xCell
End With
.Quit
End With
End Sub
If this works on eBay then you need to find out yourself why it's not working on ebay.co.uk. My point is if the code itself works than there is nothing we can help you with here. You need to take some time to investigate ebay.co.uk and find the differences as I am sure it's something minor. I can't help u fix code that isn't actually broken. I wish you luck though.

How to scrape data from Bloomberg's website with VBA

Background
Disclaimer: I am a beginner, please bare with my - most plausibly wrong - code.
I want to update currency pairs' value (PREV CLOSE) with a button-enabled-VBA macro. My Excel worksheet contains FX pairs (e.g. USDGBP) on column G:G which are then used to run a FOR loop for every pair in the column.
The value would then be stored in column I:I
Right now, the problem according to the Debugger lies in one line of code that I will highlight below
Sources
I got some inspiration from https://www.youtube.com/watch?v=JxmRjh-S2Ms&t=1050s - notably 17:34 onwards - but I want my code to work for multiple websites at the press of a button.
I have tried the following code
Public Sub Auto_FX_update_BMG()
Application.ScreenUpdating = False 'My computer is not very fast, thus I use this line of
'code to save some computing power and time
Dim internet_object As InternetExplorer
Dim i As Integer
For i = 3 To Sheets(1).Cells(3, 7).End(xlDown).Row
FX_Pair = Sheets(1).Cells(i, 7)
Set internet_object = New InternetExplorer
internet_object.Visible = True
internet_object.navigate "https://www.bloomberg.com/quote/" & FX_Pair & ":CUR"
Application.Wait Now + TimeValue("00:00:05")
internet_object.document.getElementsByClassName("class")(0).getElementsByTagName ("value__b93f12ea") '--> DEBUGGER PROBLEM
'My goal here is to "grab" the PREV CLOSE
'value from the website
With ActiveSheet
.Range(Cells(i, 9)).Value = HTML_element.Children(0).textContent
End With
Sheets(1).Range(Cells(i, 9)).Copy 'Not sure if these 2 lines are unnecesary
ActiveSheet.Paste
Next i
Application.ScreenUpdating = True
End Sub
Expected Result
WHEN I enter "USDGBP" on a cell on column G:G, the macro would go to https://www.bloomberg.com/quote/EURGBP:CUR and "grab" the PREV CLOSE value of 0.8732 (using today's value) and insert it in the respective row of column I:I
As of now, I am just facing the debugger without much idea on how to solve the problem.
You can use class selectors in a loop. The pattern
.previousclosingpriceonetradingdayago .value__b93f12ea
specifies to get child elements with class value__b93f12ea having parent with class previousclosingpriceonetradingdayago. The "." in front is a css class selector and is a faster way of selecting as modern browsers are optimized for css. The space between the two classes is a descendant combinator. querySelector returns the first match for this pattern from the webpage html document.
This matches on the page:
You can see the parent child relationship and classes again here:
<section class="dataBox previousclosingpriceonetradingdayago numeric">
<header class="title__49417cb9"><span>Prev Close</span></header>
<div class="value__b93f12ea">0.8732</div>
</section>
N.B. If you are a Bloomberg customer look into their APIs. Additionally, it is very likely you can get this same info from other dedicated APIs which will allow for much faster and more reliable xhr requests.
VBA (Internet Explorer):
Option Explicit
Public Sub test()
Dim pairs(), ws As Worksheet, i As Long, ie As Object
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set ie = CreateObject("InternetExplorer.Application")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With ie
.Visible = True
For i = LBound(pairs) To UBound(pairs)
.Navigate2 "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
While .Busy Or .readyState < 4: DoEvents: Wend
results(i) = .document.querySelector(".previousclosingpriceonetradingdayago .value__b93f12ea").innerText
Next
.Quit
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
For very limited numbers of requests (as leads to blocking) you could use xhr request and regex out the value. I assume pairs are in sheet one and start from G2. I also assume there are no empty cells or invalid pairs in column G up to an including last pair to search for. Otherwise, you will need to develop the code to handle this.
Try regex here
Option Explicit
Public Sub test()
Dim re As Object, pairs(), ws As Worksheet, i As Long, s As String
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set re = CreateObject("VBScript.RegExp")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With CreateObject("MSXML2.XMLHTTP")
For i = LBound(pairs) To UBound(pairs)
.Open "GET", "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
.send
s = .responseText
results(i) = GetCloseValue(re, s, "previousClosingPriceOneTradingDayAgo%22%3A(.*?)%2")
Next
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
Public Function GetCloseValue(ByVal re As Object, inputString As String, ByVal pattern As String) As String 'https://regex101.com/r/OAyq30/1
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = pattern
If .test(inputString) Then
GetCloseValue = .Execute(inputString)(0).SubMatches(0)
Else
GetCloseValue = "Not found"
End If
End With
End Function
Try below code:
But before make sure to add 2 reference by going to Tools> References > then look for Microsoft HTML Object Library and Microsoft Internet Controls
This code works upon using your example.
Sub getPrevCloseValue()
Dim ie As Object
Dim mySh As Worksheet
Set mySh = ThisWorkbook.Sheets("Sheet1")
Dim colG_Value As String
Dim prev_value As String
For a = 3 To mySh.Range("G" & Rows.Count).End(xlUp).Row
colG_Value = mySh.Range("G" & a).Value
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate "https://www.bloomberg.com/quote/" & colG_Value & ":CUR"
Do While ie.Busy: DoEvents: Loop
Do Until ie.readyState = 4: DoEvents: Loop
'Application.Wait (Now + TimeValue("00:00:03")) 'activate if having problem with delay
For Each sect In ie.document.getElementsByTagName("section")
If sect.className = "dataBox previousclosingpriceonetradingdayago numeric" Then
prev_value = sect.getElementsByTagName("div")(0).innerText
mySh.Range("I" & a).Value = prev_value
Exit For
End If
Next sect
Next a
I have a video tutorial for basic web automation using vba which include web data scraping and other commands, please check the link below:
https://www.youtube.com/watch?v=jejwXID4OH4&t=700s

web scraping using excel and VBA

i wrote my VBA code in excel sheet as below but it is not scrape data for me and also i don't know why please any one help me. it gave me reullt as "click her to read more" onlyi want to scrape enitre data such as first name last name state zip code and so on
Sub extractTablesData()
Dim IE As Object, obj As Object
Dim myState As String
Dim r As Integer, c As Integer, t As Integer
Dim elemCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
myState = InputBox("Enter the city where you wish to work")
With IE
.Visible = True
.navigate ("http://www.funeralhomes.com/go/listing/Search? name=&city=&state=&country=USA&zip=&radius=")
While IE.readyState <> 4
DoEvents
Wend
For Each obj In IE.document.all.item("state").Options
If obj.innerText = myState Then
obj.Selected = True
End If
Next obj
IE.document.getElementsByValue("Search").item.Click
Do While IE.Busy: DoEvents: Loop
ThisWorkbook.Sheets("Sheet1").Range("A1:K1500").ClearContents
Set elemCollection = IE.document.getElementsByTagName("TABLE")
For t = 0 To (elemCollection.Length - 1)
For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
End With
Set IE = Nothing
End Sub
Using the same URL as the answer already given you could alternatively select with CSS selectors to get the elements of interest, and use split to get just the names and address parts from the text. We can also do away with the browser altogether to get faster results from first results page.
Business name:
You can get the name with the following selector (using paid listing example):
div.paid-listing .listing-title
This selects (sample view)
Try
Address info:
The associated descriptive information can be retrieved with the selector:
div.paid-listing .address-summary
And then using split we can parse this into just the address information.
Code:
Option Explicit
Public Sub GetTitleAndAddress()
Dim oHtml As HTMLDocument, nodeList1 As Object, nodeList2 As Object, i As Long
Const URL As String = "http://www.funeralhomes.com/go/listing/ShowListing/USA/New%20York/New%20York"
Set oHtml = New HTMLDocument
With CreateObject("WINHTTP.WinHTTPRequest.5.1")
.Open "GET", URL, False
.send
oHtml.body.innerHTML = .responseText
End With
Set nodeList1 = oHtml.querySelectorAll("div.paid-listing .listing-title")
Set nodeList2 = oHtml.querySelectorAll("div.paid-listing .address-summary")
With Worksheets("Sheet3")
.UsedRange.ClearContents
For i = 0 To nodeList1.Length - 1
.Range("A" & i + 1) = nodeList1.Item(i).innerText
.Range("B" & i + 1) = Split(nodeList2.Item(i).innerText, Chr$(10))(0)
Next i
End With
End Sub
Example output:
Yeah, without an API, this can be very tricky at best, and very inconsistent at worst. For now, you can try the script below.
Sub DumpData()
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
URL = "http://www.funeralhomes.com/go/listing/ShowListing/USA/New%20York/New%20York"
'Wait for site to fully load
IE.Navigate2 URL
Do While IE.Busy = True
DoEvents
Loop
RowCount = 1
With Sheets("Sheet1")
.Cells.ClearContents
RowCount = 1
For Each itm In IE.document.all
If itm.classname Like "*free-listing*" Or itm.classname Like "*paid-listing*" Then
.Range("A" & RowCount) = itm.classname
.Range("B" & RowCount) = Left(itm.innertext, 1024)
RowCount = RowCount + 1
End If
Next itm
End With
End Sub
You probably want some kind of input box to capture the city and state and radius from the user, or capture those variable in cells in your worksheet.
Notice, the '%20' is a space character.
I got this idea from a friend of mine, Joel, a long time ago. That guy is great!

Resources