<readystate> doesn't work: Table cells in a HTML.Document don't seem to fully update after clicking an object (thus changing table; URL stays same) - excel

All that I know now about VBA is mostly thanks to this site - you are the best <3
However, for last two days I've been looking for a solution and couldn't work it out by usual research.
Background:
I have a site with a table in it. Table represents prices for product X. There are many products, but table shows only active product. When I click on Y/Z/M, the URL stays same, only table itself updates. I need to build a database for many products, so I loop through the list by looking for elements and clicking on them and then try to fish rows/cells/etc.
Problem:
For same code, same list, same data I end up with non identical databases. Some entries are missing, some entries repeat prices of previous products etc.
Theory what's not working:
After thorough F8-testing, I think this is not working properly:
Do Until Not appIE.Busy And appIE.READYSTATE = 4
Application.Wait (Now + TimeValue("0:00:02"))
Loop
Application.Wait (Now + TimeValue("0:00:05"))
When I go through my macro by F8, all data is being fished properly. It looks like somehow the VBA doesn't wait for the html.document to fully update.
What I'de done so far:
I played around with various configurations (Do While, Do Until Not etc) of the usual loop.
I added the site to "trusted" sites in IE, as someone somewhere suggested it would help. Got scary errors afterwords and I had no idea what to do with them, so "untrusted" the site. No more error messages.
I tried to "reset" the HTML.Document.
I figured out where in the page the actual info about status bar on screen shows up and asked VBA to make sure that name changes from "Modal In" to "Modal":
Set checkA = html.getElementById("processingModal")
Dim trytry As String
Do While trytry = "modal in"
trytry = checkA.className
Application.Wait (Now + TimeValue("0:00:01"))
Loop
I still end up with messed up output.
Below, I'm adding my full code. I would also appreciate greatly any advice, as this is my first web scraping code ever and I'm self taught.
Many thanks in advance, Good People!
Sub try_this()
'trying scraping from web
Dim appIE As Object
Dim html As HTMLDocument
Dim lngRow, i, lngColumn, lngYear, a, s As Long
Dim tblSummary As IHTMLTable
Dim tblRows As IHTMLElementCollection
Dim tblRow As IHTMLElement
Dim tblCells As IHTMLElementCollection
Dim tblCell As IHTMLElement
Dim tblDataValue As String
Dim VintagesList As IHTMLElement
Dim Vintages As IHTMLElementCollection
Dim Vintage As IHTMLElement
Dim VintageYear As String
Dim BtlSizesList As IHTMLElement
Dim BtlSizes As IHTMLElementCollection
Dim BtlSize As IHTMLElement
Dim BtlSizeValue As String
Dim btlSizeID As String
Dim objA As IHTMLElement, checkA As IHTMLElement
Dim strAddress As String, strVintageY As String
Dim StartTime As Double
Dim SecondsElapsed As Double
StartTime = Timer
Application.ScreenUpdating = False
'part 1: open IE browser and go to page with products
Set appIE = CreateObject("internetexplorer.application")
With appIE
.Navigate "My web Page"
.Visible = True
End With
Do While appIE.Busy 'so far works all right
DoEvents
Loop
Set html = appIE.Document
For lngYear = 2 To 16 'product category list
Application.StatusBar = "Downloading data for year " & lngYear - 1 & " of 15..."
strVintageY = Sheets("Dict").Range("A" & lngYear).Value 'first cathegory
strAddress = Sheets("Dict").Range("D2").Value & strVintageY & Sheets("Dict").Range("D4").Value & strVintageY 'changes physical address in a browser
appIE.Navigate strAddress
appIE.Visible = True
Do While appIE.Busy
Application.Wait (Now + TimeValue("0:00:02")) ' aaand wait some more, because the trick doesn't work and I'm desperate
Loop
Application.Wait (Now + TimeValue("0:00:05"))
Set html = AppIE.Document 'this is the EDIT part
'STEP 2: get available product names
Set BtlSizesList = html.getElementById("auction-size-tabs")
Set BtlSizes = BtlSizesList.Children
i = 2 'i=2, we'll start to print data into row 2
Sheets("Dict").Range("B2:B100").Clear
For Each BtlSize In BtlSizes
BtlSizeValue = BtlSize.innerText
Sheets("Dict").Cells(i, 2).Value = BtlSizeValue
i = i + 1
Next
'Step 2b: Fish Prices Data Table
lngRow = Sheets("Database").Range("D" & Rows.Count).End(xlUp).Row
s = Sheets("Dict").Range("B" & Rows.Count).End(xlUp).Row 's = last row with product's name
For a = 2 To s
btlSizeID = Sheets("Dict").Range("B" & a).Value
Set objA = html.getElementById(btlSizeID).getElementsByTagName("a")(0) 'click right product on the web page
objA.Click
'Readystate and waiting:
Do Until Not appIE.Busy And appIE.READYSTATE = 4 'wait for page to load
Application.Wait (Now + TimeValue("0:00:02"))
Loop
Application.Wait (Now + TimeValue("0:00:05"))
'this is part where I physically check if the load bar is still there:
Set checkA = html.getElementById("processingModal")
Dim trytry As String
Do While trytry = "modal in"
trytry = checkA.className
Application.Wait (Now + TimeValue("0:00:01"))
Loop
'I even tried to "reset" html.document. To be honest no idea what I'm doing here.
Set html = Nothing
Set html = appIE.Document
Set tblSummary = html.getElementById("summaryTable") 'find the table
Set tblRows = tblSummary.Rows 'get list of rows in the table
For Each tblRow In tblRows
Set tblCells = tblRow.Cells
If lngRow >= 2 Then
Sheets("Database").Range("B" & lngRow).Value = btlSizeID 'condition will be removed once I figure how to skip 1st row of table in HTML
End If
lngColumn = 3
For Each tblCell In tblCells
tblDataValue = tblCell.innerText
lngColumn = lngColumn + 1
Next
If lngRow >= 2 Then
Sheets("Database").Cells(lngRow, 1).Value = strVintageY 'paste product's category into column A
End If
lngRow = lngRow + 1 'i will be row# that will have new info pasted in it
Next
Next a
Application.ScreenUpdating = True
Next lngYear
Set html = Nothing
Set appIE = Nothing
SecondsElapsed = Round(Timer - StartTime, 2)
Application.ScreenUpdating = True
Application.StatusBar = False
MsgBox "This code ran successfully in " & SecondsElapsed & " seconds", vbInformation
End Sub
Update: As suggested, I added missing line that re-sets HTML.Documnet after .Navigate event. This didn't fix my problem though.

Related

Bing scraper navigation

My code all of a sudden does not seem to want to work and I can not work out why. Most of it is fine, there are only two issue with it now and I can not work them out. I was trying to improve it and may have moved someting around and can not work out what I have done. I need a some one to look at this for me please. I have highlighted the issue in the code that are the problem. The bulk of this is fine and I am happy with it.
It no longer clicks on the next page in Bing
It Loops minus 2, so if I put in 10 loops then it does 8. Maybe I have put the loop counter in the wrong place, hence it could be showing wrong results
Its not the best code in the world, but it was something I wrote and it worked fine up until a few weeks back. I have been trying to fix it but can not work it out so decided to post.
What I have done so Far.
I tried to google the answer
I tried to fix it myself. I managed to fix other smaller bugs but can't fix navigation
I Checked the BING CLASS for next page, looks the same to me. I have always used the one in yellow
Private Sub BingScraper()
'''Bing URL SCRAPER
Dim ie As Object
Dim HTMLdoc As Object
Dim nextPageElements As Object
Dim li As Object
Dim link As Object
Dim url As String
Dim pageNumber As Long
Dim i As Long
Dim myCounter As Long
'''Takes seach from Sheet10 to google
url = "https://www.Bing.com/search?q=" & Replace(Worksheets("Sheet10").Range("G17").Value & Range("H17").Value, " ", "+")
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True
.navigate url
Do While .Busy Or .readyState <> 4
DoEvents
Loop
End With
Application.Wait Now + TimeSerial(0, 0, 5)
Set HTMLdoc = ie.document
'''Searches URLS and places them in Sheet called Sheet2 ROW 2 Column A
With Sheets("Sheet2")
pageNumber = 2
'i = 2
i = Sheets("Sheet2").Cells(Rows.Count, "A").End(xlUp).row + 1
Do
For Each li In HTMLdoc.getElementsByTagName("li")
' Application.ScreenUpdating = False
If li.getAttribute("class") = "b_algo" Then
Set link = li.getElementsByTagName("a")(0)
.Cells(i, 1).Value = link.getAttribute("href")
i = i + 1
End If
Next li
'#####################################################################################
'################################# ISSUE STARTS FROM HERE ############################
'''Searches Number of Pages entered in Sheet10 i17
If pageNumber >= Replace(Worksheets("Sheet10").Range("I17").Value, " ", "+") Then Exit Do
On Error Resume Next
'#################### THIS IS THE CLASS FOR BING NEXT PAGE ################
Set nextPageElements = HTMLdoc.getElementsByClassName("sb_pagN sb_pagN_bp b_widePag sb_bp")(0)
If nextPageElements Is Nothing Then Exit Do
'''Scrolls Down the Browser
ie.document.parentWindow.Scroll 0&, 99999
'''Random delay from Max number entered in Sheet10
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet10").Range("J17").Value))
'######################## NO LONGER GOING TO NEXT PAGE ######################
nextPageElement.Click 'next web page
Do While ie.Busy Or ie.readyState <> 4
DoEvents
Loop
'''Random delay from Max number entered in Sheet10
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet10").Range("K17").Value))
Set HTMLdoc = ie.document
''' Delete duplicates
Sheet2.Columns("A").RemoveDuplicates Columns:=Array(1), Header:=xlYes
pageNumber = pageNumber + 1
'######################### LOOP COUNTER ######################
myCounter = myCounter + 1
Worksheets("Sheet10").Range("G6").Value = myCounter
Loop
''' Delete Row If Blank
Sheet2.Columns("A").SpecialCells(xlCellTypeBlanks).EntireRow.Delete
End With
'################################# ISSUE END HERE ############################
'#############################################################################
If Sheet10.Range("I17") = 0 Then
Complete.Show
Termination.Hide
ElseIf Sheet10.Range("I17") > 0 Then
Complete.Show
End If
ie.Quit
Set ie = Nothing
Set HTMLdoc = Nothing
Set nextPageElements = Nothing
Set li = Nothing
Set link = Nothing
End Sub

Excel VBA Page Pagination, FOR LOOP not working

I am using a For Loop to extract some data from a site, with some code for page navigation through the pages. The problem I am having is that the that after the code navigates to the next page the For Loop is not extracting anything.
What works
The FOR LOOP works fine on the first page
The page pagination code works fine on navigating to the next page/s.
The Problem
*
Once the code has navigated to the next page the FOR LOOP is NOT
working, it will extract NO data.
I cannot work it out
This is what I am working with minus my code as it was a bit too long
Private Sub CommandButton4_Click()
'dimension (declare or set aside memory for) our variables
Dim Html As HTMLDocument
Dim objIE As Object
Dim y As Integer 'integer variable we'll use as a counter
Dim result As String 'string variable that will hold our result link
Dim pageNumber As Long ' page no.
Dim nextPageElement As Object 'page element
Dim lastrow As Long
Dim HtmlText As Variant ' for html data
Dim wsSheet As Worksheet ' WorkSheet
Dim wb As Workbook
Set wb = ThisWorkbook
Set wsSheet = wb.Sheets("Sheet3")
'+++++ Internet Explorer ++++++
Set objIE = New InternetExplorer 'initiating a new instance of Internet Explorer and asigning it to objIE
objIE.Visible = True 'make IE browser visible (False would allow IE to run in the background)
objIE.navigate Sheets("Sheet3").Range("A2").Value & Replace(Worksheets("Sheet3").Range("B2") & Range("C2").Value, " ", "+") 'navigate IE to this web page
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop 'wait here a few seconds while the browser is busy
y = 2
Set Html = objIE.document
Set elements = Html.getElementsByClassName("sresult lvresult clearfix li shic") ' parent CLASS
'FOR LOOP
For Each element In elements
DoEvents
'############################ My code here #################
'COUNTER AND NEXT ELEMENT
y = y + 1
Next element
' #################### START OF Page Navigation will delay ###############
Do
'Number Of Pages to navigate come from sheet3 - D2
If pageNumber >= Replace(Worksheets("Sheet3").Range("D2").Value, " ", "+") Then Exit Do
Set nextPageElement = Html.getElementsByClassName("gspr next")(0)
If nextPageElement Is Nothing Then Exit Do
'Random delay from Max number entered in Sheet3 E2
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet3").Range("E2").Value))
nextPageElement.Click 'next web page
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
'SECOND Random delay from Max number entered in Sheet4 f2
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet3").Range("F2").Value))
Set Html = objIE.document
pageNumber = pageNumber + 1
Loop
' ########################## End of Page Navigation with Delays ###############
objIE.Quit ' end and clear browser
Set ie = Nothing
Set htmlDoc = Nothing
Set nextPageElement = Nothing
Set div = Nothing
Set link = Nothing
MsgBox "All Done"
End Sub
I tried to add this to the end of "pageNumber = pageNumber + 1" to see if it would work but got not luck
pageNumber = pageNumber + 1
Set elements = Html.getElementsByClassName("sresult lvresult clearfix li shic")
For Each element In elements
DoEvents
Next element

eBay Product scraper

I am very limited on VBA,
The Code is in a Module, the code has a sub process as well, so sorry if I post the code wrong
A) open IE
B) Subprocess gets the data.
The code works fine on ebay.com but NOT for ebay.co.uk - can't work out why, also it converts urls to hyperlinks
It only does the first page, I need it to go through an X amount of pages - have a code but can't get it to work so have removed it.
Can the search query be run AFTER Ebay opens, so it opens, then search item is input to ebay and then code runs, or to run from a cell, IF its Cell A1 the data extracted needs to be pasted in A2 and below.
I have looked at elements for ebay.com and ebay.co.uk and they look the same to me, so can't work out why its not working as it works for 1 and not the other.
I did input the code for getting data from several pages it did not work. I know this code works as I have it for when I fetch urls from google
Public IE As New SHDocVw.InternetExplorer
Sub GetData()
Dim HTMLdoc As MSHTml.HTMLDocument
Dim othwb As Variant
Dim objShellWindows As New SHDocVw.ShellWindows
Set IE = CreateObject("internetexplorer.application")
With IE
.Visible = True
'.Navigate "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=jackets&_sacat=0"
.Navigate "https://www.ebay.com/sch/i.html_from=R40&_nkw=ralph+lauren&_sacat=1059&LH_TitleDesc=0&_dmd=1&rt=nc"
While .Busy Or .readyState <> 4: DoEvents: Wend
Set HTMLdoc = IE.document
ProcessHTMLPage HTMLdoc
.Quit
End With
End Sub
code here
enter
'''''' THIS IS THE SUB PROCESS '''''
Sub ProcessHTMLPage(HTMLPage As MSHTml.HTMLDocument)
Dim HTMLItem As MSHTml.IHTMLElement
Dim HTMLItems As MSHTml.IHTMLElementCollection
Dim HTMLInput As MSHTml.IHTMLElement
Dim rownum As Long
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__title")
For Each HTMLItem In HTMLItems
Cells(rownum, 1).Value = HTMLItem.innerText
rownum = rownum + 1
Next HTMLItem
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__price")
For Each HTMLItem In HTMLItems
Cells(rownum, 2).Value = HTMLItem.innerText
rownum = rownum + 1
Next HTMLItem
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__link")
For Each HTMLItem In HTMLItems
Cells(rownum, 3).Value = HTMLItem.href
rownum = rownum + 1
Next HTMLItem
'Converts each text hyperlink selected into a working hyperlink from C1 to 25000 rows
Range("C1:C25000").Select
For Each xCell In Selection
ActiveSheet.Hyperlinks.Add Anchor:=xCell, Address:=xCell.Formula
Next xCell
Range("C1").Select
End Sub
Code for going to next page
pageNumber = 1
'i = 2
If pageNumber >= 6 Then Exit Do 'the first 6 pages
internetdata.getElementById("pnnext").click 'next web page
Do While internet.Busy Or internet.readyState <> 4
DoEvents
Loop
Set internetdata = internet.document
pageNumber = pageNumber + 1
Loop
Does not work on Ebay.co.uk - NO RESULTS ARE EXTRACTED - Works fine in ebay.com
Need it to get data from X amount of pages and NOT just 1 page
Can the search query be run AFTER Ebay opens, so it opens, then search item is input to ebay and then code runs, or to run from a cell, IF its Cell A1 the data extracted needs to be pasted in A2 and below.
This is my code for google search, I have got it working so the search comes from cell A1, I am look for something like this, I am going to see if I can use the ebay code with this. As this also does the first 25 pages in google search
enter Sub webpage()
Dim ie As Object
Dim htmlDoc As Object
Dim nextPageElement As Object
Dim div As Object
Dim link As Object
Dim url As String
Dim pageNumber As Long
Dim i As Long
' Takes seach from A1 and places it into google
url = "https://www.google.co.uk/search?q=" & Replace(Worksheets("Sheet1").Range("A1").Value, " ", "+")
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True
.navigate url
Do While .Busy Or .readyState <> 4
DoEvents
Loop
End With
Application.Wait Now + TimeSerial(0, 0, 5)
Set htmlDoc = ie.document
pageNumber = 1
i = 2
Do
For Each div In htmlDoc.getElementsByTagName("div")
If div.getAttribute("class") = "r" Then
Set link = div.getElementsByTagName("a")(0)
Cells(i, 2).Value = link.getAttribute("href")
i = i + 1
End If
Next div
If pageNumber >= 25 Then Exit Do 'the first 25 pages
Set nextPageElement = htmlDoc.getElementById("pnnext")
If nextPageElement Is Nothing Then Exit Do
' Clicks web next page
nextPageElement.Click 'next web page
Do While ie.Busy Or ie.readyState <> 4
DoEvents
Loop
Application.Wait Now + TimeSerial(0, 0, 5)
Set htmlDoc = ie.document
pageNumber = pageNumber + 1
Loop
MsgBox "All Done"
Set ie = Nothing
Set htmlDoc = Nothing
Set nextPageElement = Nothing
Set div = Nothing
Set link = Nothing
End Sub
code here
Question 1: Why does it work for one domain but not the other?
To answer question 1 (the other questions should be new posts) - the html is not the same at all. The classes which work for ebay.com are not found in ebay.co.uk; So, your loop over collections doesn't do anything because they are count 0 (or length 0 with nodeLists if using querySelectorAll). Instead, you need branched code. Set your selectors based on the url domain.
I have used css selectors as this is the easiest, and fastest way, to select the required elements whilst maintaining the flexibility of a code re-factor to reduce the lines of repeated code.
Side note:
If you are unsure about whether your selection method will work across different pages you can do at least two things:
Right click > inspect element > visually check the class names are the same for the elements you are attempting to compare. So, if you are looking at product names, are the class names in the html the same on both pages?
You can use the search facility of the browser > open element tab via F12 then press Ctrl+F to pull up search box > enter your class name from the first page into this box in the second page and hit enter. You can also enter css selectors here and some cases regex. You will get a hit count telling you how many matches found. You can keep pressing enter to cycle through matches and each match will be highlighted in the html above, so you can easily compare if matched results are what you expected.
click image to enlarge
img url: https://i.stack.imgur.com/MWkEx.png
VBA:
Option Explicit
Public Sub GetData()
Dim htmlDoc As MSHTML.HTMLDocument, ie As SHDocVw.InternetExplorer, ws As Worksheet
Set ie = New SHDocVw.InternetExplorer
Set htmlDoc = New MSHTML.HTMLDocument
Set ws = ThisWorkbook.Worksheets("Sheet1")
With ie
.Visible = True
'.Navigate2 "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=jackets&_sacat=0"
.Navigate2 "https://www.ebay.com/sch/i.html_from=R40&_nkw=ralph+lauren&_sacat=1059&LH_TitleDesc=0&_dmd=1&rt=nc"
While .Busy Or .readyState <> 4: DoEvents: Wend
Dim index As Long, HTMLItems As Object, rowNum As Long, xCell As Range
Dim cssSelectors(), i As Long
Select Case True
Case InStr(.document.URL, "ebay.co.uk") > 0
cssSelectors = Array(".gvtitle a", ".amt", ".gvtitle a")
Case InStr(.document.URL, "ebay.com") > 0
cssSelectors = Array(".s-item__title", ".s-item__price", ".s-item__link")
End Select
With ws
For i = LBound(cssSelectors) To UBound(cssSelectors)
rowNum = 1
Set HTMLItems = ie.document.querySelectorAll(cssSelectors(i))
For index = 0 To HTMLItems.length - 1
.Cells(rowNum, i + 1).Value = IIf(i = 2, HTMLItems.item(index).getAttribute("href"), HTMLItems.item(index).innerText)
rowNum = rowNum + 1
Next
Next
For Each xCell In .Range("C1:C25000") '<= all these really?
.Hyperlinks.Add Anchor:=xCell, Address:=xCell.Formula
Next xCell
End With
.Quit
End With
End Sub
If this works on eBay then you need to find out yourself why it's not working on ebay.co.uk. My point is if the code itself works than there is nothing we can help you with here. You need to take some time to investigate ebay.co.uk and find the differences as I am sure it's something minor. I can't help u fix code that isn't actually broken. I wish you luck though.

Excel Macro To Pull Google Image Links

The goal is to get images from Google Images that match the part numbers in my database. My code runs, and it pulls up the correct Google pages but refuses to put the links into the spreadsheet. I have tried everything I can think of, but as of now, I keep on getting Error 1004 (Application-defined or Object-defined error).`
Sub SearchBotGoogleImgLink()
Dim objIE As Object
Set IE = CreateObject("InternetExplorer.Application")
Dim HTMLdoc As HTMLDocument
Dim imgElements As IHTMLElementCollection
Dim imgElement As HTMLImg
Dim aElement As HTMLAnchorElement
Dim n As Integer
Dim i As Integer
Dim url As String
Dim url2 As String
Dim m As Long
Dim lastRow As Long
Dim url3 As String
Dim SearchRow As Long
Dim aEle As HTMLLinkElement
Worksheets("Sheet1").Select
SearchRow = 1
Do Until IsEmpty(ActiveSheet.Cells(SearchRow, 1))
Sheets("Sheet1").Select
Application.StatusBar = SearchRow - 1 & " of " & "4368" & " Items Done"
Item = Trim(ActiveSheet.Cells(SearchRow, 1))
url = "https://www.google.com/search?hl=en&biw=1600&bih=796&tbm=isch&sa=1&ei=CTOpW_2jO6nAjwT67rqACw&q=A2N0015C3KUU&oq=" & Cells(SearchRow, 1) & "&oq=A2N0015C3KUU&gs_l=img.12...0.0..1704...0.0..0.0.0.......1......gws-wiz-img.9wB6WwQJhwA"
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate url
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
For Each aEle In objIE.document.getElementsByTagName("IMG")
result = aEle
Sheets("Sheet1").Range(SearchRow & "C").Value = result
Sheets("Sheet1").Range(SearchRow & "D") = aEle.innerHTML
Sheets("Sheet1").Range(SearchRow & "F").Value = aEle.innerText
Debug.Print aEle.innerText
Next
Loop
'For i = 1 To lastRow
'url = "https://www.google.com/search?hl=en&biw=1600&bih=796&tbm=isch&sa=1&ei=CTOpW_2jO6nAjwT67rqACw&q=A2N0015C3KUU&oq=" & Cells(SearchRow, 1) & "&oq=A2N0015C3KUU&gs_l=img.12...0.0..1704...0.0..0.0.0.......1......gws-wiz-img.9wB6WwQJhwA"
Set HTMLdoc = objIE.document
Set imgElements = HTMLdoc.getElementsByTagName("IMG")
n = 1
For Each imgElement In imgElements
If InStr(ingElement.src, sImageSearchString) Then
If imgElement.ParentNode.nodeName = "A" Then
Set aElement = imgElement.ParentNode
If n = 2 Then
url2 = aElement.href 'imgElement.src
url3 = imgElement.src 'aElement.href
n = n + 1
End If
End If
End If
Next
Cells(SearchRow, 5) = url2
IE.Quit
Set IE = Nothing
End Sub
Notes on your code:
You need Option Explicit at the top of your code to check on variable declarations and typos amongst other advantages. There are a number of missing declarations e.g. result, and used ones later e.g. Set IE = CreateObject("InternetExplorer.Application"). You have two different variables (one late bound and one early) both creating IE instances. You only in fact use one.
Your current error may be down to you trying to work with an object here:
result = aEle which won't work without the Set keyword to provide the required reference.
Without example URLs and expected output it is difficult to advise on the later loops in your code. You appear to have a duplicate loop over IMG elements but this time with some restrictions. It is likely these loops can be merged.
An example:
The following uses an arbitrary concatenation in to pull the img src links in from search results based on A2N0015C3KUU.
It uses a CSS selector combination of #ires img[src] to target elements with img tags and src attributes within the parent element with id ires (search results).
It is to demonstrate the principle of gathering aNodeList of matching elements and writing out to a sheet. The querySelectorAll method applied the CSS selector combination to the HTMLDocument and returns the nodeList. The nodeList is looped along its .Length, with items accessed by index starting at 0.
Option Explicit
Public Sub GetInfo()
Dim IE As New InternetExplorer
With IE
.Visible = True
.navigate "https://www.google.com/search?hl=en&biw=1600&bih=796&tbm=isch&sa=1&ei=CTOpW_2jO6nAjwT67rqACw&q=A2N0015C3KUU&oq=1&%20%22&oq=A2N0015C3KUU&gs_l=img.12...0.0..1704...0.0..0.0.0.......1......gws-wiz-img.9wB6WwQJhwA"
While .Busy Or .readyState < 4: DoEvents: Wend
Dim aNodeList As Object, i As Long
Set aNodeList = IE.document.querySelectorAll("#ires img[src]")
For i = 0 To aNodeList.Length - 1
ActiveSheet.Cells(i + 2, 4) = aNodeList.item(i).src
Next
'Quit '<== Remember to quit application
End With
End Sub

VBA Issues Pulling Info From Web and Putting It in Excel

I'm trying to write a script to pull doctor reviews from vitals.com and put them into an excel sheet.
It worked well when I just pulled the review, but when I added for it to pull the date as well, it will print the first review and date, then loads for a while, and then crashes. I'm new to all of this so I'm hoping there are some glaring mistakes I am not seeing. I just can't seem to find a way to fix it. Any help would be greatly appreciated.
Private Sub Worksheet_Change(ByVal Target As Range)
Dim DocCounter As Integer
DocCounter = 2
Dim Go As String
Go = "Go"
If IsEmpty(Cells(1, 4)) And Cells(1, 3).Value = Go Then
If IsEmpty(Cells(DocCounter, 1).Value) Then GoTo EmptySheet
Do
Dim Reviews As String
Reviews = "/reviews"
Dim IE As MSXML2.XMLHTTP60
Set IE = New MSXML2.XMLHTTP60
Application.Wait (Now + TimeValue("0:00:01"))
IE.Open "get", "http://vitals.com/doctors/" & Cells(DocCounter, 1).Value & Reviews, True
IE.send
While IE.readyState <> 4
DoEvents
Wend
Application.Wait (Now + TimeValue("0:00:01"))
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLBody As MSHTML.HTMLBody
Set HTMLDoc = New MSHTML.HTMLDocument
Set HTMLBody = HTMLDoc.body
HTMLBody.innerHTML = IE.responseText
Dim ReviewCounterString As String
Dim ReviewCounter As Integer
ReviewCounterString = HTMLDoc.getElementsByName("overall_total_reviews")(0).getElementsByTagName("h3")(0).innerText
ReviewCounter = CInt(ReviewCounterString)
'Pull info from website loop'
Dim RC As Integer
RC = 2
Dim sDD As String
Dim WebCounter As Integer
WebCounter = 0
Do
sDD = HTMLDoc.getElementsByClassName("date c_date dtreviewed")(WebCounter).innerText & "-" & HTMLDoc.getElementsByClassName("description")(WebCounter).innerText
Cells(DocCounter, RC).Value = sDD
WebCounter = WebCounter + 1
RC = RC + 1
Application.Wait (Now + TimeValue("0:00:01"))
Loop Until WebCounter = ReviewCounter
Application.Wait (Now + TimeValue("0:00:01"))
DocCounter = DocCounter + 1
If IsEmpty(Cells(DocCounter, 1).Value) Then GoTo Finished
Loop
Finished:
MsgBox ("Complete")
End Sub
EmptySheet:
MsgBox ("The Excel Sheet is Empty. Please add Doctors.")
End Sub
End If
End Sub
When you do Cells(DocCounter, RC).Value = sDD the Worksheet.Change event gets triggered again and the macro starts over again, until the call stack is full (I think).
Add
Application.EnableEvents = False
at the start of the macro and
Application.EnableEvents = True
at the end. That way the event will not be triggered during the macro.
Edit: You should probably also think about if it's really necessary to run the macro every time anything is changed anywhere on the sheet. You could check Target (the range that was changed) first to see if the change makes it necessary to reload the data.

Resources