Excel VBA Page Pagination, FOR LOOP not working - excel

I am using a For Loop to extract some data from a site, with some code for page navigation through the pages. The problem I am having is that the that after the code navigates to the next page the For Loop is not extracting anything.
What works
The FOR LOOP works fine on the first page
The page pagination code works fine on navigating to the next page/s.
The Problem
*
Once the code has navigated to the next page the FOR LOOP is NOT
working, it will extract NO data.
I cannot work it out
This is what I am working with minus my code as it was a bit too long
Private Sub CommandButton4_Click()
'dimension (declare or set aside memory for) our variables
Dim Html As HTMLDocument
Dim objIE As Object
Dim y As Integer 'integer variable we'll use as a counter
Dim result As String 'string variable that will hold our result link
Dim pageNumber As Long ' page no.
Dim nextPageElement As Object 'page element
Dim lastrow As Long
Dim HtmlText As Variant ' for html data
Dim wsSheet As Worksheet ' WorkSheet
Dim wb As Workbook
Set wb = ThisWorkbook
Set wsSheet = wb.Sheets("Sheet3")
'+++++ Internet Explorer ++++++
Set objIE = New InternetExplorer 'initiating a new instance of Internet Explorer and asigning it to objIE
objIE.Visible = True 'make IE browser visible (False would allow IE to run in the background)
objIE.navigate Sheets("Sheet3").Range("A2").Value & Replace(Worksheets("Sheet3").Range("B2") & Range("C2").Value, " ", "+") 'navigate IE to this web page
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop 'wait here a few seconds while the browser is busy
y = 2
Set Html = objIE.document
Set elements = Html.getElementsByClassName("sresult lvresult clearfix li shic") ' parent CLASS
'FOR LOOP
For Each element In elements
DoEvents
'############################ My code here #################
'COUNTER AND NEXT ELEMENT
y = y + 1
Next element
' #################### START OF Page Navigation will delay ###############
Do
'Number Of Pages to navigate come from sheet3 - D2
If pageNumber >= Replace(Worksheets("Sheet3").Range("D2").Value, " ", "+") Then Exit Do
Set nextPageElement = Html.getElementsByClassName("gspr next")(0)
If nextPageElement Is Nothing Then Exit Do
'Random delay from Max number entered in Sheet3 E2
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet3").Range("E2").Value))
nextPageElement.Click 'next web page
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
'SECOND Random delay from Max number entered in Sheet4 f2
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet3").Range("F2").Value))
Set Html = objIE.document
pageNumber = pageNumber + 1
Loop
' ########################## End of Page Navigation with Delays ###############
objIE.Quit ' end and clear browser
Set ie = Nothing
Set htmlDoc = Nothing
Set nextPageElement = Nothing
Set div = Nothing
Set link = Nothing
MsgBox "All Done"
End Sub
I tried to add this to the end of "pageNumber = pageNumber + 1" to see if it would work but got not luck
pageNumber = pageNumber + 1
Set elements = Html.getElementsByClassName("sresult lvresult clearfix li shic")
For Each element In elements
DoEvents
Next element

Related

Web page navigation reverting back to page 1

I am pulling some data of yellowpages, which is pulling off fine. However my issue is around the page navigation. Although It navigates fine from page 1 to 2 when it trys to navigate to page 3 my code goes back to page 1 and extracts the data again. The data extraction is fine the issue is the navigation.
YellowPage.ca
This is what I have identified and I think is the issue, but do not know how to resolve it.
When the page navigates to page 2, the class for the 'emptyPageButton' changes to the same class to navigate to the NEXT PAGE, so instead of going forward to the next page, which would be page 3, it goes back to page 1. If I stated that 10 pages should be extracted it will extract each page 1 + 2 five times each as it will keep going back and forth between the two pages.
I have made several attempts, but they do not work. I can get as far as page2 and then it goes back to page 1
WITH CLASS works up to page 2 then goes back to page 1
''' Searches Number of Pages entered in Sheet20 rage J9
If pageNumber >= Replace(Worksheets("Sheet20").Range("J9").Value, "", "+") Then Exit Do
Set nextPageElement = HTML.getElementsByClassName("ypbtn btn-theme pageButton ")(0)
'Set nextPageElement = HTML.getElementsByClassName("ypbtn btn-theme pageButton ")(1)
'Set nextPageElement = HTML.getElementsByClassName("ypbtn btn-theme pageButton ")(0).children (0)
'Set nextPageElement = HTML.getElementsByClassName("ypbtn btn-theme pageButton ")(1).children (0)
'Set nextPageElement = HTML.getElementsByClassName("ypbtn btn-theme pageButton ")(1).children (1)
'Set nextPageElement = HTML.getElementsByClassName("view_more_section_noScroll ")(0).getElementsByTagName("a")(1)
If nextPageElement Is Nothing Then Exit Do
nextPageElement.Click 'next web page
Application.Wait Now + TimeValue("00:00:05")
WITH QUERY SELECTOR works up to page 2 then goes back to page 1
''' Searches Number of Pages entered in Sheet20 rage J9
If pageNumber >= Replace(Worksheets("Sheet20").Range("J9").Value, "", "+") Then Exit Do
Set nextPageElement = HTML.querySelector(".view_more_section_noScroll .pageButton")
If Not nextPageElement Is Nothing Then
nextPageElement.Click
Application.Wait Now + TimeValue("00:00:05")
Else:
Exit Do
End If
Snippet for page1
<div class="view_more_section_noScroll">
<div class="emptyPageButton"></div>
<span class="pageCount">
<span class="bold">
1 /
</span>
<span class="">
37</span>
</span>
<a href="/search/si/2/car+dealership/Toronto+ON" data-analytics="{"event_name":"click - load_more - Serp ","lk_se_id":"f32f0ee7-8492-46dd-87da-7b621c162879_Y2FyIGRlYWxlcnNoaXA_VG9yb250byBPTg","lk_name":"next_serp"}"
class="ypbtn btn-theme pageButton">Next
>></a>
</div>
Snippet for page2 and beyond
<div class="view_more_section_noScroll">
<a href="/search/si/1/car+dealership/Toronto+ON" data-analytics="{"event_name":"click - previous_page - Serp ","lk_se_id":"f32f0ee7-8492-46dd-87da-7b621c162879_Y2FyIGRlYWxlcnNoaXA_VG9yb250byBPTg","lk_name":"previous_serp"}"
class="ypbtn btn-theme pageButton"><< Previous</a>
<span class="pageCount">
<span class="bold">
2 /
</span>
<span class="">
37</span>
</span>
<a href="/search/si/3/car+dealership/Toronto+ON" data-analytics="{"event_name":"click - load_more - Serp ","lk_se_id":"f32f0ee7-8492-46dd-87da-7b621c162879_Y2FyIGRlYWxlcnNoaXA_VG9yb250byBPTg","lk_name":"next_serp"}"
class="ypbtn btn-theme pageButton">Next
>></a>
</div>
QUESTION, Can someone advise what the correct class or querySelector is for the navigation?
Results
As aways thanks in advance.
'''########################## UPDATED THUR 8/4/2021 #####################
The full code is large, I have reduced the code a lot to make it much easier to read as the ONLY ISSUE is the page navigation. This code should give you and idea of what i am trying to do. Currently it overides previous extracted results, I have deleted something in the code by error, please ignore this for now as ONLY THE PAGE NAVIGATION IS AN ISSUE
Private Sub YellowPagesCa()
Dim HTML As htmlDocument
Dim objIE As Object
Dim result As String 'string variable that will hold our result link
Dim pageNumber As Long ' page no.
Dim nextPageElement As Object 'page element
Dim HtmlText As Variant ' for html data
Dim wsSheet As Worksheet ' WorkSheet
Dim wb As Workbook
Dim sht As Worksheet
Set wb = ThisWorkbook
Set wsSheet = wb.Sheets("YellowPages")
Set sht = ThisWorkbook.Worksheets("YellowPages")
'+++++ Internet Explorer ++++++
Set objIE = New InternetExplorer 'initiating a new instance of Internet Explorer and asigning it to objIE
objIE.Visible = True
objIE.navigate "https://www.yellowpages.ca/search/si/1/car+dealer/Toronto+ON"
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop 'wait here a few seconds while the browser is busy
Set HTML = objIE.document
Set elements = HTML.getElementsByClassName("listing_right_section")
For Each element In elements
DoEvents
''' Element 1
If element.getElementsByClassName("listing__name--link listing__link jsListingName")(0) Is Nothing Then
wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = "-"
Else
HtmlText = element.getElementsByClassName("listing__name--link listing__link jsListingName")(0).href
wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = HtmlText
End If
'End If
Next element
Do
'''############### PAGE NAVIGATION ##############
'Searches Number of Pages entered in
If pageNumber >= 5 Then Exit Do 'Replace(Worksheets("Sheet20").Range("J9").Value, "", "+") Then Exit Do
Set nextPageElement = HTML.querySelector(".view_more_section_noScroll .pageButton")
' Set nextPageElement = HTML.getElementsByClassName("ypbtn btn-theme pageButton ")(0)
If Not nextPageElement Is Nothing Then
nextPageElement.Click
Application.Wait Now + TimeValue("00:00:05")
Else:
Exit Do
End If
Do While objIE.Busy = True Or objIE.readyState <> 4
DoEvents
Loop
Set HTML = objIE.document
pageNumber = pageNumber + 1
Loop
objIE.Quit ' end and clear browser
Set objIE = Nothing
Set HTML = Nothing
Set nextPageElement = Nothing
Set HtmlText = Nothing
Set element = Nothing
Complete.show
'End If
End Sub
You could loop while
ie.document.querySelectorAll(".pageCount + a").Length <> 0
and
click the next button inside that loop with:
ie.document.querySelector(".pageCount + a").click
or
ie.Navigate2 ie.document.querySelector(".pageCount + a").href
This will terminate when there is no more next button.
Alternatively, extract the page count from the first page and loop to that number of pages, substituting the current page number into the url (e.g. replacing 1 with 2 to get page 2)
Option Explicit
Public Sub PrintSomeInfo()
Dim ie As SHDocVw.InternetExplorer, re As Object
Set ie = New SHDocVw.InternetExplorer
Set re = CreateObject("VBScript.RegExp")
With re
.Global = False
.MultiLine = False
.Pattern = "(si\/)(\d+)(\/)"
End With
With ie
.Visible = True
.Navigate2 "https://www.yellowpages.ca/search/si/1/car+dealership/Toronto+ON"
While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
Dim pageCount As Long, i As Long
pageCount = CLng(.document.querySelector(".pageCount .bold + span").innerText)
'already on page one so just loop from 2 to pageCount
For i = 2 To pageCount
.Navigate2 re.Replace(.document.url, "$1" & CStr(i) & "$3")
While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
'do something with new page
Next
Stop
.Quit
End With
End Sub
Regex:
The regex pattern matches 3 groups in the url and then substitutes the second group, the current page number, with the new page number:
Thanks to QHarr answer I was able to fix the issue by using parts of it. I have used my Class and QuerySelector code with parts of QHarr QuerySelector answer. I can now navigate the pages fine.
Do
' Searches Number of Pages entered in Sheet20 J9
If pageNumber >= Replace(Worksheets("Sheet20").Range("J9").Value, "", "+") Then Exit Do
'Set nextPageElement = HTML.querySelector(".view_more_section_noScroll .pageButton")
Set nextPageElement = HTML.getElementsByClassName("ypbtn btn-theme pageButton")(0) '' using class and NOT QuerySelector here
If Not nextPageElement Is Nothing Then
nextPageElement.document.querySelector(".pageCount + a").Click ''NEW PART
Application.Wait Now + TimeValue("00:00:05")
Else:
Exit Do
End If

VBA for Web Scraping works as Sub but Not as Function

I have written a VBA to scrape the status of a shipment from a cargo tracking site with the help of you guys here. I am trying to convert it to a function. The code works as a sub but does not work as a function. It returns a #Value error. Can someone please tell me what I am doing wrong.
Here is the code as a sub
Sub FlightStat_AFL()
Dim url As String
Dim ie As Object
Dim MAWBStatus As String
Dim MAWBNo As String
MAWBNo = Sheets("Sheet3").Range("H3").Value
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-" & MAWBNo
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fix manual break of a few seconds
'because the whole page will be "reload"
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MAWBStatus = ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innertext
Debug.Print MAWBStatus
'Clean up
ie.Quit
Set ie = Nothing
End Sub
Here is the code I am trying to make it work as a function.
Function FlightStat_AF(MAWBNo As Range) As String
Dim url As String
Dim ie As Object
Dim MAWBStatus As String
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-" & MAWBNo
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fix manual break of a few seconds
'because the whole page will be "reload"
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MAWBStatus = ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innertext
FlightStat_AF = MAWBStatus
'Clean up
ie.Quit
Set ie = Nothing
End Function
Try the next code, please:
Function FlightStat_AF(cargoNo As Variant) As String
Dim url As String, ie As Object, result As String
url = "https://www.afklcargo.com/mycargo/shipment/detail/" & cargoNo
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = False
.navigate url
Do Until .readyState = 4: DoEvents: Loop
End With
'wait a little for dynamic content to be loaded
Application.Wait (Now + TimeSerial(0, 0, 1))
'Get the status from the table
Do While result = ""
DoEvents
On Error Resume Next
result = Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
On Error GoTo 0
Application.Wait (Now + TimeSerial(0, 0, 1))
Loop
ie.Quit: Set ie = Nothing
'Return value of the function
FlightStat_AF = result
End Function
IE Function
You can try this if you really want a range. Usually it should be a string which you can easily change.
You can test the function (2nd procedure) with the first procedure. Just adjust the values in the constants section.
The Code
Option Explicit
Sub getFlightStat()
' Constants
Const wsName As String = "Sheet3"
Const FirstRow As Long = 3
Const CritCol As Variant = "H"
Const ResCol As Variant = "I"
Dim wb As Workbook: Set wb = ThisWorkbook
' Define worksheet.
Dim ws As Worksheet
Set ws = wb.Worksheets(wsName)
' Calculate the row of the last non-blank cell in column 'CritCol'.
Dim LastRow As Long
LastRow = ws.Cells(ws.Rows.Count, CritCol).End(xlUp).Row
' Loop through rows and for each value in cell of column 'CritCol',
' write the value retrieved via 'FlightStat_AF' to the cell
' in the same row, but in column 'ResCol'.
Dim i As Long
For i = FirstRow To LastRow
ws.Cells(i, ResCol).Value = FlightStat_AF(ws.Cells(i, CritCol))
Next i
' Inform user.
MsgBox "Data transferred", vbInformation, "Success"
End Sub
Function FlightStat_AF(MAWBNo As Range) As String
Dim url As String
Dim ie As Object
Dim MAWBStatus As String
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-" & MAWBNo.Value
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fix manual break of a few seconds
'because the whole page will be "reload"
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MAWBStatus = ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innertext
FlightStat_AF = MAWBStatus
'Clean up
ie.Quit
Set ie = Nothing
End Function

Bing scraper navigation

My code all of a sudden does not seem to want to work and I can not work out why. Most of it is fine, there are only two issue with it now and I can not work them out. I was trying to improve it and may have moved someting around and can not work out what I have done. I need a some one to look at this for me please. I have highlighted the issue in the code that are the problem. The bulk of this is fine and I am happy with it.
It no longer clicks on the next page in Bing
It Loops minus 2, so if I put in 10 loops then it does 8. Maybe I have put the loop counter in the wrong place, hence it could be showing wrong results
Its not the best code in the world, but it was something I wrote and it worked fine up until a few weeks back. I have been trying to fix it but can not work it out so decided to post.
What I have done so Far.
I tried to google the answer
I tried to fix it myself. I managed to fix other smaller bugs but can't fix navigation
I Checked the BING CLASS for next page, looks the same to me. I have always used the one in yellow
Private Sub BingScraper()
'''Bing URL SCRAPER
Dim ie As Object
Dim HTMLdoc As Object
Dim nextPageElements As Object
Dim li As Object
Dim link As Object
Dim url As String
Dim pageNumber As Long
Dim i As Long
Dim myCounter As Long
'''Takes seach from Sheet10 to google
url = "https://www.Bing.com/search?q=" & Replace(Worksheets("Sheet10").Range("G17").Value & Range("H17").Value, " ", "+")
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True
.navigate url
Do While .Busy Or .readyState <> 4
DoEvents
Loop
End With
Application.Wait Now + TimeSerial(0, 0, 5)
Set HTMLdoc = ie.document
'''Searches URLS and places them in Sheet called Sheet2 ROW 2 Column A
With Sheets("Sheet2")
pageNumber = 2
'i = 2
i = Sheets("Sheet2").Cells(Rows.Count, "A").End(xlUp).row + 1
Do
For Each li In HTMLdoc.getElementsByTagName("li")
' Application.ScreenUpdating = False
If li.getAttribute("class") = "b_algo" Then
Set link = li.getElementsByTagName("a")(0)
.Cells(i, 1).Value = link.getAttribute("href")
i = i + 1
End If
Next li
'#####################################################################################
'################################# ISSUE STARTS FROM HERE ############################
'''Searches Number of Pages entered in Sheet10 i17
If pageNumber >= Replace(Worksheets("Sheet10").Range("I17").Value, " ", "+") Then Exit Do
On Error Resume Next
'#################### THIS IS THE CLASS FOR BING NEXT PAGE ################
Set nextPageElements = HTMLdoc.getElementsByClassName("sb_pagN sb_pagN_bp b_widePag sb_bp")(0)
If nextPageElements Is Nothing Then Exit Do
'''Scrolls Down the Browser
ie.document.parentWindow.Scroll 0&, 99999
'''Random delay from Max number entered in Sheet10
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet10").Range("J17").Value))
'######################## NO LONGER GOING TO NEXT PAGE ######################
nextPageElement.Click 'next web page
Do While ie.Busy Or ie.readyState <> 4
DoEvents
Loop
'''Random delay from Max number entered in Sheet10
Application.Wait Now + TimeSerial(0, 0, Application.RandBetween(1, Worksheets("Sheet10").Range("K17").Value))
Set HTMLdoc = ie.document
''' Delete duplicates
Sheet2.Columns("A").RemoveDuplicates Columns:=Array(1), Header:=xlYes
pageNumber = pageNumber + 1
'######################### LOOP COUNTER ######################
myCounter = myCounter + 1
Worksheets("Sheet10").Range("G6").Value = myCounter
Loop
''' Delete Row If Blank
Sheet2.Columns("A").SpecialCells(xlCellTypeBlanks).EntireRow.Delete
End With
'################################# ISSUE END HERE ############################
'#############################################################################
If Sheet10.Range("I17") = 0 Then
Complete.Show
Termination.Hide
ElseIf Sheet10.Range("I17") > 0 Then
Complete.Show
End If
ie.Quit
Set ie = Nothing
Set HTMLdoc = Nothing
Set nextPageElements = Nothing
Set li = Nothing
Set link = Nothing
End Sub

eBay Product scraper

I am very limited on VBA,
The Code is in a Module, the code has a sub process as well, so sorry if I post the code wrong
A) open IE
B) Subprocess gets the data.
The code works fine on ebay.com but NOT for ebay.co.uk - can't work out why, also it converts urls to hyperlinks
It only does the first page, I need it to go through an X amount of pages - have a code but can't get it to work so have removed it.
Can the search query be run AFTER Ebay opens, so it opens, then search item is input to ebay and then code runs, or to run from a cell, IF its Cell A1 the data extracted needs to be pasted in A2 and below.
I have looked at elements for ebay.com and ebay.co.uk and they look the same to me, so can't work out why its not working as it works for 1 and not the other.
I did input the code for getting data from several pages it did not work. I know this code works as I have it for when I fetch urls from google
Public IE As New SHDocVw.InternetExplorer
Sub GetData()
Dim HTMLdoc As MSHTml.HTMLDocument
Dim othwb As Variant
Dim objShellWindows As New SHDocVw.ShellWindows
Set IE = CreateObject("internetexplorer.application")
With IE
.Visible = True
'.Navigate "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=jackets&_sacat=0"
.Navigate "https://www.ebay.com/sch/i.html_from=R40&_nkw=ralph+lauren&_sacat=1059&LH_TitleDesc=0&_dmd=1&rt=nc"
While .Busy Or .readyState <> 4: DoEvents: Wend
Set HTMLdoc = IE.document
ProcessHTMLPage HTMLdoc
.Quit
End With
End Sub
code here
enter
'''''' THIS IS THE SUB PROCESS '''''
Sub ProcessHTMLPage(HTMLPage As MSHTml.HTMLDocument)
Dim HTMLItem As MSHTml.IHTMLElement
Dim HTMLItems As MSHTml.IHTMLElementCollection
Dim HTMLInput As MSHTml.IHTMLElement
Dim rownum As Long
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__title")
For Each HTMLItem In HTMLItems
Cells(rownum, 1).Value = HTMLItem.innerText
rownum = rownum + 1
Next HTMLItem
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__price")
For Each HTMLItem In HTMLItems
Cells(rownum, 2).Value = HTMLItem.innerText
rownum = rownum + 1
Next HTMLItem
rownum = 1
Set HTMLItems = HTMLPage.getElementsByClassName("s-item__link")
For Each HTMLItem In HTMLItems
Cells(rownum, 3).Value = HTMLItem.href
rownum = rownum + 1
Next HTMLItem
'Converts each text hyperlink selected into a working hyperlink from C1 to 25000 rows
Range("C1:C25000").Select
For Each xCell In Selection
ActiveSheet.Hyperlinks.Add Anchor:=xCell, Address:=xCell.Formula
Next xCell
Range("C1").Select
End Sub
Code for going to next page
pageNumber = 1
'i = 2
If pageNumber >= 6 Then Exit Do 'the first 6 pages
internetdata.getElementById("pnnext").click 'next web page
Do While internet.Busy Or internet.readyState <> 4
DoEvents
Loop
Set internetdata = internet.document
pageNumber = pageNumber + 1
Loop
Does not work on Ebay.co.uk - NO RESULTS ARE EXTRACTED - Works fine in ebay.com
Need it to get data from X amount of pages and NOT just 1 page
Can the search query be run AFTER Ebay opens, so it opens, then search item is input to ebay and then code runs, or to run from a cell, IF its Cell A1 the data extracted needs to be pasted in A2 and below.
This is my code for google search, I have got it working so the search comes from cell A1, I am look for something like this, I am going to see if I can use the ebay code with this. As this also does the first 25 pages in google search
enter Sub webpage()
Dim ie As Object
Dim htmlDoc As Object
Dim nextPageElement As Object
Dim div As Object
Dim link As Object
Dim url As String
Dim pageNumber As Long
Dim i As Long
' Takes seach from A1 and places it into google
url = "https://www.google.co.uk/search?q=" & Replace(Worksheets("Sheet1").Range("A1").Value, " ", "+")
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True
.navigate url
Do While .Busy Or .readyState <> 4
DoEvents
Loop
End With
Application.Wait Now + TimeSerial(0, 0, 5)
Set htmlDoc = ie.document
pageNumber = 1
i = 2
Do
For Each div In htmlDoc.getElementsByTagName("div")
If div.getAttribute("class") = "r" Then
Set link = div.getElementsByTagName("a")(0)
Cells(i, 2).Value = link.getAttribute("href")
i = i + 1
End If
Next div
If pageNumber >= 25 Then Exit Do 'the first 25 pages
Set nextPageElement = htmlDoc.getElementById("pnnext")
If nextPageElement Is Nothing Then Exit Do
' Clicks web next page
nextPageElement.Click 'next web page
Do While ie.Busy Or ie.readyState <> 4
DoEvents
Loop
Application.Wait Now + TimeSerial(0, 0, 5)
Set htmlDoc = ie.document
pageNumber = pageNumber + 1
Loop
MsgBox "All Done"
Set ie = Nothing
Set htmlDoc = Nothing
Set nextPageElement = Nothing
Set div = Nothing
Set link = Nothing
End Sub
code here
Question 1: Why does it work for one domain but not the other?
To answer question 1 (the other questions should be new posts) - the html is not the same at all. The classes which work for ebay.com are not found in ebay.co.uk; So, your loop over collections doesn't do anything because they are count 0 (or length 0 with nodeLists if using querySelectorAll). Instead, you need branched code. Set your selectors based on the url domain.
I have used css selectors as this is the easiest, and fastest way, to select the required elements whilst maintaining the flexibility of a code re-factor to reduce the lines of repeated code.
Side note:
If you are unsure about whether your selection method will work across different pages you can do at least two things:
Right click > inspect element > visually check the class names are the same for the elements you are attempting to compare. So, if you are looking at product names, are the class names in the html the same on both pages?
You can use the search facility of the browser > open element tab via F12 then press Ctrl+F to pull up search box > enter your class name from the first page into this box in the second page and hit enter. You can also enter css selectors here and some cases regex. You will get a hit count telling you how many matches found. You can keep pressing enter to cycle through matches and each match will be highlighted in the html above, so you can easily compare if matched results are what you expected.
click image to enlarge
img url: https://i.stack.imgur.com/MWkEx.png
VBA:
Option Explicit
Public Sub GetData()
Dim htmlDoc As MSHTML.HTMLDocument, ie As SHDocVw.InternetExplorer, ws As Worksheet
Set ie = New SHDocVw.InternetExplorer
Set htmlDoc = New MSHTML.HTMLDocument
Set ws = ThisWorkbook.Worksheets("Sheet1")
With ie
.Visible = True
'.Navigate2 "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=jackets&_sacat=0"
.Navigate2 "https://www.ebay.com/sch/i.html_from=R40&_nkw=ralph+lauren&_sacat=1059&LH_TitleDesc=0&_dmd=1&rt=nc"
While .Busy Or .readyState <> 4: DoEvents: Wend
Dim index As Long, HTMLItems As Object, rowNum As Long, xCell As Range
Dim cssSelectors(), i As Long
Select Case True
Case InStr(.document.URL, "ebay.co.uk") > 0
cssSelectors = Array(".gvtitle a", ".amt", ".gvtitle a")
Case InStr(.document.URL, "ebay.com") > 0
cssSelectors = Array(".s-item__title", ".s-item__price", ".s-item__link")
End Select
With ws
For i = LBound(cssSelectors) To UBound(cssSelectors)
rowNum = 1
Set HTMLItems = ie.document.querySelectorAll(cssSelectors(i))
For index = 0 To HTMLItems.length - 1
.Cells(rowNum, i + 1).Value = IIf(i = 2, HTMLItems.item(index).getAttribute("href"), HTMLItems.item(index).innerText)
rowNum = rowNum + 1
Next
Next
For Each xCell In .Range("C1:C25000") '<= all these really?
.Hyperlinks.Add Anchor:=xCell, Address:=xCell.Formula
Next xCell
End With
.Quit
End With
End Sub
If this works on eBay then you need to find out yourself why it's not working on ebay.co.uk. My point is if the code itself works than there is nothing we can help you with here. You need to take some time to investigate ebay.co.uk and find the differences as I am sure it's something minor. I can't help u fix code that isn't actually broken. I wish you luck though.

Looping through a row and copying each cell In a specific procedure

What I have to do is use Excel VBA to:
login to Amazon Seller
open a workbook
loop through a column to get an order number
put it in the search box
hit the search button
go to the order page and extract the data
then have the extracted data go back into a specified column in
another Excel workbook
The loop and order number parts are what I'm currently stumped on. I've figured out this much code as of this moment:
Sub MyAmazonSeller()
Dim MyHTML_Element As IHTMLElement
Dim MyURL As String
Dim oSignInLink As HTMLLinkElement
Dim oInputEmail As HTMLInputElement
Dim oInputPassword As HTMLInputElement
Dim oInputSigninButton As HTMLInputButtonElement
'InputSearchOrder will be the destination for order numbers taken from the workbook
Dim InputSearchOrder As HTMLInputElement
Dim InputSearchButton As HTMLInputButtonElement
Dim IE As InternetExplorer
Dim AAOrder As Workbook
Dim AAws As Worksheet
MyURL = "https://sellercentral.amazon.com/gp/homepage.html"
Set IE = New InternetExplorer
' Open the browser and navigate.
With IE
.Silent = True
.Navigate MyURL
.Visible = True
Do
DoEvents
Loop Until .ReadyState = READYSTATE_COMPLETE
End With
' Get the html document.
Set HTMLDoc = IE.Document
' See if you have the sign in link is because you are in the main
' page
Set oSignInLink = HTMLDoc.getElementById("signin-button-container")
If Not oSignInLink Is Nothing Then
oSignInLink.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
End If
' Get the email field and the next button
Set oInputEmail = HTMLDoc.getElementById("username")
Set oInputPassword = HTMLDoc.getElementById("password")
' Click the button and wait
oInputEmail.Value = "xxxxxx#xxxxxx.net"
' Get the password field and the sign in button
Set oInputPassword = HTMLDoc.getElementById("password")
Set oInputSigninButton = HTMLDoc.getElementById("sign-in-button")
' Click the button and wait
oInputPassword.Value = "xxxxxxxx"
oInputSigninButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
Application.Wait (Now + TimeValue("0:00:05"))
Set AAOrder = Application.Workbooks.Open("Z:\Employee Folders\Employee\trackingnumber_sample_spreadsheet.xls")
Set AAws = AAws.Worksheets("PrimeOrdersWithNoFulfillmentRe")
Set InputSearchOrder = HTMLDoc.getElementById("sc-search-field")
'What I'm currently stuck on
InputSearchOrder.Value = "001-7163923-7572632"
Set InputSearchButton = HTMLDoc.getElementsByClassName("sc-search-button")(0)
InputSearchButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
'Was able to add this snippet, but I'm getting an error 13, most likely with
'my e variable. I'm basically trying to do a loop within a loop, extracting 5
'pieces of data and sticking them back into their respective columns in the
'original Excel sheet. The problem comes when scraping the HTML. I'm basically
'trying to get text in the tables which have a few levels and it's frustrating
'me to no end.
With HTMLDoc
Set elems = HTMLDoc.getElementsByTagName("td")
For Each e In elems
If e.innerText Like "*1Z*" Then
Range("D2").Value = e.innerText
End If
Next e
End With
Err_Clear:
If Err <> 0 Then
Err.Clear
Resume Next
End If
End Sub

Resources