VBA: It only shows ID at Inspect Element - excel

I'm a newbie at VBA programming, and I use it to build a macro in Excel. Here is the webpage: http://www.eppraisal.com/home-values/property/14032-s-atlantic-ave-riverdale-il-60827-59143864/
I want to extract the value of $52,920 in the upper right corner. When I select Inspect Element, this is what I see:
<p id="eppraisalval">$52,920</p>
I checked the Page Source, but I couldn't find it. The closest thing I found:
<span id="eppraisal_val" class="valuation-estimates-price ajaxload" data-url="/home-values/property_lookup_eppraisal?a=14032%20S%20Atlantic%20Ave&z=60827&propid=59143864">loading...</span>
I tried getElementById("eppraisal_val") and getElementById("eppraisalval"), but neither of them worked.
How can I change my code to get that element?
Here is more from the Inspect Element window, around that code:
<span id="eppraisal_val" class="valuation-estimates-price ajaxload" data-url="/home-values/property_lookup_eppraisal?a=14032%20S%20Atlantic%20Ave&z=60827&propid=59143864"><p id="eppraisalval">$52,920</p><p class="main-page-description-small valuation_details" style="display:none;margin-top:5px">Low: $44,982 <br>High: $60,858</p></span>
Here is the shorter version of code I tried:
Sub macroID()
On Error Resume Next
Dim ie As Object
Set ie = CreateObject("internetexplorer.application")
Dim doc As HTMLDocument
Dim valuation As String
Dim val1 As String, val2 As String
Set doc = ie.Document
ie.Visible = True
ie.navigate "http://www.eppraisal.com/home-values/property/14032-s-atlantic-ave-riverdale-il-60827-59143864/"
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.Document
valuation = doc.getElementById("eppraisalval")
Cells(1, 4).Value = valuation
Application.Wait (Now + TimeValue("0:00:03"))
End Sub

The issue with scraping that URL is that there is a verification page beforehand that requires you to confirm you "are not a robot", which makes it hard to scrape anything from it.
If you complete this manually first, it may be saved in your cache and then allow you to run macros freely to scrape the website, but you would have to try this out.
In the meantime, the only issue I can see in your code is that you haven't included .innerText after .getElementById("eppraisalval"). The valuation line should look like this:
valuation = doc.getElementById("eppraisalval").innerText
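Because the macro in the question starts with On Error Resume Next, a failing line is skipped silently and the cell just stays empty. A minimal sketch of a more explicit check follows; the helper name WriteValuation and the "(element not found)" marker are made up for illustration, and it assumes doc is the document of a page that has already finished loading:

```vba
' Hypothetical helper (name, marker text and target cell are assumptions):
' writes the text of the element with id "eppraisalval" to cell D1.
Sub WriteValuation(doc As Object)
    Dim el As Object

    ' getElementById hands back an element object, so Set is required
    Set el = doc.getElementById("eppraisalval")

    If el Is Nothing Then
        ' No element was returned, e.g. the value has not been loaded yet
        Cells(1, 4).Value = "(element not found)"
    Else
        ' innerText is the visible text of the element
        Cells(1, 4).Value = el.innerText
    End If
End Sub
```

It could be called as WriteValuation ie.Document once the wait loop has finished.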

Related

Scraping table behind login wall

I am struggling to get the right piece of code to scrape a table from a password-protected website into an Excel workbook. I have been able to get all of the code to work up to the part that scrapes the table. When I run the code, it opens IE and logs in, but then errors out (91: Object variable or With block variable not set). The code is below:
Private Sub CommandButton3_Click()
'Declare variables
Dim IE As Object
Dim Doc As HTMLDocument
Dim HTMLTable As Object
Dim TableRow As Object
Dim TableCell As Object
Dim myRow As Long
'Create a new instance of Internet Explorer
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
'Navigate to the website
IE.Navigate "https://www.myfueltanksolutions.com/validate.asp"
'Wait for the page to finish loading
Do While IE.ReadyState <> 4
DoEvents
Loop
'Set the document object
Set Doc = IE.Document
'Fill in the security boxes
Doc.all("CompanyID").Value = "ID"
Doc.all("UserId").Value = "Username"
Doc.all("Password").Value = "Password"
'Click the submit button
Doc.all("btnSubmit").Click
'Wait for the page to finish loading
Do Until IE.ReadyState = READYSTATE_COMPLETE
DoEvents
Loop
'Set the HTMLTable object
Set HTMLTable = Doc.getElementById("RecentInventorylistform")
'Loop through each row in the table
For Each TableRow In HTMLTable.getElementsByTagName("tr")
'Loop through each cell in the row
For Each TableCell In TableRow.getElementsByTagName("td")
'Write the table cell value to the worksheet
Worksheets("Sheet1").Range("A5").Offset(myRow, 0).Value = TableCell.innerText
myRow = myRow + 1
Next TableCell
Next TableRow
Do Until IE.ReadyState = READYSTATE_COMPLETE
DoEvents
Loop
'Log out and close website
IE.Navigate ("https://www.myfueltanksolutions.com/signout.asp?action=rememberlogin")
IE.Quit
End Sub
I have included the HTML code of the table I am trying to scrape on the re-directed page after login.
I won't tire of saying it again and again and again ... ;-)
Don't work with IE anymore. MS is actively phasing it out!
But as an explanation:
I'm sure this is the code fragment that doesn't do what you expect:
...
...
'Wait for the page to finish loading
Do Until IE.ReadyState = READYSTATE_COMPLETE
DoEvents
Loop
'Set the HTMLTable object
Set HTMLTable = Doc.getElementById("RecentInventorylistform")
...
...
Waiting for READYSTATE_COMPLETE doesn't work here (for whatever reason). So the code carries on without stopping and does not wait for the new content to load. Using the result of getElementByID() then ends in the error named above, because there is no element with that id.
Excursus on some get methods of the DOM (Document Object Model):
The methods getElementsByTagName() and getElementsByClassName() build a node collection that contains all elements matching the given criterion. If you build such a collection with getElementsByTagName("a"), you get a collection of all anchor tags. Every element of the collection can be addressed by its index, like in an array. If you want to know how many elements are in such a collection, you can read its length property. If there is no element of the kind you asked for, in our example a tags, the length will be 0. But the collection was still built, so you have an object.
The get methods that build a collection have an s for plural in ...Elements... But getElementByID() has no s because an id may occur only once in an HTML document. No collection is needed here. The method getElementByID() always tries to deliver a single object for the given criterion. If there is no such element, you get the error that there is no object.
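The collection behaviour described above can be sketched as follows. This is only an illustration: the helper name ListAnchors is made up, and doc is assumed to be an already loaded HTML document such as IE.Document.

```vba
' Hypothetical helper: prints how many a tags a document contains
' and, if there are any, the text of the first one.
Sub ListAnchors(doc As Object)
    Dim anchors As Object

    ' A collection object is built even when nothing matches
    Set anchors = doc.getElementsByTagName("a")
    Debug.Print "Number of a tags: " & anchors.Length

    ' Guard the index access: item 0 only exists if Length > 0
    If anchors.Length > 0 Then
        Debug.Print "First link text: " & anchors(0).innerText
    End If
End Sub
```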
How to solve the issue:
We must change the termination criterion and the body of the loop. We must ask again and again whether the element with the wanted id is present. To do that we use the given line:
Set HTMLTable = Doc.getElementById("RecentInventorylistform")
As I said before, an error is raised if the element is not present. That's right. But with On Error Resume Next we can ignore any error in the code.
Attention!
Only use this in specific situations, and switch back to normal error handling with On Error GoTo 0 after the critical part of the code.
Replace the code fragment I posted above in this answer with the following one:
(To avoid endless loops, it is recommended to use a timeout mechanism too. But I will keep it simple here.)
Do
On Error Resume Next
Set HTMLTable = Doc.getElementById("RecentInventorylistform")
On Error GoTo 0
Loop While HTMLTable Is Nothing
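The timeout mechanism mentioned above could look roughly like this. It is a sketch, not part of the original answer: the function name WaitForElement and the 15 second default are assumptions, and it asks IE.Document on every pass instead of reusing the Doc variable, on the assumption that the document object changes after the login form is submitted.

```vba
' Hypothetical helper: waits until an element with the given id is
' present in the document currently shown by IE, or gives up after
' timeoutSeconds. Returns Nothing if the timeout is reached.
Function WaitForElement(IE As Object, elementId As String, _
                        Optional timeoutSeconds As Long = 15) As Object
    Dim deadline As Date
    Dim found As Object

    deadline = DateAdd("s", timeoutSeconds, Now)
    Do
        DoEvents
        ' Ignore errors for this single statement only
        On Error Resume Next
        Set found = IE.Document.getElementById(elementId)
        On Error GoTo 0
    Loop While found Is Nothing And Now < deadline

    Set WaitForElement = found
End Function

' Possible use inside CommandButton3_Click:
'   Set HTMLTable = WaitForElement(IE, "RecentInventorylistform")
'   If HTMLTable Is Nothing Then
'       MsgBox "Table not found within the timeout"
'       Exit Sub
'   End If
```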

Excel VBA - Web Scraping - Get value in HTML Table cell

I am trying to create a macro that scrapes a cargo tracking website.
But I have to create 4 such macros as each airline has a different website.
I am new to VBA and web scraping.
I have put together code that works for one website. But when I tried to replicate it for another one, I got stuck in the loop. I think it may be how I am referring to the element, but like I said, I am new to VBA and have no clue about HTML.
I am trying to get the "notified" value in the highlighted line from the image.
[Image: the "notified" text to be extracted]
Below is the code I have written so far that gets stuck in the loop.
Any help with this would be appreciated.
Sub FlightStat_AF()
Dim url As String
Dim ie As Object
Dim nodeTable As Object
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-92366691"
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We can do that in a loop to match the point the information is available
Do
On Error Resume Next
Set nodeTable = ie.document.getElementByClassName("block-whisper")
On Error GoTo 0
Loop Until Not nodeTable Is Nothing
'Get the status from the table
MsgBox Trim(nodeTable.getElementsByClassName("fs-12 body-font-bold").innerText)
'Clean up
ie.Quit
Set ie = Nothing
Set nodeTable = Nothing
End Sub
Some basics:
For simple accesses like the present one, you can use the get methods of the DOM (Document Object Model). But there is an important difference between getElementByID() and getElementsByClassName() / getElementsByTagName().
getElementByID() searches for the unique ID of an HTML tag. This is written as the id attribute of the tag. If the page keeps to the HTML standard, there is only one element with this unique ID. That's the reason why the method begins with getElement.
If the ID is not found when using the method, VBA throws a runtime error. Therefore, in the loop from my other answer, the call is wrapped in switching error handling off and on again. But on the page from this question there is no ID for the HTML area in question.
Instead, the required element can be accessed directly. You tried to access it with getElementsByClassName(). That's right. But here comes the difference from getElementByID().
getElementsByClassName() and getElementsByTagName() begin with getElements. That's plural because there can be as many elements with the same class or tag name as you want. Both methods create an HTML node collection. All HTML elements with the requested class or tag name will be listed in that collection.
All elements have an index, just like in an array. The indexes start at 0. To access a particular element, the desired index must be specified. The two class names fs-12 body-font-bold (class names are separated by spaces; you can also build a node collection by using only one class name) deliver 2 HTML elements to the node collection. You want the second one, so you must use the index 1.
This is the VBA code for the page in question, using IE:
Sub FlightStat_AF()
Dim url As String
Dim ie As Object
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-92366691"
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fixed manual pause of a few seconds
'because the whole page will be reloaded
'The three TimeSerial values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MsgBox Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
'Clean up
ie.Quit
Set ie = Nothing
End Sub
Edit: Sub as function
This sub tests the function:
Sub testFunction()
Dim flightStatAfResult As String
flightStatAfResult = FlightStat_AF("057-92366691")
MsgBox flightStatAfResult
End Sub
This is the sub rewritten as a function:
Function FlightStat_AF(cargoNo As String) As String
Dim url As String
Dim ie As Object
Dim result As String
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/" & cargoNo
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fixed manual pause of a few seconds
'because the whole page will be reloaded
'The three TimeSerial values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
result = Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
'Clean up
ie.Quit
Set ie = Nothing
'Return value of the function
FlightStat_AF = result
End Function
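The comment in the function mentions handling several numbers in a loop. A minimal sketch of that idea follows, assuming the cargo numbers are in column A of the active sheet from row 2 downwards and the results should go to column B; the sub name and the sheet layout are assumptions, and it relies on the FlightStat_AF function above being in the same project.

```vba
' Sketch: looks up the status for every cargo number in column A
' of the active sheet and writes the result next to it in column B.
' The layout (numbers from A2 downwards) is an assumption.
Sub FlightStat_AF_Batch()
    Dim lastRow As Long
    Dim r As Long

    ' Last used row in column A
    lastRow = Cells(Rows.Count, "A").End(xlUp).Row

    For r = 2 To lastRow
        ' Skip empty cells
        If Len(Trim(Cells(r, "A").Value)) > 0 Then
            Cells(r, "B").Value = FlightStat_AF(CStr(Cells(r, "A").Value))
        End If
    Next r
End Sub
```

Each call opens and closes its own IE instance and waits three seconds, so a long list will take a while.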

Filling Internet Explorer Forms from Excel

I want to fill information onto the site 'select picker' section on the left side of the picture.
SITE
There is an element named 'picker', but setting it doesn't work.
The search bar is completed using jQuery auto-complete, which submits a form and navigates to an associate once one has been selected from the auto-complete box.
I tried
IE.Document.All("picker").Value = "Testing"
IE.Document.GetElementByID("pickerCtrl.currentPicker").Value = "Testing"
IE.Document.GetElementsByClassName("input-group input-group-sm").Item(0).Value = "Testing"
PS.
Is there a way to just copy and paste the row at the position where the cursor currently is?
I can sendkeys tab 18 times to get to that specific search bar but do not know how to insert the entire row from my Excel sheet.
eg. IE.Document.all("picker").Value = ThisWorkbook.Sheets("Main_1").Range("B" & intRow).Value
I am completely new to this.
Try testing with the example below; it may help to solve the issue.
Sub demo1()
Dim i As Long
Dim URL As String
Dim IE As Object
Dim HWNDSrc As Long
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
URL = "D:\Backup20190913\tests\367.html"
IE.Navigate URL
Do While IE.ReadyState = 4: DoEvents: Loop
Do Until IE.ReadyState = 4: DoEvents: Loop
IE.Document.querySelector("[name='picker']").Value = "test value"
Stop
Set IE = Nothing
End Sub
HTML
<input class="form-control ng-valid ng-isolate-scope ng-touched ng-dirty" ng-model="pickerCtrl.currentpicker" name="picker" type="text" typeahead="" typeahead-source="pickerCtrl.getpickerStrings()" ng-disabled="pickerCtrl.errorLoadingPickers || pickerCtrl.noPickerstrings" ng-hide="pickerCtrl.disableTypeahead" autocomplete="off">
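The input above carries an ng-model attribute, so script on the page may only react to the new value after an event has been fired. Whether that is needed here is not known from the thread. The following is a sketch under that assumption, with a made-up helper name, using the standard DOM calls createEvent, initEvent and dispatchEvent:

```vba
' Hypothetical helper: sets the value of an input and then fires an
' "input" event so that script on the page can notice the change.
Sub SetValueAndNotify(doc As Object, selector As String, newValue As String)
    Dim el As Object
    Dim evt As Object

    Set el = doc.querySelector(selector)
    If el Is Nothing Then
        Debug.Print "No element matches " & selector
        Exit Sub
    End If

    el.Value = newValue

    ' Standard DOM event calls; they assume the page is rendered
    ' in IE9 document mode or later
    Set evt = doc.createEvent("HTMLEvents")
    evt.initEvent "input", True, False
    el.dispatchEvent evt
End Sub
```

It could be called as SetValueAndNotify IE.Document, "[name='picker']", "test value".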

IE click on a button that has no link associated to it (using Excel VBA)

I want to click on a "button" on a web page, but my problem is that this button doesn't seem to have a link attached to it. From the way I'm phrasing this, you can see I'm not at all familiar with the language of web browsers.
Anyway, I must use Internet Explorer, and here's the code I have so far:
Sub click_button_no_hlink()
Dim i As Long
Dim IE As Object
Dim Doc As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application") 'create IE instance
IE.Visible = True
IE.Navigate "https://apex.xyz.qc.ca/apex/prd1/f?p=135:LOGIN_DESKTOP::::::" ' Address of web page
Do While IE.Busy 'loading page
Application.Wait DateAdd("s", 1, Now)
Loop
'-------------Usually I would do something like this and it would work well.-----------------
Set link = IE.document.getElementsByTagName("a")
For Each l In link
a = l.innerText
If l.innerText = "Créer" Then '
l.Click
Exit For
End If
Next
'-------------That method works fine on other type of hlink to the page by the way------------
'-----------------------I also tried various methods around this with no luck------------------------------
Set objCollection = IE.document.getElementsByClassName("rc-content-buttons") 'recherche du bouton "Créer"
'---------------------------------
'--------------------or this also doesn't work-------------------------------
For Each btn In IE.document.getElementsByClassName("rc-content-buttons")
If btn.getAttribute("id") = "B91938241817236808" Then
btn.Click
Exit For
End If
Next btn
End Sub
For the sake of clarity, here's the code of the web page around the "button" I'm trying to interact with.
I've done a lot of research, but I'm at a dead end right now. Any help will be greatly appreciated.
Thanks in advance.
SOLVED FINAL CODE: With help from DOMENIC and QHARR and Yu Zhou
Sub click_button_no_hlink()
Dim i As Long
Dim IE As Object
Dim Doc As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application") 'create IE instance
IE.Visible = True
IE.Navigate "https://apex.xyz.qc.ca/apex/prd1/f?p=135:LOGIN_DESKTOP::::::" ' Address of web page
While IE.Busy: DoEvents: Wend 'loading page
IE.document.querySelector("[value='Créer']").FireEvent "onclick" 'works like a charm
' IE.document.querySelector("div.rc-content-buttons input").Click
'also works but speaks less by itself when reading code
' IE.document.getElementById("B91938241817236808").Click
'also works
End Sub
The button is an input element, not an a tag element, so gathering a tags will not capture what you want. Using the id, as suggested in the comments, is one way to go. If you get "element not found", check whether your element is within a parent frame or iframe, which then needs to be negotiated. The general syntax for that is
ie.document.getElementsByTagName("frame")(appropriateIndexHere).contentDocument.getElementById("B91938241817236808")
ie.document.getElementsByTagName("iframe")(appropriateIndexHere).contentDocument.getElementById("B91938241817236808")
If the id is dynamic you can use the value attribute
ie.document.querySelector("[value='Créer']")
ie.document.getElementsByTagName("frame")(appropriateIndexHere).contentDocument.querySelector("[value='Créer']") 'etc
As there is an event, you may need to fire it.
ie.document.querySelector("[value='Créer']").FireEvent "onclick"
ie.document.getElementsByTagName("frame")(appropriateIndexHere).contentDocument.querySelector("[value='Créer']").FireEvent "onclick" 'etc
And use a proper page load wait. So this,
Do While IE.Busy 'loading page
Application.Wait DateAdd("s", 1, Now)
Loop
Should be
While ie.Busy Or ie.ReadyState <> 4: DoEvents: Wend
getElementsByClassName returns an array-like object of all child elements which have all of the given class names, so I think it should be IE.document.getElementsByClassName("rc-content-buttons")(0).Click if it is the first element with the classname.
You could also try: IE.document.querySelector("div.rc-content-buttons input").Click.
And I think IE.document.getElementById("B91938241817236808").Click is also right.

Error "Object variable or with block variable not set" when using getElementsByClassName

I want to scrape some fields from Amazon.
At the moment I am using a link, and my VBA script returns the name and price.
For example:
I put the link into column A and get the other fields in the respective columns, e.g.: http://www.amazon.com/GMC-Denali-Black-22-5-Inch-Medium/dp/B00FNVBS5C/ref=sr_1_1?s=outdoor-recreation&ie=UTF8&qid=1436768082&sr=1-1&keywords=bicycle
However, I would also like to have the product description.
Here is my current code:
Sub ScrapeAmz()
Dim Ie As New InternetExplorer
Dim WebURL
Dim Docx As HTMLDocument
Dim productDesc
Dim productTitle
Dim price
Dim RcdNum
Ie.Visible = False
For RcdNum = 2 To ThisWorkbook.Worksheets(1).Range("A65536").End(xlUp).Row
WebURL = ThisWorkbook.Worksheets(1).Range("A" & RcdNum)
Ie.Navigate2 WebURL
Do Until Ie.ReadyState = READYSTATE_COMPLETE
DoEvents
Loop
Set Docx = Ie.Document
productTitle = Docx.getElementById("productTitle").innerText
'productDesc = Docx.getElementsByClassName("productDescriptionWrapper")(0).innerText
price = Docx.getElementById("priceblock_ourprice").innerText
ThisWorkbook.Worksheets(1).Range("B" & RcdNum) = productTitle
'ThisWorkbook.Worksheets(1).Range("C" & RcdNum) = productDesc
ThisWorkbook.Worksheets(1).Range("D" & RcdNum) = price
Next
End Sub
I am trying to get the product description by using productDesc = Docx.getElementsByClassName("productDescriptionWrapper")(0).innerText.
However, I get an error.
Object variable or with block variable not set.
Any suggestion why my statement does not work?
I appreciate your replies!
I'm pretty sure your problem is being caused by attempting to access the document before it's completely loaded. You're just checking ie.ReadyState.
This is my understanding of the timeline for loading a page with an IE control.
Browser connects to page: ie.ReadyState = READYSTATE_COMPLETE. At this point, you can access ie.document without causing an error, but the document has only started loading.
Document fully loaded: ie.document.readyState = "complete"
(note that frames may still be loading and AJAX processing may still be occurring.)
So you really need to check for two events.
Do
If ie.ReadyState = READYSTATE_COMPLETE Then
If ie.document.readyState = "complete" Then Exit Do
End If
Application.Wait DateAdd("s", 1, Now)
Loop
edit: after actually looking at the page you're trying to scrape, it looks like the reason it's failing is because the content you're trying to get at is in an iframe. You need to go through the iframe before you can get to the content.
productDesc = ie.document.getElementById("product-description-iframe").contentWindow.document.getElementsByClassName("productDescriptionWrapper")(0).innerText
