msxml2.xmlhttp IE 11 vs Chrome - excel

I have a VBA program which reads HTML contents, processes it and outputs the result onto an Excel spreadsheet.
The program runs without issue when the default browser is set to Chrome; however, testing the program on a machine where IE 11 is the default browser results in a run-time error.
I believe that the issue relates to the following code:
Dim oDom As Object: Set oDom = CreateObject("htmlFile")
With CreateObject("msxml2.xmlhttp")
.Open "GET", "http://bouch/jira/browse/MAX-1", False
.Send
oDom.body.innerHtml = .responseText
End With
With IE 11, I don't think that msxml2.xmlhttp is returning anything (or whatever it does return causes later parts of the code to fail).
I've tried msxml2.xmlhttp.3.0 and msxml2.xmlhttp.6.0, but to no avail.
Is there a simple answer?
Many thanks in advance.

Related

Extracting data from a URL in VBA when the site no longer supports IE

I have been using the following Excel VBA macro to bring back data from a website. It worked fine until a few days ago when the website stopped supporting IE. Of course the macro just fails now as there is no data on the webpage to bring back to Excel, Is there a way to have the "Get method" (MSXML2.XMLHTTP)
Here is my code:
Public Sub GGGG()
Dim MSX As Object
Dim HTML As HTMLDocument
Dim URL As String
Dim UrlResponse As String
Dim N As Long
Dim sht1 As Worksheet, sht2 As Worksheet ' each variable needs its own type, otherwise sht1 is a Variant
' On Error Resume Next
Set MSX = CreateObject("MSXML2.XMLHTTP")
Set HTML = New HTMLDocument
URL = "https://www.justdial.com/Agra/Yogi-General-Store-Opp-Eclave-Satiudum-Sadar-Bazaar/0562P5612-5612-120207212812-H5I2_BZDET"
With MSX
.Open "GET", URL, False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
UrlResponse = StrConv(.responseBody, vbUnicode)
End With
ActiveCell.Offset(0, 1) = UrlResponse
End Sub
I get a response like this:
Error
An error occurred while processing your request.
Reference #97.ec8a2c31.1621136928.281f3ca8
Can anyone please help me get the data now that the site does not support IE?
I am not an expert in coding
Okay, try this to get the title and votes from that site using vba in combination with selenium.
Sub FetchInfo()
Dim driver As Object, oTitle As Object
Dim oVotes As Object
Set driver = CreateObject("Selenium.ChromeDriver")
driver.get "https://www.justdial.com/Agra/Yogi-General-Store-Opp-Eclave-Satiudum-Sadar-Bazaar/0562P5612-5612-120207212812-H5I2_BZDET"
Set oTitle = driver.FindElementByCss("span.item > span", Raise:=False, timeout:=10000)
Set oVotes = driver.FindElementByCss("span.rtngsval > span.votes", Raise:=False, timeout:=10000)
Debug.Print oTitle.Text, oVotes.Text
End Sub
When a webpage no longer supports IE, you can scrape it through Google Chrome instead by installing the Selenium add-in; see the links below for the installation and for how to drive it from VBA. In my opinion, though, the simplest way to do this job is the free UiPath Community edition, which works with all types of web browser.
VBA guideline:
https://www.wiseowl.co.uk/vba-macros/videos/vba-scrape-websites/web-scraping-selenium-chrome/
VBA library installation for Selenium:
https://code.google.com/archive/p/selenium-vba/downloads
You probably need to set the Feature Browser Emulation to zero as detailed by Daniel here:
Everything You Never Wanted to Know About the Access WebBrowser Control
That said, your URL fails even when opened in Edge Chromium, so the site may suffer from a general failure.

Can MSXML2.XMLHTTP be used with Chrome

I have been using the following Excel VBA macro to bring back data from a website. It worked fine until a few days ago when the website stopped supporting IE. Of course the macro just fails now as there is no data on the webpage to bring back to Excel, just a message saying, "Your browser, Internet Explorer, is no longer supported." Is there a way to have the "Get method" (MSXML2.XMLHTTP) use Chrome instead of IE to interact with the website? BTW, my default browser is already set to "Chrome".
Dim html_doc As HTMLDocument ' note: reference to Microsoft HTML Object Library must be set
Sub KS()
' Define product url
KS_url = "https://www.kingsoopers.com/p/r-w-knudsen-just-blueberry-juice/0007468210784"
' Collect data
Set html_doc = New HTMLDocument
Set xml_obj = CreateObject("MSXML2.XMLHTTP")
xml_obj.Open "GET", KS_url, False
xml_obj.send
html_doc.body.innerHTML = xml_obj.responseText
Set xml_obj = Nothing
KS_product = html_doc.getElementsByClassName("ProductDetails-header")(0).innerText
KS_price = "$" & html_doc.getElementsByClassName("kds-Price kds-Price--alternate mb-8")(1).Value
' do stuff with KS_product and KS_price
End Sub
The check for this is a basic server check on user agent. Tell it what it wants to "hear" by passing a supported browser in the UA header...(or technically, in this case, just saying the equivalent of: "Hi, I am not Internet Explorer".)
It can be as simple as xml.setRequestHeader "User-Agent", "Chrome". I said basic because you could even pass xml.setRequestHeader "User-Agent", "I am a unicorn", so it is likely an exclusion based list on the server for Internet Explorer.
Option Explicit
Public Sub KS()
Dim url As String
url = "https://www.kingsoopers.com/p/r-w-knudsen-just-blueberry-juice/0007468210784"
Dim html As MSHTML.HTMLDocument, xml As Object
Set html = New MSHTML.HTMLDocument
Set xml = CreateObject("MSXML2.XMLHTTP")
xml.Open "GET", url, False
xml.setRequestHeader "User-Agent", "Mozilla/5.0"
xml.send
html.body.innerHTML = xml.responseText
Debug.Print html.getElementsByClassName("ProductDetails-header")(0).innerText
Debug.Print "$" & html.getElementsByClassName("kds-Price kds-Price--alternate mb-8")(1).Value
Stop
End Sub
Compare that with adding no UA or adding xml.setRequestHeader "User-Agent", "MSIE".
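That comparison can be sketched as a small loop that sends the same request with several User-Agent values and prints the status code and response length for each. The names CompareUserAgents and AgentLabel and the list of test strings are assumptions for illustration, not part of the original answer; an empty string stands for "do not set the header at all". MSXML2.XMLHTTP may serve repeated GETs from cache, so treat the output as a rough indication only.

```vba
Option Explicit

' Hypothetical helper: readable label for a User-Agent test value.
Public Function AgentLabel(ByVal agent As String) As String
    If Len(agent) = 0 Then
        AgentLabel = "(no User-Agent)"
    Else
        AgentLabel = agent
    End If
End Function

' Hypothetical helper: sends the same GET with different User-Agent headers
' and prints the HTTP status and response length for each one.
Public Sub CompareUserAgents()
    Const url As String = "https://www.kingsoopers.com/p/r-w-knudsen-just-blueberry-juice/0007468210784"
    Dim agents As Variant, agent As Variant
    Dim xml As Object

    ' "" means: leave the User-Agent header unset
    agents = Array("", "MSIE", "Mozilla/5.0", "Chrome")

    For Each agent In agents
        Set xml = CreateObject("MSXML2.XMLHTTP")
        xml.Open "GET", url, False
        If Len(agent) > 0 Then xml.setRequestHeader "User-Agent", CStr(agent)
        xml.send
        Debug.Print AgentLabel(CStr(agent)), "status=" & xml.Status, "length=" & Len(xml.responseText)
    Next agent
End Sub
```

If the server really does exclude only Internet Explorer, the "MSIE" and empty cases would be expected to return the "no longer supported" page, while the other two return the product page.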
Study the article here by Daniel Pineault and this paragraph:
Feature Browser Emulation
Also note my comment dated 2020-09-13.

WinHttpRequest fails with automation error but XMLHTTP60 works

I've been doing quite a bit of web scraping over the past year and at some point, for reasons I don't remember anymore, I decided to use the Microsoft WinHTTP Services version 5.1 library as my default solution when sending HTTP requests.
I've never had any problems with it and I have achieved anything I ever attempted to do as far as web scraping is concerned.
That is, until I tried the following:
Sub nse()
Dim req As New WinHttpRequest
Dim url As String, requestPayload As String
url = "https://www.niftyindices.com/Backpage.aspx/getHistoricaldatatabletoString"
requestPayload = "{'name':'NIFTY 50','startDate':'01-Feb-2020','endDate':'01-Feb-2020'}"
With req
.Open "POST", url, False
.setRequestHeader "Content-Type", "application/json; charset=UTF-8"
.send requestPayload
Debug.Print .responseText
End With
End Sub
The .send method fails with a
Run-time error -2147012894 (80072ee2) Automation error
Changing to Dim req As New MSXML2.XMLHTTP60 solves the issue completely.
What am I missing here? Could it be website specific somehow? Is there something in the inner workings of these 2 libraries I should know?
Any input would be appreciated.
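One detail that may help narrow this down: the low 16 bits of 80072EE2 are 2EE2 hex, i.e. 12002, which is WinHTTP's timeout code. The sketch below only makes that visible and tries two things that are cheap to test, longer timeouts and a browser-like User-Agent. The names nseDiagnose and Win32CodeOf, the timeout values, and the User-Agent header are all assumptions, and nothing here is confirmed to fix the request for this particular site.

```vba
Option Explicit

' Hypothetical helper: low 16 bits of an HRESULT.
' For WinHTTP failures this is the WinHTTP error code (12002 = timeout).
Public Function Win32CodeOf(ByVal hr As Long) As Long
    Win32CodeOf = hr And &HFFFF&
End Function

Public Sub nseDiagnose()
    Dim req As Object
    Dim url As String, requestPayload As String

    url = "https://www.niftyindices.com/Backpage.aspx/getHistoricaldatatabletoString"
    requestPayload = "{'name':'NIFTY 50','startDate':'01-Feb-2020','endDate':'01-Feb-2020'}"

    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    On Error GoTo Failed
    With req
        ' Resolve, connect, send, receive timeouts in milliseconds (arbitrary values)
        .SetTimeouts 30000, 30000, 30000, 60000
        .Open "POST", url, False
        .setRequestHeader "Content-Type", "application/json; charset=UTF-8"
        ' Assumption: the server may treat requests without a browser-like UA differently
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send requestPayload
        Debug.Print .Status, Left$(.responseText, 200)
    End With
    Exit Sub

Failed:
    ' Report the failure instead of stopping on the automation error
    Debug.Print "HRESULT &H" & Hex$(Err.Number), "WinHTTP code " & Win32CodeOf(Err.Number)
End Sub
```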

Export HTML to text file with different results

I have two procedures that are supposed to export a page's HTML to a text file:
Sub Demo1()
Dim http As New XMLHTTP60
Dim html As New HTMLDocument
With http
.Open "GET", "https://www.google.com.eg/", False
.send
html.body.innerHTML = .responseText
WriteTxtFile html.body.innerHTML
End With
End Sub
Sub WriteTxtFile(ByVal aString As String, Optional ByVal filePath As String = "C:\Users\Future\Desktop\Output.txt")
Dim fso As Object
Dim fileout As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Set fileout = fso.CreateTextFile(filePath, True, True)
fileout.write aString
fileout.Close
End Sub
Sub Demo2()
Dim ie As Object
Dim f As Integer
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True
.navigate ("https://www.google.com.eg/")
Do: DoEvents: Loop Until .readyState = 4
f = FreeFile()
Open ThisWorkbook.Path & "\Sample.txt" For Output As #f
Print #f, .document.body.innerHTML
Close #f
.Quit
End With
End Sub
Demo1 and Demo2 are the two procedures; they produce "Output.txt" and "Sample.txt" respectively.
However, the two HTML documents are different.
Can you help me understand which one is correct, and why they differ?
Thanks in advance for your help.
XMLHTTP does not provide all the rendered content of a webpage, in particular anything rendered via JavaScript, because no scripts are executed.
Internet Explorer, on the other hand, will render the page fully, provided the browser version supports the JavaScript syntax used. For example, you will run into problems with ES6, the newer ECMAScript syntax, as it is not supported on legacy browsers; I believe it is supported on Edge for Windows 10. You can check compatibility tables to see what is and isn't supported.
If you familiarize yourself with dev tools for your browser you can explore how different parts of a webpage are rendered. You can learn to debug scripts and see what changes are made to the DOM and page styling. Often a page will issue XHR requests to update content on a page for example. If you want to have a play look here.
So, I suspect that the first html document may have less content and a different overall DOM structure from the second on this basis.
To test for differences caused by the way the text file is written, you need to compare apples with apples, i.e. use the same access method and syntax to retrieve the page content before writing it out.
Please provide the differences if you want a deeper explanation.
Exploring page updating:
Firefox Network Tab
Internet Explorer Network Inspector
Chrome Network Tab
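A quick way to see the effect described above is to check whether something visible in the browser is present in the raw XMLHTTP response at all. The following is a minimal sketch; the names ContainsMarker, CountOccurrences and CheckStaticHtml, and the marker string, are placeholders for illustration. If the marker is missing from the raw response but visible in IE, that content was most likely added by script after the page loaded.

```vba
Option Explicit

' Hypothetical helper: True when marker occurs in html (case-insensitive).
Public Function ContainsMarker(ByVal html As String, ByVal marker As String) As Boolean
    ContainsMarker = InStr(1, html, marker, vbTextCompare) > 0
End Function

' Hypothetical helper: number of case-insensitive occurrences of token in text.
Public Function CountOccurrences(ByVal text As String, ByVal token As String) As Long
    If Len(token) = 0 Then Exit Function ' avoid division by zero; returns 0
    CountOccurrences = (Len(text) - Len(Replace(text, token, "", , , vbTextCompare))) \ Len(token)
End Function

Public Sub CheckStaticHtml()
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", "https://www.google.com.eg/", False
    http.send

    ' Replace the placeholder with text you can see in the rendered page
    Debug.Print "In raw response: " & ContainsMarker(http.responseText, "some text seen in the browser")
    ' Script tags arrive as plain text here; none of them have been executed
    Debug.Print "Script tags in raw response: " & CountOccurrences(http.responseText, "<script")
End Sub
```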

Getting error in xmlhttp query but working fine with IE query

Sub Test(URL As Range)
For Each cell In URL
If InStr(1, cell.Value, "https://www.kimptonhotels", vbTextCompare) Then
Dim xml_obj As New XMLHTTP
xml_obj.Open "GET", cell, False
xml_obj.send
Dim htmldoc As New HTMLDocument
htmldoc.body.innerHTML = xml_obj.responseText
I am trying to get some data from this link using VBA in Excel.
I get an error on this line:
xml_obj.send
I can get the data I need without problems by using an IE object instead.
When I try the xmlhttp method, I get this error on xml_obj.send:
Error -2146697211 (800C0005) The system cannot locate the resource specified
I searched Google but couldn't find anything similar.
I am asking mostly out of curiosity: what is the problem, and why am I getting this error?
The xmlhttp code works fine with the other https sites I checked.
Thanks
