Replacing IE Bits with Edge in VBA - excel

To prepare for the eventual retirement of IE11, I've been trying to figure out how to replace a couple of parts of my code. One involves launching IE and using that browser to scrape some pages. Is there an equivalent way to do the below in Edge? I don't see a way to add a reference to the Edge libraries like I did with 'Microsoft Internet Controls' for IE11.
Dim ie As InternetExplorerMedium: Set ie = New InternetExplorerMedium
Dim html As HTMLDocument
With ie
    .Visible = False
    .Navigate website 'string that's created above this code
End With
Do While ie.ReadyState <> READYSTATE_COMPLETE
    DoEvents
Loop
Application.Wait Now + #12:00:10 AM#
Set html = ie.Document
Thanks everyone for your help.

Ok, a few explanations. I am writing these as a reply so as not to have to split them into several comments.
Does Edge work instead of IE to do web scraping with VBA?
It does not work directly. The reason is that IE has a COM interface (Wikipedia: Component Object Model). No other browser has this interface. Not even Edge.
For Edge, however, there is a WebDriver for Selenium, and it is provided directly by Microsoft.
Another alternative - xhr
Since you can't use Selenium because you don't have admin rights, it might be possible to use xhr (XMLHttpRequest) instead. To say for sure, however, we would have to know the page that you want to scrape.
Xhr can be used directly from VBA because it does not use a browser. The big limitation is that only static content can be processed. No JavaScript is executed, so nothing is reloaded or generated dynamically in any other way. On the other hand, this option is much faster than browser solutions. Often, a static file provided by the web server is sufficient. This can be an HTML file, a JSON document or another data exchange format.
There are many examples of using xhr with VBA here on SO. For now, just keep it in mind as another possible approach. I can't explain the method exhaustively here, partly because I don't know everything about it myself. But there are many ways to use it.
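As a rough illustration of the xhr approach, here is a minimal sketch. It assumes late binding (so no extra references are needed), a placeholder URL, and the hypothetical names FetchHtml and DemoFetchHtml. It only works for content that is present in the raw server response, and it still relies on the MSHTML parser component ("htmlfile") being available on the machine.

```vba
Option Explicit

' Fetches a page with XMLHTTP (no browser window) and returns it as a parsed HTML document.
' FetchHtml is a hypothetical helper name, not part of any library.
Public Function FetchHtml(ByVal url As String) As Object
    Dim request As Object
    Dim html As Object

    Set request = CreateObject("MSXML2.XMLHTTP.6.0")
    request.Open "GET", url, False   ' False = synchronous: send returns once the response has arrived
    request.send

    If request.Status <> 200 Then
        Err.Raise vbObjectError + 513, "FetchHtml", _
                  "HTTP status " & request.Status & " for " & url
    End If

    ' Parse the response text so that DOM methods can be used on it
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = request.responseText
    Set FetchHtml = html
End Function

Public Sub DemoFetchHtml()
    Dim html As Object
    Set html = FetchHtml("https://example.com/")   ' placeholder URL (assumption)
    Debug.Print html.getElementsByTagName("a").Length & " links found"
End Sub
```

Because the request is synchronous, no ReadyState loop or Application.Wait is needed: the code after send only runs once the complete response is available.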
By the way
IE will finally be discontinued in June 2022 and will then also no longer be delivered with Windows. That's what I read on the German IT pages a few days ago. But there are already massive restrictions on the use of IE.

Related

Best Way to Check if Browser Page is Ready in VBA

I've seen three different ways to check if the page I'm navigating to is ready, as shown in the sample code below.
It seems to me that Method 1 is the best, but I'm hoping an expert out there can tell me otherwise or, even better, provide the right way to do it if there is something different.
Here's the sample code
Sub OpenBrowser()
    Dim vOBJBROWSER As Object
    Set vOBJBROWSER = CreateObject("InternetExplorer.Application")
    vOBJBROWSER.Navigate "http://stackoverflow.com"
    'Method 1
    Do While vOBJBROWSER.Busy Or vOBJBROWSER.ReadyState <> 4
        DoEvents
    Loop
    'Method 2
    Do While vOBJBROWSER.ReadyState < 4
        DoEvents
    Loop
    'Method 3 (READYSTATE_COMPLETE = 4; the constant is only defined if
    'a reference to Microsoft Internet Controls is set)
    Do
    Loop Until vOBJBROWSER.ReadyState = READYSTATE_COMPLETE
    vOBJBROWSER.Visible = True
End Sub
The IE browser is going to make you really hate life in the long run.
As with any browser-based approach to web scraping, you only need the browser if you can't figure out which resource you're trying to load.
Consider all the overhead (JavaScript, CSS, potential tracking cookies) that comes with using a browser.
If you know what you want and can see in Chrome DevTools how it loads, you can use VBA's HTTP request libraries instead, and you'll have a much better time.
The advantage of an HTTP request is that even if the response is streamed or chunked, you can control and easily measure when the message is complete. With a web page, you'll always be stuck trying to figure out the status code, the subframes, and all kinds of crap.
I highly recommend channeling the frustration of IE automation into a learning experience with HTTP and Chrome DevTools. You will 100% be less likely to smash your keyboard.

Screenshotting Google Maps and Pasting into Excel Document VBA [duplicate]

This question already has answers here:
is it possible to display a Google Earth map INSIDE Excel?
(3 answers)
Closed 4 years ago.
I have code that already searches for the latitude and longitude and pastes them to my worksheet, which works perfectly. I'm looking for a way to take that latitude and longitude, load Google Maps, and either take a screenshot of the Google Maps page or embed the map into Excel.
The code below already loads Google Maps for any input address, but I do not know how to either take the screenshot of the map (preferably without the input information on the side of the page) or embed the map into Excel. The extra code at the bottom is a request/response for a USGS website that pulls official seismic information for a location, but it should not affect the top part of the code.
Please note that I want this to just be a static screenshot of the map if possible. I do not want to install Google Earth on multiple desktops to be able to embed an interactive map into the worksheet if at all possible.
Option Explicit
Public Sub Seismicgrab()
    Dim browser As New ChromeDriver
    Dim URL As String
    Dim ws As Object
    Dim xmlhttp As New MSXML2.XMLHTTP60
    browser.Get "http://www.google.com/maps?q=" & Range("H13").Value
    browser.Wait 5000
    Cells(19, 13).Value = browser.URL
    browser.Close
    URL = Range("M24").Value
    xmlhttp.Open "GET", URL, False
    xmlhttp.Send
    Worksheets("Title").Range("M25").Value = xmlhttp.responseText
End Sub
You can use the TakeScreenshot method of the driver object:
browser.TakeScreenshot.SaveAs ".....jpg" '<== put your path and file name here
For more flexibility, e.g. cropping, consider switching languages and using any of the methods described here:
How to capture the screenshot of a specific element rather than entire page using Selenium Webdriver?
Additionally, I believe there are ways to take a screenshot and then crop the image using standard VBA and Windows API calls.

Access web page body text using VBA & Selenium

I am trying to convert an Excel macro that currently uses Internet Explorer and uses the following line of code to extract the web page's <body> text:
x = .Document.DocumentElement.InnerText
Using the Selenium demo, I am able to produce a jpg of the page with Chrome & IE, but Firefox just loads a blank page and IE64 & Edge don’t work on Windows 10.
I have been unable to find the proper VBA command with Selenium to copy the body text to variable "x". I only want to read it.
I am trying to do this to make my macro browser independent.
The macro is for my use only.
Jim
You are not making it browser agnostic. You are simply widening the choice of browser to those supported via selenium basic. This brings some problems of its own which you are noticing.
The folders containing the drivers must be on the PATH environment variable, or the path must be passed to the Selenium WebDriver as an argument.
You should use the latest Chrome browser and Chrome driver.
You cannot use the latest Firefox browser and driver. It is not supported. I think you need FF v.46.0.1.
If using IE, then zoom must be set to 100%.
I suggest browsing the issues pages on GitHub for further known issues.
Anecdotally, I have heard some talk of problems with Windows 10 and Selenium Basic. I would be interested to know if anyone has got this working, as I am not on that version.
Review the examples.xlsm provided by selenium basic GitHub site to see which other browsers are supported (e.g. Opera, PhantomJS, FirefoxLight,CEF).
With Chrome you can get the body text with this:
Option Explicit
Public Sub GetInfo()
    Dim d As WebDriver, s As String
    Set d = New ChromeDriver
    Const URL = "https://www.neutrinoapi.com/api/api-examples/python/"
    With d
        .Start "Chrome"
        .get URL
        s = .FindElementByTag("body").Text
        Debug.Print s
        .Quit
    End With
End Sub
Other info: https://stackoverflow.com/a/52294259/6241235

IE automation through Excel vba

The problem that I'm having is quite simple. I'm opening a webpage, looking for the input box where I type some text, and then hitting a Search button. Once the new webpage is loaded, I gather all the info I need. My problem is the time spent loading the webpage. My gathering code doesn't work because the new webpage is still not loaded. I have the following code to wait for that:
Do While ie.ReadyState <> READYSTATE_COMPLETE
    DoEvents
Loop
where ie was set like this
Set ie = New InternetExplorer
Is there any other code, apart from Application.Wait, that I can use to fix this?
I've run into similar issues when attempting the same. The issue is that the ready state on the IE object can't always be trusted, or at the very least, it's not signaling what you think. For example, it will let you know when each frame is ready, not the whole page. So if you don't actually need to see the web browser control, and you only care about sending and receiving data, my suggestion is to not bother rendering the page in a web browser object and instead just send and receive data using a WinHttpRequest.
Tools>References>Microsoft WinHTTP Services
Using this, you can send and receive the HTML data directly. If your page uses URL parameters, you send a "GET" and then parse the reply. Otherwise you will have to send a "POST" with the form values in the request body (basically take the blank form page you begin with and set all the values). When first using it, it can be a bit tricky to get the formatting correct, depending on the complexity of the page you are trying to automate. Find a good web debugging tool (such as Fiddler) so that you can see the requests being sent to your target page.
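The idea can be sketched roughly as follows. This is only an illustration under assumptions: it uses late binding (so the reference is optional), HttpSend and DemoHttpSend are hypothetical names, and the URLs and form fields are placeholders to be replaced with what Fiddler shows for the real page. The form case is sent as a URL-encoded POST, which is how HTML forms are typically submitted.

```vba
Option Explicit

' Sends a request with WinHttpRequest and returns the response body as text.
' HttpSend is a hypothetical helper name, not part of the WinHTTP library.
Public Function HttpSend(ByVal method As String, ByVal url As String, _
                         Optional ByVal formBody As String = "") As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    http.SetTimeouts 5000, 5000, 10000, 30000   ' resolve, connect, send, receive (milliseconds)
    http.Open method, url, False                ' False = synchronous

    If Len(formBody) > 0 Then
        ' Form values go in the request body, URL-encoded
        http.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
        http.Send formBody
    Else
        http.Send
    End If

    If http.Status < 200 Or http.Status > 299 Then
        Err.Raise vbObjectError + 514, "HttpSend", "HTTP status " & http.Status
    End If

    HttpSend = http.ResponseText
End Function

Public Sub DemoHttpSend()
    ' Placeholder URLs and field names (assumptions)
    Debug.Print Left$(HttpSend("GET", "https://example.com/search?q=test"), 200)
    Debug.Print Left$(HttpSend("POST", "https://example.com/search", "q=test&page=1"), 200)
End Sub
```

Since Send is synchronous here, the function returns only after the whole response has been received (or a timeout raises an error), so there is no ready state to poll.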

JSP response content type Excel - file downloaded twice on IE8

When I set the response content type to Excel, the Open/Save dialog is shown twice, but only on IE8. It works fine on other browsers (tested on Chrome/Firefox/Opera).
The code for setting response content type is:
response.setContentType("application/vnd.ms-excel");
response.setHeader("Content-disposition","attachment;filename=abc.xls");
I searched for solutions/workarounds. Turning off SmartScreen didn't help.
Also, another suggestion was to wait for 5-10 sec before clicking Save/Open. That too didn't work.
What's the cause of this? Are there any IE specific workarounds?
It's a pain, but IE8 is still widely used by our users.
This is just a guess, but it could have something to do with the way Office (used to) embed itself in IE with plugins.
A workaround might be putting it in a zip file before sending it over to the user.

Resources