Clicking an online excel download button with VBA - excel

My goal is to click the Excel download button on this website. I keep getting 'Automation error. The interface is unknown' at my while loops.
Sub GetData()
Dim IE As InternetExplorerMedium
Dim HTMLDoc As HTMLDocument
Dim objElement As HTMLObjectElement
Set IE = New InternetExplorerMedium
With IE
.Visible = True
.Navigate "https://www.pimco.com/en-us/investments/mutual-funds"
Do While .readyState = 4: DoEvents: Loop
Do Until .readyState = 4: DoEvents: Loop
.document.getElementById("csvLink").Click
End With
Set IE = Nothing
End Sub

Here are some helper functions you can use to clean up your code: https://stackoverflow.com/a/59721369/12685075
I wouldn't try to cram all of that into a With block.
I would split each step into its own function.
Then check the ready state and, using error handling, make sure the element exists before you click it.
That said, you can probably skip loading Internet Explorer altogether and fetch the file directly with an XMLHTTP request. Open the page in Chrome, turn on DevTools, refresh the page, download the CSV, and look through the network requests.
You'll find one that corresponds to the downloaded file. It is likely a direct link that you can call, with its parameters, through XMLHTTP to get the file every time without waiting on page elements such as CSS, formatting, and fonts.
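If such a direct link turns up in the network requests, the download could look roughly like the sketch below. It assumes the server returns the file to a plain GET request without cookies or special headers, which may not hold for this site. The procedure name DownloadFile and its two parameters are hypothetical and only serve as an illustration.

```vba
Option Explicit

' Sketch: download a file straight from a URL without driving a browser.
' fileUrl and savePath are placeholders supplied by the caller.
Sub DownloadFile(ByVal fileUrl As String, ByVal savePath As String)
    Dim http As Object
    Dim stream As Object

    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", fileUrl, False   ' False = synchronous request
    http.send

    If http.Status <> 200 Then
        Err.Raise vbObjectError + 513, "DownloadFile", _
                  "Request failed with HTTP status " & http.Status
    End If

    ' Write the raw response bytes to disk
    Set stream = CreateObject("ADODB.Stream")
    stream.Type = 1                   ' 1 = adTypeBinary
    stream.Open
    stream.Write http.responseBody
    stream.SaveToFile savePath, 2     ' 2 = adSaveCreateOverWrite
    stream.Close
End Sub
```

You would call it with the URL copied from DevTools and a target path of your choice, for example DownloadFile theCopiedUrl, Environ$("TEMP") & "\funds.csv", where theCopiedUrl stands for whatever link you found.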

Some explanations of why you run into trouble:
Don't use InternetExplorerMedium.
The problem here is that IE is opened twice. After the first opening, it is immediately closed again and the URL is loaded in another instance. But this instance is no longer assigned to the IE variable and cannot be referenced by the macro. You can observe this when you execute your macro. IE seems to twitch once.
The lines Do While .readyState = 4: DoEvents: Loop and Do Until .readyState = 4: DoEvents: Loop wait for opposite conditions.
A readyState of 4 means the page has loaded completely. So you can use either Do While .readyState <> 4: DoEvents: Loop or Do Until .readyState = 4: DoEvents: Loop. One of the two loops is enough.
The page loads dynamic content after IE reports the state as complete.
For that reason you must pause until that content is loaded. The simplest way to do this is a fixed wait. Look at that part in the code below.
You must trigger the download.
To do this, you need SendKeys. This is not a good thing, but it can hardly be avoided here. I don't think there is a direct download link, as Peyter assumes, because I assume the file is only generated on request from the data displayed on the page. At least, that is my experience with such downloads.
Please read the comments in the macro I wrote above the Sendkeys() line to find the downloaded file on your computer afterwards.
Here is the code that works:
Sub GetData()
Dim IE As Object
Set IE = CreateObject("internetexplorer.application")
IE.Visible = True
IE.Navigate "https://www.pimco.com/en-us/investments/mutual-funds"
Do Until IE.readyState = 4: DoEvents: Loop
'Manual break to load dynamic content after
'the IE reports the ready state 'complete' (4)
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 10))
'Now we can click the button
IE.document.getElementById("csvLink").Click
'Here you need sendkeys to trigger the save button
'Don't touch anything while the code runs
'Sendkeys will send the key combination in the brackets
'to the application which has the focus
'The file will be saved to your standard download directory
'or to the download directory you placed in the IE settings
'if you did that
Application.SendKeys ("%{S}")
'Clean up
IE.Quit
Set IE = Nothing
End Sub
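A fixed 10-second wait can be too short on a slow connection and needlessly long on a fast one. One alternative, which is not part of the answer above, is to poll for the button until it exists or a timeout expires. The sketch below assumes the button keeps the id csvLink; the function name WaitForElement and the timeoutSeconds parameter are hypothetical.

```vba
' Sketch: wait until an element with the given id exists, or give up.
' Returns the element, or Nothing if the timeout expires first.
Function WaitForElement(ByVal ie As Object, ByVal elementId As String, _
                        Optional ByVal timeoutSeconds As Double = 15) As Object
    Dim startTime As Date
    Dim el As Object

    startTime = Now
    Do
        DoEvents
        Set el = Nothing
        On Error Resume Next          ' the document may not be available yet
        Set el = ie.document.getElementById(elementId)
        On Error GoTo 0
        If Not el Is Nothing Then Exit Do
        ' Date differences are in days, so multiply by 86400 to get seconds
    Loop While (Now - startTime) * 86400 < timeoutSeconds

    Set WaitForElement = el
End Function
```

In the macro, the Application.Wait line and the click could then become Set btn = WaitForElement(IE, "csvLink") followed by If Not btn Is Nothing Then btn.Click, with btn declared As Object.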

Related

Microsoft will shut down IE starting June 15; will VBA code that uses InternetExplorer and HTMLDocument still work after the shutdown date?

I have a VBA program that uses the InternetExplorer and HTMLDocument objects. Microsoft has announced that it will shut down Internet Explorer starting June 15, 2022, and I am worried about whether my VBA program will still work, since I'm not sure if IE will be shut down completely.
Option Explicit
Sub automateIE()
Dim IE As InternetExplorer
Dim doc As HTMLDocument
Dim URL As String
Set IE = New InternetExplorer
Let URL = "https://www.simpleexcelvba.com/"
IE.Visible = True
IE.navigate URL
Do While IE.readyState <> READYSTATE_COMPLETE Or IE.Busy: DoEvents: Loop
Set doc = IE.Document
doc.getElementsByName("s").Item(0).Value = "Connect to SAP"
doc.getElementsByClassName("search-submit").Item(0).Click
End Sub
Yes, this was a big concern for our team when it was announced a while back. We had over 50 VBA bots that used IE and migrated them to Power Automate Desktop, Microsoft's new RPA software, which works with most of the main browsers. The other options are to download Selenium for VBA (a little outdated, but it still works) or to move to Python and use Selenium there, which is a stronger solution anyway.

Replacing IE Bits with Edge in VBA

To prepare for the eventual 'going away' of IE11, I've been trying to figure out how to replace a couple of parts of my code. One involves launching IE and using that browser to scrape some pages. Is there an equivalent way to do the below in Edge? I don't see a way to add a reference to the Edge libraries like I did with 'Microsoft Internet Controls' and IE11.
Dim ie As InternetExplorerMedium: Set ie = New InternetExplorerMedium
Dim html As HTMLDocument
With ie
.Visible = False
.Navigate website 'string that's created above this code
End With
Do While ie.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Application.Wait Now + #12:00:10 AM#
Set html = ie.Document
Thanks everyone for your help.
Ok, a few explanations. I am writing these as a reply so as not to have to split them into several comments.
Does Edge work instead of IE to do web scraping with VBA?
It does not work directly. The reason is that IE has a COM interface (Wikipedia: Component Object Model). No other browser has this interface. Not even Edge.
But for Edge there is also a web driver for Selenium. Even provided directly by MS.
Another alternative - xhr
Since you can't use Selenium because you don't have admin rights, there might be the possibility to use xhr (XML HTTP Request). However, in order to make a statement on this, we would have to know the page that you want to scrape.
Xhr can be used directly from VBA because it does not use a browser. The big limitation is that only static content can be processed. No JavaScript is executed, so nothing is reloaded or generated dynamically in any other way. On the other hand, this option is much faster than browser solutions. Often, a static file provided by the web server is sufficient. This can be an HTML file, a JSON or another data exchange format.
There are many examples of using xhr with VBA here on SO. For now, just keep it in mind as another approach. I can't explain the method exhaustively here, also because I don't know everything about it myself. But there are many ways to use it.
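As a rough illustration of the xhr approach, the sketch below requests a page and reads the returned static HTML through a DOM object, without opening any browser. The URL is a placeholder, the procedure name ListLinkTexts is hypothetical, and the sketch assumes the data you need is already present in the HTML the server sends.

```vba
' Sketch: fetch static HTML with xhr and inspect it through the DOM.
Sub ListLinkTexts()
    Const PAGE_URL As String = "https://example.com/"   ' placeholder URL
    Dim http As Object
    Dim doc As Object
    Dim link As Object

    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", PAGE_URL, False   ' False = synchronous request
    http.send

    If http.Status <> 200 Then
        Debug.Print "Request failed with HTTP status " & http.Status
        Exit Sub
    End If

    ' Load the response into an in-memory HTML document.
    ' No JavaScript from the page is executed here.
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = http.responseText

    ' Print the visible text of every link to the Immediate window
    For Each link In doc.getElementsByTagName("a")
        Debug.Print link.innerText
    Next link
End Sub
```

Because nothing is rendered and no scripts run, this is usually much faster than automating a browser, at the cost of seeing only what the server delivers in the initial response.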
By the way
IE will finally be discontinued in June 2022 and will then no longer ship with Windows. That's what I read on German IT sites a few days ago. Even now, there are already massive restrictions on the use of IE.

Excel VBA website login stalls after "redirect"

I ultimately want to use Excel VBA with InternetExplorer to scrape data from a webpage.
The scenario requires login to one website page that then redirects to another page where another data entry is required to trigger the response containing the data that I ultimately want.
My code (redacted) successfully performs the initial login and reaches the redirected page, but it stalls before making the data entry on that page. I do not receive any error notification.
I haven't been able to find an answer to the stall, but I assume it is somehow related to the website page redirection. As a point of interest, if I omit the Application.Wait, the execution tries to use the ("name").Value following it in the first phase of the login. If anyone can explain that as well, I'd be very interested.
Any help or guidance would be greatly appreciated.
Dim IE As InternetExplorerMedium
Set IE = New InternetExplorerMedium
IE.Visible = True
IE.navigate "http://something/Login.aspx"
Do Until IE.readyState = 4: DoEvents: Loop
Do While IE.Busy: DoEvents: Loop
IE.document.getElementById("name").Value = "signin"
IE.document.getElementById("password").Value = "pword"
IE.document.getElementById("btnValidate").Click
Do While IE.Busy: DoEvents: Loop
Do Until IE.readyState = 4: DoEvents: Loop
'Now at redirected page
Application.Wait (Now + TimeValue("0:00:08"))
'This is where execution stalls
IE.document.getElementById("name").Value = "identification"
IE.document.getElementById("btnValidate").Click

VBA Excel Download webpage complete

I'm trying to download a complete webpage. In other words automate this process:
1- Open the webpage
2- Click on Save as
3- Select Complete
4- Close the webpage.
This is what I've got so far:
URL = "google.com" 'for TEST
Dim IE
Set IE = CreateObject("Internetexplorer.Application")
IE.Visible = False
IE.Navigate URL
Do
Loop While IE.Busy = True
Dim i
Dim Filename
i = 0
Filename = "C:\Test.htm"
IE.Document.ExecCommand "SaveAs", False, Filename
When the last line runs, a Save File dialog appears. Is there any way to suppress this?
Any help would be most appreciated.
The Save As dialog cannot be suppressed:
The Save HTML Document dialog cannot be suppressed when calling this method from script.
It is also a modal dialog, so you cannot automate clicking the "Save" button. VBA execution pauses while waiting for manual user input when faced with a dialog of this sort.
Rather than using the IE.Document.ExecCommand method, you could try to read the page's HTML and print that to a file using standard I/O functions.
Option Explicit
Sub SaveHTML()
    Dim URL As String
    Dim IE As Object
    Dim FileName As String
    Dim FF As Integer
    URL = "http://google.com" 'for TEST
    FileName = "C:\Test.htm"
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.Navigate URL
    'Wait until the page has finished loading (readyState 4 = complete)
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
    Loop
    'Open For Output creates the file, or overwrites it if it already exists
    FF = FreeFile
    Open FileName For Output As #FF
    'Write the markup of the whole document, from the root <html> element down
    Print #FF, IE.Document.documentElement.outerHTML
    Close #FF
    IE.Quit
    Set IE = Nothing
End Sub
I am not sure whether this will give you exactly what you want, or not. There are other ways to get data from web and probably the best would be to get the raw HTML from an XMLHTTP request and print that to a file.
Of course, it is rarely the case that we actually need an entire web page in HTML format, so if you are looking to then scrape particular data from a web page, the XMLHTTP and DOM would be the best way to do this, and it's not necessary to save this to a file at all.
Or, you could use the Selenium wrapper to automate IE, which is much more robust than using the relatively few native methods to the InternetExplorer.Application class.
Note also that you are using a rather crude method of waiting for the web page to load (Loop While IE.Busy). While this may work sometimes, it may not be reliable. There are dozens of questions about how to do this properly here on SO, so I would refer you to the search feature here to tweak that code a little bit.
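One common pattern for a more reliable wait, shown here only as a sketch, is to check both Busy and readyState and to give up after a timeout so the macro cannot hang forever. The function name WaitForPageLoad and the timeoutSeconds parameter are hypothetical, and pages that load content with JavaScript after the ready state is complete still need an additional check.

```vba
' Sketch: wait for IE to finish loading, with a timeout.
' Returns True if the page finished loading, False on timeout.
Function WaitForPageLoad(ByVal ie As Object, _
                         Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Const STATE_COMPLETE As Long = 4   ' readyState value for "complete"
    Dim startTime As Date

    startTime = Now
    Do While ie.Busy Or ie.readyState <> STATE_COMPLETE
        ' Date differences are in days, so multiply by 86400 to get seconds
        If (Now - startTime) * 86400 >= timeoutSeconds Then Exit Function
        DoEvents                       ' keep Excel responsive while waiting
    Loop

    WaitForPageLoad = True
End Function
```

A caller could then write If Not WaitForPageLoad(IE) Then followed by its own error handling, instead of looping on IE.Busy directly.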

Excel vba to use an existing open website, instead of opening new browser to access the website, [duplicate]

Possible Duplicate:
Using an IE browser with Visual Basic
I have a website that is updated daily, and I need to retrieve the information from it every day. Instead of opening a new browser window (e.g. a new Internet Explorer) every day, is it possible to use an already open Internet Explorer window to retrieve the information?
I don't have IE installed so I can't promise this will work, but give it a shot. Note that you'll need to set references to Microsoft Internet Controls and Microsoft HTML Object Library.
Function GetOpenIE() As SHDocVw.InternetExplorer
Dim ie As SHDocVw.InternetExplorer
Dim sw As SHDocVw.shellWindows
Set sw = New SHDocVw.shellWindows
For Each ie In sw
If TypeOf ie.Document Is HTMLDocument Then
Set GetOpenIE = ie
Exit Function
End If
Next ie
End Function

Resources