Login to web page and scraping data using VBA - excel

I'm writing a macro that loads a web page, logs in to it, and gets the data from a table.
Thanks to several topics here, I have got this far:
Sub sail2()
Dim ie As Object
Set ie = New(InternetExplorer.Application")
ie.Visible = True
ie.Navigate "http://pfrwtekref/websail/"
With ie
Dim oLogin As Object, oPassword As Object
Set oLogin = .document.getElementById("txtPwd")(0)
Set oPassword = .document.getElementById("txtPwd")(0)
oLogin.Value = test
oPassword.Value = test
.document.forms(0).submit
End With
End Sub
But I get two types of errors: "automation error" when I launch the macro, and "error 462 - server not available".
Can anyone help me?
Thanks for reading and have a good day !

The first issue seems to be a mismatched quotation mark on the third line; however, since you are reporting an automation error rather than a compile error, I think that may just be a typo on this site.
I can't see your internal website so I'm just guessing but I suspect the next issue is that there is no element with ID "txtPwd". You can check to confirm by pressing Ctrl-Shift-C and then selecting the username and password entry boxes.
Finally, depending on how the site is set up, your .document.forms(0).submit may not work. You may need to find the ID or class of the submit button and click it instead. Below is a function I created a while back for such a task:
Function logIn(userName As String, password As String) As Boolean
'This routine logs into the grade book using given credentials
Dim ie As New InternetExplorer
Dim doc As HTMLDocument
On Error GoTo loginFail
ie.Navigate "[website here]"
'ie.Visible = True
Do While ie.ReadyState <> READYSTATE_COMPLETE Or ie.Busy: DoEvents: Loop 'Wait server to respond
Set doc = ie.Document
doc.getElementsByName("u_name").Item(0).Value = userName 'These may be different for you
doc.getElementsByName("u_pass").Item(0).Value = password
doc.getElementsByClassName("btn").Item(0).Click
Do While ie.ReadyState <> READYSTATE_COMPLETE Or ie.Busy: DoEvents: Loop 'Wait server to respond
Set doc = ie.Document
'Add a check to confirm you aren't on the same page
ie.Quit
Set ie = Nothing
LogIn = True
Exit Function
loginFail:
MsgBox "There was an issue logging in. Please try again."
logIn = False 'Return value must be assigned to the function's own name
End Function
Note that the site I was dealing with was set up poorly and so I needed to switch to GetElementsByName and GetElementsByClassName to get access to what I needed. You may be fine with IDs.
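The "check to confirm you aren't on the same page" comment in the function above could be filled in along these lines. This is only a sketch: LeftLoginPage is a hypothetical helper name, and it assumes that a successful login changes the browser's address, which not every site does.

```vba
Option Explicit

' Hypothetical helper, not part of the original answer.
' Returns True when IE is no longer showing the login page URL.
' Assumption: a successful login navigates to a different address.
Function LeftLoginPage(ByVal ie As Object, ByVal loginUrl As String) As Boolean
    ' LocationURL holds the address of the page IE currently displays
    LeftLoginPage = (StrComp(ie.LocationURL, loginUrl, vbTextCompare) <> 0)
End Function
```

It would be called after the second wait loop, for example `If Not LeftLoginPage(ie, "[website here]") Then GoTo loginFail`. If the site redisplays the form under the same address after a successful login, a different test is needed, such as checking whether the password field still exists.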

Why don't you try Excel's Power Query to get the data from the tables you need? When your desired columns are ready, click Close & Load and select the data model. You can then use a macro to build a PivotTable whenever you load the document, or just use a PivotTable, whose data is refreshed when you use Refresh in the Data ribbon.

Related

Excel VBA - Web Scraping - Get value in HTML Table cell

I am trying to create a macro that scrapes a cargo tracking website.
But I have to create 4 such macros as each airline has a different website.
I am new to VBA and web scraping.
I have put together code that works for one website. But when I tried to replicate it for another one, I got stuck in the loop. I think it may be how I am referring to the element, but like I said, I am new to VBA and have no clue about HTML.
I am trying to get the "notified" value in the highlighted line from the image.
[Image: "notified" text to be extracted]
Below is the code I have written so far that gets stuck in the loop.
Any help with this would be appreciated.
Sub FlightStat_AF()
Dim url As String
Dim ie As Object
Dim nodeTable As Object
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-92366691"
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We can do that in a loop to match the point the information is available
Do
On Error Resume Next
Set nodeTable = ie.document.getElementByClassName("block-whisper")
On Error GoTo 0
Loop Until Not nodeTable Is Nothing
'Get the status from the table
MsgBox Trim(nodeTable.getElementsByClassName("fs-12 body-font-bold").innerText)
'Clean up
ie.Quit
Set ie = Nothing
Set nodeTable = Nothing
End Sub
Some basics:
For simple accesses, like the present ones, you can use the get methods of the DOM (Document Object Model). But there is an important difference between getElementByID() and getElementsByClassName() / getElementsByTagName().
getElementByID() searches for the unique ID of an HTML tag. This is written as the ID attribute of the HTML tag. If the page keeps to the HTML standard, there is only one element with this unique ID. That's the reason why the method begins with getElement.
If the ID is not found, the call yields no element, and using the result raises a run-time error in VBA. That is why, in my other answer, the call is wrapped in a loop that switches error handling off and on again. But in the page from this question there is no ID for the HTML area in question.
Instead, the required element can be accessed directly. You tried the access with getElementsByClassName(). That's right. But here comes the difference from getElementByID().
getElementsByClassName() and getElementsByTagName() begin with getElements. That's plural because there can be as many elements with the same class or tag name as you want. Both methods create an HTML node collection. All HTML elements with the requested class or tag name are listed in that collection.
All elements have an index, just like an array. The indexes start at 0. To access a particular element, you must specify its index. The two class names fs-12 body-font-bold (class names are separated by spaces; you can also build a node collection using only one class name) deliver two HTML elements to the node collection. You want the second one, so you must use index 1.
This is the VBA code for the asked page by using the IE:
Sub FlightStat_AF()
Dim url As String
Dim ie As Object
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/057-92366691"
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fixed pause of a few seconds
'because the whole page is reloaded
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
MsgBox Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
'Clean up
ie.Quit
Set ie = Nothing
End Sub
Edit: Sub as function
This sub tests the function:
Sub testFunction()
Dim flightStatAfResult As String
flightStatAfResult = FlightStat_AF("057-92366691")
MsgBox flightStatAfResult
End Sub
This is the sub rewritten as a function:
Function FlightStat_AF(cargoNo As String) As String
Dim url As String
Dim ie As Object
Dim result As String
'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
url = "https://www.afklcargo.com/mycargo/shipment/detail/" & cargoNo
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.navigate url
Do Until ie.readyState = 4: DoEvents: Loop
'Wait to load dynamic content after IE reports it's ready
'We do that with a fixed pause of a few seconds
'because the whole page is reloaded
'The last three values are hours, minutes, seconds
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get the status from the table
result = Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
'Clean up
ie.Quit
Set ie = Nothing
'Return value of the function
FlightStat_AF = result
End Function
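To find out which index of a node collection holds the wanted element, it can help to list the whole collection first. The following is a minimal sketch, not part of the original answer: ListElementsByClass is a hypothetical helper name, and it assumes the document passed in has already finished loading.

```vba
Option Explicit

' Hypothetical helper: prints every element of a class-name collection
' together with its index, so the right index can be picked by eye.
Sub ListElementsByClass(ByVal doc As Object, ByVal classNames As String)
    Dim nodes As Object
    Dim i As Long

    ' Several class names may be given, separated by spaces
    Set nodes = doc.getElementsByClassName(classNames)

    ' Indexes run from 0 to Length - 1
    For i = 0 To nodes.Length - 1
        Debug.Print i, Trim$(nodes(i).innerText)
    Next i
End Sub
```

Calling `ListElementsByClass ie.document, "fs-12 body-font-bold"` after the wait writes one line per element to the Immediate window.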

Filling web form fields but web page unable to detect text

I'm filling a web form using VBA, and I am able to fill text in the inputbox, but the webpage still is unable to detect the text and shows an error:
"Error: Required Field - Please provide an answer"
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = True
URL = "https://npc.collegeboard.org/app/dartmouth/start"
objIE.Navigate URL
objIE.Document.getElementById("student.firstName").Focus
objIE.Document.getElementById("student.firstName").Value = "Tom"
It looks like there's some AngularJS running in the background, and it can't detect the text entered by my VBA code. Any help would be highly appreciated.
First of all, after objIE.Navigate URL you should wait until the website is fully loaded and IE is ready. This is done with the following:
objIE.Navigate URL 'this needs some time, but VBA continues executing the next statement immediately
Const READYSTATE_COMPLETE As Integer = 4
Do While objIE.Busy Or objIE.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
'now IE is ready and the page is loaded.
But it might be that some JavaScript is not ready yet and this is not recognized by objIE.Busy Or objIE.ReadyState. So you can do a workaround:
Dim Obj As Object
Do While Obj Is Nothing
On Error Resume Next
Set Obj = objIE.Document.getElementById("student.firstName")
On Error GoTo 0
Loop
'now `student.firstName` is accessible, and probably all the other fields are too.
This tries to access the field student.firstName; if it is not there yet, the access raises an error. We suppress the error with On Error Resume Next and repeat the loop until the field is found.
Note that this has one disadvantage: if there is a problem loading the site, the code gets stuck in this loop. So I recommend adding a timed cancel criterion, for example: if this takes more than a minute, cancel and raise an error message.
Something like the following should work:
Option Explicit
Sub test()
Dim objIE As Object
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = True
Dim URL As String
URL = "https://npc.collegeboard.org/app/dartmouth/start"
objIE.Navigate URL
Const READYSTATE_COMPLETE As Integer = 4
Do While objIE.Busy Or objIE.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Dim Obj As Object
Do While Obj Is Nothing
On Error Resume Next
Set Obj = objIE.Document.getElementById("student.firstName")
On Error GoTo 0
Loop
objIE.Document.getElementById("student.firstName").Focus
objIE.Document.getElementById("student.firstName").Value = "Tom"
End Sub
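The timed cancel recommended above could look roughly like this. It is a sketch under assumptions: WaitForElementById is a hypothetical helper name, the 60-second default simply mirrors the "more than a minute" suggestion, and the helper returns Nothing on timeout rather than raising an error.

```vba
Option Explicit

' Hypothetical helper: polls for an element by ID and gives up after
' timeoutSeconds. Returns the element, or Nothing if the time ran out.
Function WaitForElementById(ByVal objIE As Object, ByVal elementId As String, _
                            Optional ByVal timeoutSeconds As Double = 60) As Object
    Dim deadline As Date
    Dim found As Object

    ' A Date is measured in days, so convert seconds to a fraction of a day
    deadline = Now + timeoutSeconds / 86400#

    Do
        ' Suppress errors raised while the document is not available yet
        On Error Resume Next
        Set found = objIE.Document.getElementById(elementId)
        On Error GoTo 0

        If Not found Is Nothing Then Exit Do   ' element is accessible
        If Now > deadline Then Exit Do         ' timed cancel
        DoEvents
    Loop

    Set WaitForElementById = found
End Function
```

In the test sub, the second loop would then become `Set Obj = WaitForElementById(objIE, "student.firstName")`, followed by a check such as `If Obj Is Nothing Then MsgBox "The page did not load in time.": Exit Sub`.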

Engage with widget on IE by VBA

I haven't really worked on this before, so I have no idea what I am doing at the moment. I have limited knowledge of HTML, so I am not sure whether I am doing it right. Basically, what I aim to do is open Internet Explorer from a macro, change some elements based on their id, and click the submit button on the website to show the data. Then I need to keep working on the next step.
As you can see from the code, I was trying to interact with the widgets in IE by their id from the HTML code.
Sub Automate_IE_Enter_Data()
'This will load a webpage in IE
Dim i As Long
Dim URL As String
Dim IE As Object
Dim objbutton As Object
'Create InternetExplorer Object
Set IE = CreateObject("InternetExplorer.Application")
'Set IE.Visible = True to make IE visible, or False for IE to run in the background
IE.Visible = True
'Define URL
URL = "http://cfpsg1/plant/Reports/ScrapReport.aspx"
'Navigate to URL
IE.Navigate URL
' Statusbar let's user know website is loading
Application.StatusBar = URL & " is loading. Please wait..."
' Wait while IE loading...
'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertantly skipping over the second loop)
Do While IE.ReadyState = 4: DoEvents: Loop
'Webpage Loaded
Application.StatusBar = URL & " Loaded"
IE.Document.getelementbyid("1stGroupBy").Value = "3"
'Find & Fill Out Input Box
IE.Document.getelementbyid("PageContent_uxStartDate").Value = "06/21/2019"
IE.Document.getelementbyid("PageContent_uxEndDate").Value = "06/21/2019"
Set objbutton = IE.Document.getelementbyid("PageContent_btnQuery")
objbutton.Focus
objbutton.Click
Set IE = Nothing
Set objElement = Nothing
Set objCollection = Nothing
End Sub
First things first: the web page opened, but none of the widgets changed, and the error message "method 'Document' of object 'IWebBrowser2' failed" was shown on the line IE.Document.getelementbyid("1stGroupBy").Value = "3".
You are trying to interact with a dropdown so you want syntax such as
IE.Document.querySelector("[value='3']").Selected = True
You could also use
IE.Document.querySelector("[id='1stGroupBy']").SelectedIndex = 2 'an ID starting with a digit is not a valid #id selector; change to appropriate index
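If the index of the wanted entry is not known in advance, the dropdown's options can be searched by value. This is a minimal sketch, assuming that 1stGroupBy is a standard select element; SelectOptionByValue is a hypothetical helper name, and it has not been tested against the page in question.

```vba
Option Explicit

' Hypothetical helper: selects the option whose value matches wantedValue.
' Returns True if a matching option was found.
Function SelectOptionByValue(ByVal selectElement As Object, _
                             ByVal wantedValue As String) As Boolean
    Dim i As Long

    ' Options are indexed from 0 to Length - 1
    For i = 0 To selectElement.Options.Length - 1
        If selectElement.Options.Item(i).Value = wantedValue Then
            selectElement.selectedIndex = i
            SelectOptionByValue = True
            Exit Function
        End If
    Next i
End Function
```

It could be called as `SelectOptionByValue IE.Document.getElementById("1stGroupBy"), "3"` once the page has finished loading.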
The error "method 'Document' of object 'IWebBrowser2' failed" may be due to the integrity level.
You can try the code below, which runs IE at medium integrity level:
Dim IE As InternetExplorer
Set IE = New InternetExplorerMedium
IE.Visible = True
URL = "http://cfpsg1/plant/Reports/ScrapReport.aspx"
IE.Navigate URL
Do While IE.ReadyState <> READYSTATE_COMPLETE
DoEvents
Loop
Then run the remaining lines of your code to see whether the issue is gone.
Please add references to Microsoft Internet Controls and Microsoft HTML Object Library as required.

Run-Time error '91': Object variable or With block variable not set - works once then won't work again

I get the error mentioned in the title every time I try to run my VBA code. Error is appearing on this line:
Set Button_Top_Result = IE.document.getElementById _
("javascript:gx.evt.execEvt('EUSRACNAM.CLICK.0001',this);").href
I also did some research online, found this alternative code, and tried it as well.
For Each ele In IE.document.getElementsByTagName("a")
If InStr(ele.href, "javascript:gx.evt.execEvt('EUSRACNAM.CLICK.0001',this);") > 0 Then IE.navigate ele.href: Exit For
Next
This line worked once, and now no longer works.
I'm a VBA novice, so this might have a simple solution, but I am stumped. I am navigating through the company website and wrote a line that clicks a hyperlink to take me to the next page. It worked the first time; now it fails with the above error. I have already done some digging online and can't figure out where my error is.
Here is the html part of the web page
https://i.stack.imgur.com/kgJOo.png
Option Explicit
Sub GetMasterDetailKeyedDataTest2()
'The goal of this macro is to quickly and conveniently update BNA Depreciation by automating the steps necessary to complete the process.
'Dimensions identfy things I will define later to use.
Dim name As String
Dim IE As InternetExplorerMedium
Dim store As String
Dim client As String
Dim URL As String
Dim username
Dim Password
Dim Button_Next
Dim Button_Login
Dim Button_OnDemandReporting
Dim Search_Bar
Dim Button_Mag_Glass
Dim Button_Top_Result As Action
'Line below bypasses login if user is already logged in
'On Error Resume Next
store = Workbooks("Learning").Sheets("Sheet1").Range("A2")
client = Workbooks("Learning").Sheets("Sheet1").Range("B2")
'Abbreviates Internet Explorer. Note I need the correct references enabled in tools in order to run a web query using this name.
Set IE = New InternetExplorerMedium
'Define URL
URL = "Company website"
'make IE browser visible (False would allow IE to run in the background)
'Once program is working I will want to turn this off so that the user doesn't see the webbrowser.
IE.Visible = True
'Navigate to Login page
IE.navigate URL
'This loop prevents Excel from continuing the code
Do While IE.Busy Or IE.readyState <> 4
DoEvents
Loop
'These next four steps navigate through the login
Set username = IE.document.getElementById("username") 'id of the username control (HTML Control)
username.Value = "username"
Set Button_Next = IE.document.getElementById("next") 'id of the button control (HTML Control)
Button_Next.Click
Set Password = IE.document.getElementById("password") 'id of the password control (HTML Control)
Password.Value = "password"
Set Button_Login = IE.document.getElementById("submit") 'id of the button control (HTML Control)
Button_Login.Click
Do While IE.Busy Or IE.readyState <> 4
DoEvents
Loop
'Connects to OnDemand Reporting
Set Button_OnDemandReporting = IE.document.getElementById("IMAGE1_0004")
Button_OnDemandReporting.Click
Do While IE.Busy Or IE.readyState <> 4
DoEvents
Loop
Set Search_Bar = IE.document.getElementById("vNAME")
Search_Bar.Value = Workbooks("Learning").Sheets("Sheet1").Range("B2")
Set Button_Mag_Glass = IE.document.getElementById("IMAGE1")
Button_Mag_Glass.Click
Set Button_Top_Result = IE.document.getElementById("javascript:gx.evt.execEvt('EUSRACNAM.CLICK.0001',this);").href
Button_Top_Result.Click
End Sub
I expect the last step to navigate me to the next part of the web page, but instead I keep getting the above error.

Using Excel VBA to automate form filling in Internet Explorer

I want to take values from an excel sheet and store them in an array. I then want to take the values from the array and use them to fill the web form.
I have managed to store the values in the array and I have managed to get VBA to open Internet Explorer (IE)
The code runs and no errors appear, but the text fields are not being populated, nor is the button being clicked
(The debugger points to [While .Busy] as the error source, located in the WITH block)
How do I go about filling the form (that has a total of 3 text boxes to fill)?
There is also a drop down menu that I need to choose a value from, but I need to fill the text boxes prior to moving on to that part of the task.
Sub CONNECT_TO_IE()
the_start:
Dim ie As Object
Dim objElement As Object
Dim objCollection As Object
acct = GET_CLIENT_NAME()
name = GET_CODE()
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate ("<<my_website>>")
ie.FullScreen = False
On Error Resume Next
Do
DoEvents
If Err.Number <> 0 Then
ie.Quit
Set ie = Nothing
GoTo the_start:
End If
Loop Until ie.readystate = 4
Application.Wait Now + TimeValue("00:00:10")
ie.Document.getElementbyid("<<field_1>>").Value = "PPP"
ie.Document.getElementbyid("<<field_2>>").Value = "PPP"
ie.Document.getElementbyid("<<field_3>>").Click
Set ie = Nothing
End Sub
UPDATE: It turns out this wasn't working because some settings in the HTML of the site do not allow the automation to occur, so every version of my code was correct but doomed to fail. So you were correct in that regard, @TimWilliams.
I know this because the website I was trying to access is on a secure server/virtual machine. I edited the code to fill in the Google search bar instead, and it did not work on the virtual machine; however, when I ran the same code locally, it worked fine.

Resources