Good afternoon,
First off, I know using IE for anything isn't great since it's no longer supported.
I have figured out how to scrape a table, but I need the table data to be written starting at cell A5. I tried adding .Range("A5") to parts of the code but haven't been able to get it to work. Please see the code below:
Private Sub CommandButton3_Click()
'Clear the range before scraping
ActiveSheet.Range("A5:k5000").ClearContents
'Navigating to webpage
Dim ie As Object
Dim url As String
url = "https://www.myfueltanksolutions.com/validate.asp"
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate url
Do While ie.Busy: DoEvents: Loop
Do Until ie.readyState = 4: DoEvents: Loop
'Login credentials and submit
Dim idoc As MSHTML.HTMLDocument
Set idoc = ie.document
idoc.all.CompanyID.Value = "CompanyID"
idoc.all.UserId.Value = "UserID"
idoc.all.Password.Value = "Password"
idoc.parentWindow.execScript "submitForm();"
Do While ie.Busy: DoEvents: Loop
Do Until ie.readyState = 4: DoEvents: Loop
'Scraping table
Dim tbl As HTMLTable
Set tbl = ie.document.getElementById("RecentInventorylistform")
Dim rowcounter As Integer
Dim colcounter As Integer
rowcounter = 1
colcounter = 1
Dim tr As HTMLTableRow
Dim td As HTMLTableCell
Dim th
Dim mySh As Worksheet
Set mySh = ThisWorkbook.Sheets("Sheet1")
For Each tr In tbl.getElementsByTagname("tr")
'Loop thru table cells
For Each td In tr.getElementsByTagname("td")
mySh.Cells(rowcounter, colcounter).Value = td.innerText
colcounter = colcounter + 1
Next td
colcounter = 1
rowcounter = rowcounter + 1
Next tr
'Log out and close website
ie.navigate ("https://www.myfueltanksolutions.com/signout.asp?action=rememberlogin")
ie.Quit
'Last updated and message box at completion
Range("N1") = Now()
MsgBox "Data Imported Successfully. Press Ok to Continue."
End Sub
Thank you so much for the help!
If you want the output to start at cell A5, change the starting indices used in mySh.Cells(rowcounter, colcounter).Value. Cell A5 is Cells(5, 1), so initialise rowcounter to 5 and leave colcounter at 1:
Dim rowcounter As Integer
Dim colcounter As Integer
rowcounter = 5
colcounter = 1
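As an alternative to tracking the row and column counters by hand, here is a minimal sketch that anchors the output on a start cell and lets Resize work out the target range. WriteTableAt and DemoWriteTableAt are hypothetical names, and the sample array stands in for values you would collect from the td loop; treat it as an illustration rather than a drop-in replacement.

```vba
Option Explicit

' Hypothetical helper: writes a 2-D Variant array so that its
' first element lands on startCell (for example A5).
Public Sub WriteTableAt(ByVal startCell As Range, ByRef data As Variant)
    Dim rowCount As Long, colCount As Long
    rowCount = UBound(data, 1) - LBound(data, 1) + 1
    colCount = UBound(data, 2) - LBound(data, 2) + 1
    ' Resize the single start cell to the array's dimensions and write in one go
    startCell.Resize(rowCount, colCount).Value = data
End Sub

' Demo with made-up values; "Sheet1" is assumed to exist
Public Sub DemoWriteTableAt()
    Dim sample(1 To 2, 1 To 3) As Variant
    sample(1, 1) = "Tank": sample(1, 2) = "Product": sample(1, 3) = "Volume"
    sample(2, 1) = "T1": sample(2, 2) = "Diesel": sample(2, 3) = 1200
    WriteTableAt ThisWorkbook.Worksheets("Sheet1").Range("A5"), sample
End Sub
```

With this layout, moving the output somewhere else only means changing the Range("A5") argument.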
I am trying to get a table from a website into my Excel sheet. Since the website has a login and I need to click a few buttons to get to the table, I am using VBA.
The code I have so far is just a test; it does not use the actual website that I am trying to log into.
So far, the code is able to launch the website and get the inner text from the table, but it pastes everything into a single cell. How can I paste the table while keeping the same formatting?
Sub test()
Set IE = CreateObject("InternetExplorer.application")
IE.Visible = True
IE.navigate ("https://www.w3schools.com/html/html_tables.asp")
Do
If IE.readyState = 4 Then
IE.Visible = True
Exit Do
Else
DoEvents
End If
Loop
'get data
Dim tbl As HTMLTable
Set tbl = IE.document.getElementById("customers")
Cells(1, 1) = tbl.innerText
End Sub
You can scrape the table with the following enhanced version of your code, which writes each table cell to its own worksheet cell:
Sub test()
Dim IE As Object
Set IE = CreateObject("InternetExplorer.application")
IE.Visible = True
IE.navigate ("https://www.w3schools.com/html/html_tables.asp")
Do
If IE.readyState = 4 Then
IE.Visible = True
Exit Do
Else
DoEvents
End If
Loop
'get data
Dim tbl As HTMLTable
Dim class1 As IHTMLElement, rowText As IHTMLElement, item As IHTMLElement
Dim rowNum As Long, colNum As Long
Set class1 = IE.document.getElementById("customers").children(0)
rowNum = 0
For Each rowText In class1.children
rowNum = rowNum + 1
colNum = 0
For Each item In rowText.children
colNum = colNum + 1
Sheet1.Cells(rowNum, colNum).Value = item.innerText
Next
Next
End Sub
I am trying to get data from a website into my Excel sheet, but I cannot navigate to the body of the table no matter what I do. Please see the website and the code below and tell me how I can get the latest values of 1Y, 2Y, ..., 10Y into my Excel sheet. This is the code:
Option Explicit
Sub updatePKRV()
Dim ieobj As InternetExplorer
Dim iedoc As HTMLDocument
Dim htmlele As IHTMLElement
Dim HTMLRow As IHTMLElementCollection
Dim HTMLIT As IHTMLElement
Dim ws As Worksheet
Set ieobj = New InternetExplorer
ieobj.Visible = False
ieobj.navigate "https://fma.com.pk/index.php/pkrv/"
Do While ieobj.Busy = True Or ieobj.readyState <> READYSTATE_COMPLETE
Application.Wait Now + TimeValue("00:00:01")
Loop
Set iedoc = ieobj.document
Set htmlele = iedoc.getElementById("table_2")
'Set HTMLRow = htmlele.getElementsByTagName("tr")
Debug.Print htmlele.Children(0).textContent
End Sub
AFTER CHANGES
Option Explicit
Sub updatePKRV()
Dim ieobj As InternetExplorer
Dim iedoc As HTMLDocument
Dim htmlele As IHTMLElement
Dim HTMLRow As IHTMLElementCollection
Dim HTMLIT As IHTMLElement
Dim nodeList As Object, i As Long, arr()
Dim ws As Worksheet
Set ieobj = New InternetExplorer
ieobj.Visible = False
ieobj.navigate "https://fma.com.pk/index.php/pkrv/"
Do While ieobj.Busy = True Or ieobj.readyState <> READYSTATE_COMPLETE
Application.Wait Now + TimeValue("00:00:01")
Loop
Set iedoc = ieobj.document
Set htmlele = iedoc.getElementById("table_2")
Set nodeList = ieobj.document.querySelectorAll("#table_2 tr:nth-of-type(2) .column-date, #table_2 tr:nth-of-type(2) [class*=y]")
ReDim arr(1 To 11)
For i = 0 To 10
arr(i + 1) = nodeList.Item(i).innerText ''This is where it gets an error
Next
ActiveSheet.Cells(2, 1).Resize(1, UBound(arr, 2)) = arr
End Sub
You want the first "header" date column and then the first 10 year columns within the second row. You can use a CSS selector for that:
#table_2 tr:nth-of-type(2) .column-date, #table_2 tr:nth-of-type(2) [class*=y]
This will retrieve a node list for the second row
tr:nth-of-type(2)
within the table with id table_2
#table_2
matching on the child with class column-date
.column-date
OR (,)
class that contains (*) the letter y (for year)
[class*=y]
Note:
I am matching on a single class of the multi-valued classes present.
The page is slow loading so you may need a timed loop to wait for elements to be fully loaded.
With that nodeList, loop from 0 to 10 to get the date field followed by the first 10 year columns.
Dim nodeList As Object, i As Long, arr()
Set nodeList = ie.document.querySelectorAll("#table_2 tr:nth-of-type(2) .column-date, #table_2 tr:nth-of-type(2) [class*=y]")
ReDim arr(1 To 11)
For i = 0 To 10
arr(i+1) = nodeList.item(i).innerText
Next
ActiveSheet.Cells(2,1).Resize(1, UBound(arr, 1)) = arr
Read about css selectors here: https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors
2021-03-05 update
#table_2 tbody tr:nth-of-type(1) .column-date, #table_2 tbody tr:nth-of-type(1) [class*=y]
To take first row within table body (i.e. exclude header row and obtain latest date)
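Since the page is slow to load, the timed loop mentioned in the note above could look something like the sketch below. WaitForNodes is a hypothetical helper name, and the 10-second default is an arbitrary assumption; it also assumes the document is rendered in a mode where querySelectorAll is available.

```vba
Option Explicit

' Hypothetical helper: polls querySelectorAll until at least minCount
' nodes match or maxWaitSec seconds have passed, then returns the node list.
Public Function WaitForNodes(ByVal doc As Object, ByVal selector As String, _
                             ByVal minCount As Long, _
                             Optional ByVal maxWaitSec As Long = 10) As Object
    Dim startTime As Single
    Dim nodes As Object

    startTime = Timer   ' seconds since midnight; this sketch ignores midnight rollover
    Do
        DoEvents
        Set nodes = doc.querySelectorAll(selector)
        If nodes.Length >= minCount Then Exit Do
        If Timer - startTime > maxWaitSec Then Exit Do
    Loop
    Set WaitForNodes = nodes
End Function
```

You would call it as Set nodeList = WaitForNodes(ieobj.document, "<selector from above>", 11) and check nodeList.Length before reading items 0 to 10, so a page that never finishes loading does not raise an error inside the loop.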
I just discovered xml or xmlhttp and this is entirely new to me.
I am trying to create a macro that goes through the list of websites in column J, starting at row 2 (the header is in row 1), gets the information I want from each website, and displays it in column K, right next to the website it was taken from.
Column J has a list of websites, starting at J2. Let's say it goes all the way down to J10. There is a certain piece of information I want from each website, so the macro will visit the website in J2, get that information and paste it in K2, then visit the website in J3, paste that information in K3, and so on. I already have an existing list of websites in column J, which also happens to be dynamic.
This is the current code that I have using IE that I want to convert into xml/xmlhttp.
Sub CommandButton1_Click()
Dim ie As Object
Dim lastrow As Integer
Dim i As Integer
Dim myURL As String
Dim sdd As String
Dim add As Variant
Dim html As Object
Dim mylinks As Object
Dim mylink As Object
Dim result As String
' Create InternetExplorer Object
Set ie = CreateObject("InternetExplorer.Application")
lastrow = Sheet1.Cells(Rows.Count, "J").End(xlUp).Row
For i = 2 To lastrow
myURL = Sheet1.Cells(i, "J").Value
' Hide InternetExplorer
ie.Visible = False
' URL to get data from
ie.navigate myURL
' Loop until page fully loads
Do While ie.readystate <> READYSTATE_COMPLETE
Loop
' Information i want to get from the URLs
sdd = ie.document.getelementsbyclassname("timeline-text")(0).innerText
' Format the result
add = Split(sdd, "$")
Range("K3") = add(1)
' Close InternetExplorer
ie.Quit
'Return to Normal?
ie.Visible = True
End
Next
' Clean up
Set ie = Nothing
Application.StatusBar = ""
End Sub
I am trying to get the "85100", not the $85,100
<span class="font-size-base font-normal">Est.</span>
<span itemprop="price" content="85100">
$85,100
</span>
I'm hoping you could help me with this problem.
Thank you in advance.
I would structure it as follows, with the IE object created outside the loop and CSS selectors used throughout. You may need a timed loop to ensure the element is present on the page. Use a proper page load wait as shown.
Put the worksheet in a variable, using an explicit worksheet name, and work with that variable.
You might want to add a test that myURL contains http/https, as you may have blank cells in the range and only want to work with values that are likely to be URLs.
Option Explicit
Public Sub CommandButton1_Click()
Dim ie As Object, lastrow As Long, i As Long
Dim myURL As String, sdd As String, ws As Worksheet
Set ws = ThisWorkbook.Worksheets("Sheet1") ' <change as required
Set ie = CreateObject("InternetExplorer.Application")
lastrow = ws.Cells(Rows.Count, "J").End(xlUp).Row
With ie
.Visible = False
For i = 2 To lastrow
myURL = ws.Cells(i, "J").Value
.navigate2 myURL
While .Busy Or .readyState < 4: DoEvents: Wend
sdd = .document.querySelector(".price").getAttribute("content")
ws.Cells(i, "K") = sdd
Next
.Quit
End With
'Application.StatusBar = ""
End Sub
With a timed loop:
Public Sub CommandButton1_Click()
Dim ie As Object, lastrow As Long, i As Long, t As Date, ele As Object
Const MAX_WAIT_SEC As Long = 10
Dim myURL As String, sdd As String, ws As Worksheet
Set ws = ThisWorkbook.Worksheets("Sheet1") ' <change as required
Set ie = CreateObject("InternetExplorer.Application")
lastrow = ws.Cells(rows.Count, "J").End(xlUp).Row
With ie
.Visible = False
For i = 2 To lastrow
myURL = ws.Cells(i, "J").Value
.Navigate2 myURL
While .Busy Or .readyState < 4: DoEvents: Wend
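t = Timer
Set ele = Nothing ' reset so a match from the previous URL is not reused
Do
DoEvents
On Error Resume Next
' HTMLDoc was never declared; query the document of the IE object instead.
' If the span has no "price" class, "[itemprop='price']" matches the markup shown in the question.
Set ele = .document.querySelector(".price")
On Error GoTo 0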
If Timer - t > MAX_WAIT_SEC Then Exit Do
Loop While ele Is Nothing
If Not ele Is Nothing Then
sdd = ele.getAttribute("content")
ws.Cells(i, "K") = sdd
End If
Next
.Quit
End With
'Application.StatusBar = vbnullstring
End Sub
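The suggestion above to test that myURL looks like a URL before navigating could be sketched as follows. LooksLikeUrl is a hypothetical helper name, and it only checks the prefix, so treat it as a rough filter rather than real URL validation.

```vba
Option Explicit

' Hypothetical helper: True when the text starts with http:// or https://
' (case-insensitive, ignoring leading and trailing spaces).
Private Function LooksLikeUrl(ByVal candidate As String) As Boolean
    Dim s As String
    s = LCase$(Trim$(candidate))
    LooksLikeUrl = (Left$(s, 7) = "http://") Or (Left$(s, 8) = "https://")
End Function
```

Inside the For loop you would wrap the navigate and scrape steps in If LooksLikeUrl(myURL) Then ... End If, so blank or non-URL cells in column J are simply skipped.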
I have a VBA code that selects info from drop-down menus on a government website and then submits the query. The requested data then opens up in another IE page. I am trying to copy this data into excel; however, I am unable to do so.
My code currently copies the text on the first IE page that contains the drop-down menus. The government website is: http://www.osfi-bsif.gc.ca/Eng/wt-ow/Pages/FINDAT.aspx
I have looked all over the internet for a solution, but nothing seems to work.
Here is my code:
Sub GetOsfiFinancialData()
Dim UrlAddress As String
UrlAddress = "http://ws1.osfi-bsif.gc.ca/WebApps/FINDAT/DTIBanks.aspx?T=0&LANG=E"
Dim ie As Object
Set ie = CreateObject("internetexplorer.application")
With ie
.Silent = True
.Visible = False
.navigate UrlAddress
End With
Do Until Not ie.Busy And ie.readyState = 4
DoEvents
Loop
Application.Wait (Now() + TimeValue("00:00:05"))
'Select Bank
ie.document.getElementById("DTIWebPartManager_gwpDTIBankControl1_DTIBankControl1_institutionTypeCriteria_institutionsDropDownList").Value = Z005
'open window with financial data
Dim objButton
Set objButton = ie.document.getElementById("DTIWebPartManager_gwpDTIBankControl1_DTIBankControl1_submitButton")
objButton.Focus
objButton.Click
'select new pop-up window
marker = 0
Set objshell = CreateObject("Shell.Application")
IE_count = objshell.Windows.Count
For x = 0 To (IE_count - 1)
On Error Resume Next ' sometimes more web pages are counted than are open
my_title = objshell.Windows(x).document.Title
If my_title Like "Consolidated Monthly Balance Sheet" & "*" Then 'compare to find if the desired web page is already open
Set ie = objshell.Windows(x)
marker = 1
Exit For
Else
End If
Next
Do Until Not ie.Busy And ie.readyState = 4
DoEvents
Loop
Application.Wait (Now() + TimeValue("00:00:05"))
Dim doc As MSHTML.HTMLDocument
Dim tables As MSHTML.IHTMLElementCollection
Dim table As MSHTML.HTMLTable
Dim clipboard As MSForms.DataObject
Set doc = ie.document
Set tables = doc.getElementsByTagName("body")
Set table = tables(0)
Set clipboard = New MSForms.DataObject
'paste in sheets
Dim test
Set test = ActiveWorkbook.Sheets("Test")
clipboard.SetText table.outerHTML
clipboard.PutInClipboard
test.Range("A1").PasteSpecial xlPasteAll
clipboard.Clear
MsgBox ("Task Completed")
End Sub
Your help is greatly appreciated!
You were already testing document.Title. I found that a For Each over all shell windows, matching on the full title, worked in combination with copying the pop-up window's outerHTML to the clipboard and pasting it. No additional wait time was required.
Inside the For Each loop, after you point ie at the new window, you can obtain the new URL with ie.document.url. As the data is already loaded, you might as well copy and paste it straight away, in my opinion.
Code:
Option Explicit
Public Sub GetOsfiFinancialData()
Dim UrlAddress As String, objButton, ie As Object
UrlAddress = "http://ws1.osfi-bsif.gc.ca/WebApps/FINDAT/DTIBanks.aspx?T=0&LANG=E"
Set ie = CreateObject("internetexplorer.application")
With ie
.Silent = True
.Visible = False
.navigate UrlAddress
While .Busy Or .readyState < 4: DoEvents: Wend
.document.getElementById("DTIWebPartManager_gwpDTIBankControl1_DTIBankControl1_institutionTypeCriteria_institutionsDropDownList").Value = "Z005"
Set objButton = .document.getElementById("DTIWebPartManager_gwpDTIBankControl1_DTIBankControl1_submitButton")
objButton.Focus
objButton.Click
Dim objShellWindows As New SHDocVw.ShellWindows, currentWindow As IWebBrowser2
For Each currentWindow In objShellWindows
If currentWindow.document.Title = "Consolidated Monthly Balance Sheet - Banks, Trust and Loan" Then
Set ie = currentWindow
Exit For
End If
Next
Dim clipboard As Object
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
clipboard.SetText ie.document.body.outerHTML
clipboard.PutInClipboard
ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
.Quit
End With
End Sub
References (VBE > Tools > References):
Microsoft Internet Controls
I don't have time to get into all the stuff about controlling one browser from another, but I think you can figure that part out, especially since you made some great progress on this already. Get URL#2 from URL#1, like you are doing, but with some better data controls around it, and then do this...
Option Explicit
Sub Web_Table_Option_One()
Dim xml As Object
Dim html As Object
Dim objTable As Object
Dim result As String
Dim lRow As Long
Dim lngTable As Long
Dim lngRow As Long
Dim lngCol As Long
Dim ActRw As Long
Set xml = CreateObject("MSXML2.XMLHTTP.6.0")
With xml
.Open "GET", "http://ws1.osfi-bsif.gc.ca/WebApps/Temp/2f40b7ef-d024-4eca-a8a3-fb82153efafaFinancialData.aspx", False
.send
End With
result = xml.responseText
Set html = CreateObject("htmlfile")
html.body.innerHTML = result
Set objTable = html.getElementsByTagName("Table")
For lngTable = 0 To objTable.Length - 1
For lngRow = 0 To objTable(lngTable).Rows.Length - 1
For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
ThisWorkbook.Sheets("Sheet1").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
Next lngCol
Next lngRow
ActRw = ActRw + objTable(lngTable).Rows.Length + 1
Next lngTable
End Sub
I have written a macro that retrieves data from a URL. I now need to increase the date in the URL one day at a time and get the data for each date. The URL looks like this:
https://www.ukdogracing.net/racecards/01-05-2017/monmore
I am able to get the data with this script:
Sub GetData()
Dim IE As Object
Dim doc As Object
Dim strURL As String
Dim I As Integer
For I = 1 To 5
strURL = "https://www.ukdogracing.net/racecards/01-05-2017/monmore" + Trim(Str(I))
Set IE = CreateObject("InternetExplorer.Application")
With IE
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
.Quit
End With
Next I
End Sub
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim ThisLink As Object 'variable for <a> tags
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
I = Range("B" & Rows.Count).End(xlUp).Row 'last row with data
Do While Cells(I, 1).Value = "" 'will loop until first not blank found in column A (starting from last row of data, from end to start)
For Each ThisLink In doc.getElementsByTagName("a") 'we check all <a> tags
If ThisLink.innerText = Cells(I, 2).Value Then Cells(I, 1).Value = ThisLink.href 'If the innertext is the name of the race, in column A we add link
Next ThisLink
I = I - 1 'we decrease row position
Loop
End Sub
But I need the script to take the date part of the URL, add one day each time up to today, and get the data for each date. For example:
https://www.ukdogracing.net/racecards/01-06-2017/monmore
https://www.ukdogracing.net/racecards/01-07-2017/monmore
etc. How can I make the script get the data for each day, adding one day each time?
Thanks in advance.
Replace the first sub with this one and it will run for the specified dates. I couldn't see the variable I serving any purpose, so I removed it.
Sub GetData()
Dim IE As Object, doc As Object
Dim strURL As String, myDate As Date
Set IE = CreateObject("InternetExplorer.Application")
With IE
For myDate = DateSerial(2017, 1, 5) To DateSerial(2017, 1, 9) ' 5 to 9 Jan 2017; DateSerial avoids locale-dependent parsing of "01-05-2017"
strURL = "https://www.ukdogracing.net/racecards/" & Format(myDate, "mm-dd-yyyy") & "/monmore"
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
Next myDate
.Quit
End With
End Sub
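To run "till today" as the question asks, the upper bound of the loop can be the built-in Date function. The sketch below only builds and prints the URLs so the date logic can be checked on its own; RacecardUrl and ListRacecardUrls are hypothetical names, and it assumes the site's date segment is mm-dd-yyyy, as the example URLs in the question suggest.

```vba
Option Explicit

' Hypothetical helper: builds the racecard URL for one day.
' Assumes the date segment is mm-dd-yyyy, as in the question's examples.
Private Function RacecardUrl(ByVal raceDate As Date) As String
    RacecardUrl = "https://www.ukdogracing.net/racecards/" & _
                  Format$(raceDate, "mm-dd-yyyy") & "/monmore"
End Function

' Prints one URL per day from 5 Jan 2017 up to and including today.
Public Sub ListRacecardUrls()
    Dim d As Date
    For d = DateSerial(2017, 1, 5) To Date   ' Date returns today's date
        Debug.Print RacecardUrl(d)
    Next d
End Sub
```

In GetData you would replace the Debug.Print line with the navigate, wait and GetAllTables steps. Bear in mind that a range of several years means a lot of page loads and a new worksheet per day, so you may want to narrow the start date.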