Using MSXML in a VBA script to pull website data - excel

I have the following code from http://dailydoseofexcel.com/archives/2011/03/08/get-data-from-website-that-requires-a-login/#comment-60553
Sub GetTable()
Dim xml As Object ' MSXML2.XMLHTTP60
Dim htmlDoc As Object ' MSHTML.HTMLDocument
Dim htmlBody As Object ' MSHTML.HTMLBody
Dim ieTable As Object
Dim clip As DataObject
Set xml = GetMSXML
With xml
.Open "POST", "https://web.site", False
.send "username=myname&password=mypassword"
End With
With xml
.Open "POST", "https://web.site/anotherpage", False
End With
Set htmlDoc = CreateHTMLDoc
Set htmlBody = htmlDoc.Body
htmlBody.innerHTML = xml.responseText
Set ieTable = htmlBody.all.Item("report")
'copy the table's html to the clipboard and paste to the sheet
If Not ieTable Is Nothing Then
Set clip = New DataObject
clip.SetText "<html>" & ieTable.outerHTML & "</html>"
clip.PutInClipboard
Range("A1").Select
ActiveSheet.PasteSpecial "Unicode Text"
End If
End Sub
Function CreateHTMLDoc() As Object ' MSHTML.HTMLDocument
Set CreateHTMLDoc = CreateObject("htmlfile")
End Function
Function GetMSXML() As Object ' MSXML2.XMLHTTP
On Error Resume Next
Set GetMSXML = CreateObject("MSXML2.XMLHTTP")
End Function
Using this code, I am attempting to log in to web.site by passing it a username and password, then proceed to another page on the site and copy the contents of a table (results) into Sheet1 of the Excel workbook.
I have tried to debug this by stepping through with F8, but without the visible browser that I would get if I followed this page, http://dailydoseofexcel.com/archives/2011/03/08/get-data-from-website-that-requires-a-login/, it is difficult to see exactly what is happening and where it is failing.

Try the following code to help you log on to the site. If you have any questions, leave a comment on my channel: https://www.youtube.com/watch?v=hfAhmae4iqA
Dim IEe As InternetExplorer
Dim doc, element
Set IEe = New InternetExplorer
IEe.Visible = False 'set to True if you want to see the Internet Explorer window
IEe.Navigate "YOUR WEBSITE"
Do While IEe.ReadyState = 4: DoEvents: Loop
Do Until IEe.ReadyState = 4: DoEvents: Loop
'To find an element ID: right-click the page, choose Inspect Element, click the mouse icon, then click the text box where the username or password should be entered
Set element = IEe.Document.getElementByID("INSERT ELEMENT ID") 'the first element is for the username
element.Value = "USERNAME"
Set element = IEe.Document.getElementByID("INSERT ELEMENT ID") 'the next is for the password
element.Value = "PASSWORD" 'remember that storing a password in a macro is not safe
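For the MSXML route used in the question, where no browser window is available to watch, one way to see where things fail is to print the HTTP status and the start of the response after each request. The following is only a minimal sketch: the URLs and form field names are the placeholders from the question, and the Content-Type header and the choice of GET for the second page are assumptions about what the site expects.

```vba
Sub DebugLogin()
    ' Sketch only: URLs and form field names are placeholders from the question.
    Dim xml As Object
    Set xml = CreateObject("MSXML2.XMLHTTP")

    With xml
        ' Step 1: post the login form (header is an assumption about the site)
        .Open "POST", "https://web.site", False
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
        .send "username=myname&password=mypassword"
        Debug.Print "Login status: " & .Status & " " & .statusText

        ' Step 2: request the second page; a request is not made until .send is called
        .Open "GET", "https://web.site/anotherpage", False
        .send
        Debug.Print "Page status: " & .Status & " " & .statusText

        ' Print the start of the response to the Immediate window (Ctrl+G)
        Debug.Print Left$(.responseText, 500)
    End With
End Sub
```

If the second response still looks like the login page, the login step probably did not succeed, which narrows down where to look.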

Related

Select Item in a dropdown from website via Excel Macro

I would like to select the options "Addition, Bulk, Reduction" using Excel VBA.
This is what I have so far, but nothing is being selected.
Sub SelectDropdownOption() 'the Sub line was missing from the post; this name is a placeholder
Dim ie As InternetExplorer
Set ie = New InternetExplorer
ie.Visible = True
ie.navigate "my URL"
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
'time_adjust_group_ident = Reduction
Dim doc As HTMLDocument
Set doc = ie.document
doc.getElementById("time_adjust_group_ident").Value = "Reduction"
End Sub
You don't need the Internet Explorer object for this. Take a look at the code below, where I use MSXML2.XMLHTTP to make an HTTP request and get the HTML response as a string, and then parse it using the HTMLFile object.
I'm using the CreateObject method instead of adding the references via Tools > References, so you can run this code anywhere without having to add references manually every time you open it on a different machine.
In this example, I'm retrieving the child elements of the language-selector dropdown on a given website, and looping through them with a For Each to write each child element's content to a spreadsheet row.
Sub LoadHtml()
Dim strUrl As String
strUrl = "https://developer.mozilla.org/en-US/docs/Web/HTML/Element/select"
Dim httpRequest As Object
Set httpRequest = CreateObject("MSXML2.XMLHTTP")
With httpRequest
.Open "GET", strUrl, False
.send
End With
Dim html As Object
Set html = CreateObject("HTMLFile")
html.body.innerHTML = httpRequest.ResponseText
Dim child As Object
Dim row As Integer
row = 1
For Each child In html.getElementById("language-selector").Children
Range("A" & row) = child.innerText
row = row + 1
Next
End Sub

When the search button is clicked using vba the text entered in search box is not seen by web page

I have written VBA code that enters a manufacturer part number in the search box of the website below and clicks the search icon. It is able to enter the part number and click the icon, but when the search icon is clicked, the text entered in the text box is not picked up. It searches for empty data.
'HTML Part for search icon
<em class="fa fa-search" aria-hidden="true" style="color: gray;"></em>
For almost a month I have tried various approaches, including ones mentioned on Stack Overflow such as using createEvent("keyboardevent"), but nothing worked.
' VBA code
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "https://ecat.aptiv.com"
Do While IE.readyState < READYSTATE_COMPLETE
Loop
Dim idoc As MSHTML.HTMLDocument
Set idoc = IE.document
idoc.getElementById("searchUserInput").Value = "33188785"
Dim doc_ele As MSHTML.IHTMLElement
Dim doc_eles As MSHTML.IHTMLElementCollection
Set doc_eles = idoc.getElementsByTagName("a")
For Each doc_ele In doc_eles
If doc_ele.getAttribute("ng-click") = "SearchButtonClick(1)" Then
doc_ele.Click
Exit Sub
Else
End If
Next doc_ele
End Sub
The page makes an XHR request to retrieve the search results. You can find it in the network tab after clicking submit. This means that, in this case, you can avoid the expense of a browser and issue the XHR request directly. The response is JSON, so you do need a JSON parser to handle the results.
I would use jsonconverter.bas to parse the JSON. After installing the code from that link in a standard module called JsonConverter, go to VBE > Tools > References and add a reference to Microsoft Scripting Runtime.
I dimension an array to hold the results. I determine the number of rows from the number of items in the returned JSON collection, and the number of columns from the size of the first item's dictionary. I loop over the JSON collection, with an inner loop over the keys of each dictionary, and populate the array. I write the array out in one go at the end, which is less I/O expensive.
Option Explicit
Public Sub GetInfo()
Dim json As Object, ws As Worksheet, headers()
Dim item As Object, key As Variant, results(), r As Long, c As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://ecat.aptiv.com/json/eCatalogSearch/SearchProducts?filter=All&options=&pageSize=10&search=33188785", False
.send
Set json = JsonConverter.ParseJson(.responseText)("Products")
End With
headers = json.item(1).keys
ReDim results(1 To json.Count, 1 To UBound(headers) + 1)
For Each item In json
r = r + 1: c = 1
For Each key In item.keys
results(r, c) = item(key)
c = c + 1
Next
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
End Sub
You can do this instead:
txt = "33188785"
IE.navigate "https://ecat.aptiv.com/feature?search=" & txt
This will take you straight to the Search Result.
Code:
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Dim txt As String
Set IE = New InternetExplorer
txt = "33188785"
IE.Visible = True
IE.navigate "https://ecat.aptiv.com/feature?search=" & txt
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
End Sub
This will be faster, as you will only have to load one page.
I am not sure why that is happening, but it seems the text box used for input is not activated when text is added to it programmatically. It is activated when we click inside it.
I got the solution for the above problem from MrExcel.com; the link to that post is below.
https://www.mrexcel.com/forum/excel-questions/1105434-vba-ie-automation-issue-angularjs-input-text-post5317832.html#post5317832
In this case I need to enter the search string character by character, using SendKeys and input events inside the loop. Below is the working VBA code.
Sub AptivScrapping()
Dim IE As SHDocVw.InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "https://ecat.aptiv.com"
Do While IE.readyState < READYSTATE_COMPLETE
Loop
Dim idoc As MSHTML.HTMLDocument
Set idoc = IE.document
IE.document.getElementById("searchUserInput").Focus = True
IE.document.getElementById("searchUserInput").Select
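Dim sFieldInput As String, s As Long
Const LoopSeconds As Long = 1 'assumed delay; LoopSeconds was not defined in the original post
sFieldInput = "33188785"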
For s = 1 To Len(sFieldInput)
Application.SendKeys Mid(sFieldInput, s, 1)
While IE.readyState < 4 Or IE.Busy
Application.Wait DateAdd("s", LoopSeconds, Now)
Wend
Next s
IE.document.getElementById("searchUserInput").Focus = False
Dim doc_ele As MSHTML.IHTMLElement
Dim doc_eles As MSHTML.IHTMLElementCollection
Set doc_eles = idoc.getElementsByTagName("a")
For Each doc_ele In doc_eles
If doc_ele.getAttribute("ng-click") = "SearchButtonClick(1)" Then
doc_ele.Click
Exit Sub
Else
End If
Next doc_ele
End Sub

VBA : how to connect MSXML2.XMLHTTP60 response to IHTMLDocument(iframe)

I am trying to use MSXML2 and IHTMLDocument to deal with the iframe part of an HTML web page.
I want to use MSXML2 to fetch and save the page so that I can capture the data more efficiently, thinking it's faster than using InternetExplorer or the Selenium reference available from the VBA menu.
(I want to avoid using IE or Selenium as much as possible.)
But I couldn't find out how to save the document in XML format (to take advantage of its speed) and at the same time click on an element in the document without the help of a browser (IE or Selenium).
And even after clicking a tab (id="cns_Tab21") on this web page, I have difficulty retrieving data.
So my questions are:
1> Is it possible to minimize the use of a browser for clicking?
2> Even after clicking (using Selenium), an XPath-related error is thrown in the VBA editor. Why?
Thank you in advance for your answer. The URL used for this is
http://bitly.kr/finance
and the iframe inside the link is http://bitly.kr/LT0aCb
'I declared objects
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim iframeDoc As IHTMLDocument
'and saved XML data to HTML format
HTMLDoc.body.innerHTML = XMLReq.responseText
'and trying to save this HTML to iframe...
Set iframeDoc = HTMLDoc.getElementById("coinfo_cp")
'I tried .contentDocument, but maybe HTMLDoc doesn't have this property,
'and I don't know how to access the information I saved to iframeDoc above.
'And after I use Selenium, I can't figure out why it throws an error
For Each ele In selenium.FindElementsByTag("th")
If ele.Attribute("innerText") = "CAPEX" Then
Debug.Print ele.FindElementsByXPath("./../td").Attribute("innerText")
End If
Next ele
This post isn't a duplicate, since I am trying to use XML to handle the iframe element without the InternetExplorer reference (ie.document) in Excel VBA.
You can replicate the XHR request the page makes when that tab (not an iframe) is selected. I use the clipboard to copy the table to Excel. Note: the URL I am using is from our discussions. This info should be reflected in the question.
Option Explicit
Public Sub GetTable()
'VBE > Tools > References > Microsoft HTML Object Library
Dim html As HTMLDocument, hTable As HTMLTable, clipboard As Object
Set html = New HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://navercomp.wisereport.co.kr/v2/company/ajax/cF1001.aspx?cmp_cd=005930&fin_typ=0&freq_typ=Y&encparam=ZXR1cWFjeGJnS1lWOHhCYmNScmJXUT09&id=bG05RlB6cn", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
End With
Set hTable = html.querySelector(".hbG05RlB6cn + .gHead01")
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") ' New DataObject
clipboard.SetText hTable.outerHTML
clipboard.PutInClipboard
ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
End Sub
You can find the parameters of the AJAX URL for the tab content update in the page's scripts, along with the target for the update.
The version below keeps only the CAPEX row; it needs tidying up:
Option Explicit
Public Sub GetTable()
'https://navercomp.wisereport.co.kr/v2/company/c1010001.aspx?cmp_cd=005930
'VBE > Tools > References > Microsoft HTML Object Library
Dim html As HTMLDocument, hTable As HTMLTable, clipboard As Object, ws As Worksheet
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set html = New HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://navercomp.wisereport.co.kr/v2/company/ajax/cF1001.aspx?cmp_cd=005930&fin_typ=0&freq_typ=Y&encparam=ZXR1cWFjeGJnS1lWOHhCYmNScmJXUT09&id=bG05RlB6cn", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
End With
Set hTable = html.querySelector(".hbG05RlB6cn + .gHead01") '2nd tab. CAPEX row
Dim html2 As HTMLDocument, i As Long
Set html2 = New HTMLDocument
html2.body.innerHTML = hTable.outerHTML
Dim tableBodyRows As Object, tableBodyRowLength As Long, tableHeaderRowLength As Long, tableHeaderRows As Object, targetRow As Long
Set tableBodyRows = html2.querySelectorAll("tbody tr .bg")
tableBodyRowLength = tableBodyRows.Length
tableHeaderRowLength = html2.querySelectorAll("thead tr").Length + 2
For i = 0 To tableBodyRowLength - 1
If Trim$(tableBodyRows.item(i).innerText) = "CAPEX" Then
targetRow = i + tableHeaderRowLength + 1
Exit For
End If
Next
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") ' New DataObject
clipboard.SetText hTable.outerHTML
clipboard.PutInClipboard
ws.Cells(1, 1).PasteSpecial
Dim unionRng As Range
For i = (tableHeaderRowLength + 1) To (tableBodyRowLength + tableHeaderRowLength)
If i <> targetRow Then
If Not unionRng Is Nothing Then
Set unionRng = Union(ws.rows(i), unionRng)
Else
Set unionRng = ws.rows(i)
End If
End If
Next
If Not unionRng Is Nothing Then unionRng.Delete
End Sub

Looping through a row and copying each cell In a specific procedure

What I have to do is use Excel VBA to:
login to Amazon Seller
open a workbook
loop through a column to get an order number
put it in the search box
hit the search button
go to the order page and extract the data
then have the extracted data go back into a specified column in another Excel workbook
The loop and order number parts are what I'm currently stumped on. I've figured out this much code as of this moment:
Sub MyAmazonSeller()
Dim MyHTML_Element As IHTMLElement
Dim MyURL As String
Dim oSignInLink As HTMLLinkElement
Dim oInputEmail As HTMLInputElement
Dim oInputPassword As HTMLInputElement
Dim oInputSigninButton As HTMLInputButtonElement
'InputSearchOrder will be the destination for order numbers taken from the workbook
Dim InputSearchOrder As HTMLInputElement
Dim InputSearchButton As HTMLInputButtonElement
Dim IE As InternetExplorer
Dim AAOrder As Workbook
Dim AAws As Worksheet
MyURL = "https://sellercentral.amazon.com/gp/homepage.html"
Set IE = New InternetExplorer
' Open the browser and navigate.
With IE
.Silent = True
.Navigate MyURL
.Visible = True
Do
DoEvents
Loop Until .ReadyState = READYSTATE_COMPLETE
End With
' Get the html document.
Set HTMLDoc = IE.Document
' If the sign in link is present, you are on the main page
Set oSignInLink = HTMLDoc.getElementById("signin-button-container")
If Not oSignInLink Is Nothing Then
oSignInLink.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
End If
' Get the email field and fill it in
Set oInputEmail = HTMLDoc.getElementById("username")
oInputEmail.Value = "xxxxxx#xxxxxx.net"
' Get the password field and the sign in button
Set oInputPassword = HTMLDoc.getElementById("password")
Set oInputSigninButton = HTMLDoc.getElementById("sign-in-button")
' Fill in the password, then click the button and wait
oInputPassword.Value = "xxxxxxxx"
oInputSigninButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
Application.Wait (Now + TimeValue("0:00:05"))
Set AAOrder = Application.Workbooks.Open("Z:\Employee Folders\Employee\trackingnumber_sample_spreadsheet.xls")
Set AAws = AAOrder.Worksheets("PrimeOrdersWithNoFulfillmentRe")
Set InputSearchOrder = HTMLDoc.getElementById("sc-search-field")
'What I'm currently stuck on
InputSearchOrder.Value = "001-7163923-7572632"
Set InputSearchButton = HTMLDoc.getElementsByClassName("sc-search-button")(0)
InputSearchButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
'Was able to add this snippet, but I'm getting an error 13, most likely with
'my e variable. I'm basically trying to do a loop within a loop, extracting 5
'pieces of data and sticking them back into their respective columns in the
'original Excel sheet. The problem comes when scraping the HTML. I'm basically
'trying to get text in the tables which have a few levels and it's frustrating
'me to no end.
With HTMLDoc
Set elems = HTMLDoc.getElementsByTagName("td")
For Each e In elems
If e.innerText Like "*1Z*" Then
Range("D2").Value = e.innerText
End If
Next e
End With
Err_Clear:
If Err <> 0 Then
Err.Clear
Resume Next
End If
End Sub
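For the loop and order number parts, a minimal sketch of walking down a column and handing each order number to a lookup step might look like the following. The sheet name comes from the question; the column letters (A for order numbers, D for results), the header in row 1, and the GetTrackingNumber helper are assumptions made for illustration, and the helper is only a stub where the search-and-scrape code would go.

```vba
Sub LoopOrderNumbers()
    ' Sketch only: column letters and the helper below are assumptions.
    Dim ws As Worksheet, lastRow As Long, r As Long, orderNo As String

    ' Assumes the order workbook is already open and active
    Set ws = ActiveWorkbook.Worksheets("PrimeOrdersWithNoFulfillmentRe")

    ' Find the last used row in column A (assumed to hold order numbers)
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' Start at row 2, assuming row 1 is a header
    For r = 2 To lastRow
        orderNo = Trim$(CStr(ws.Cells(r, "A").Value))
        If Len(orderNo) > 0 Then
            ' Write the result next to the order it belongs to
            ws.Cells(r, "D").Value = GetTrackingNumber(orderNo)
        End If
    Next r
End Sub

Function GetTrackingNumber(ByVal orderNo As String) As String
    ' Hypothetical stub: the search box entry, button click, and
    ' table scraping for one order would go here.
    GetTrackingNumber = ""
End Function
```

Writing to the same row index as the order number keeps each result aligned with its order, unlike a fixed Range("D2"), which is overwritten on every match.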

Get data from website [duplicate]

<span itemprop="streetAddress">
**94 Grand St**
</span>
How can I get this data through a getElementBy... method in Excel VBA?
I have tried getElementById, getElementsByName, etc., but nothing is working.
Option Explicit
Sub find()
'Uses late binding, or add reference to Microsoft HTML Object Library
' and change variable Types to use intellisense
Dim ie As Object 'InternetExplorer.Application
Dim html As Object 'HTMLDocument
Dim Listings As Object 'IHTMLElementCollection
Dim l As Object 'IHTMLElement
Dim r As Long
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = False
.Navigate "http://www.yelp.com/biz/if-boutique-new-york#query:boutique"
' Don't show window
'Wait until IE is done loading page
Do While .readyState <> 4
Application.StatusBar = "Downloading information, Please wait..."
DoEvents
Loop
Set html = .Document
End With
Set Listings = html.getElementsByTagName("span") ' ## returns the list
MsgBox (Listings(0))
For Each l In Listings
'## make sure this list item looks like the listings Div Class:
' then, build the string to put in your cell
Range("A1").Offset(r, 0).Value = l.innerText
r = r + 1
Next
Set html = Nothing
Set ie = Nothing
End Sub
I use the above program to get the innerText value inside the span tag, but it's not working.
For the single result you are looking for, use these two lines in your code (there is only one listing at the detail level).
Adapt your IE code:
Set Listings = html.getElementbyid("bizInfoBody") ' ## returns the list
Range("A1").Offset(r, 0).Value = Listings.innerText
Or with XMLHTTP:
Sub GetTxt()
Dim objXmlHTTP As Object
Dim objHtmlDoc As Object
Dim objHtmlBody As Object
Dim objTbl As Object
Dim strResponse As String
Dim strSite As String
Set objHtmlDoc = CreateObject("htmlfile")
Set objHtmlBody = objHtmlDoc.body
Set objXmlHTTP = CreateObject("MSXML2.XMLHTTP")
strSite = "http://www.yelp.com/biz/if-boutique-new-york"
With objXmlHTTP
.Open "GET", strSite, False
.Send
If .Status = 200 Then
strResponse = .responseText
objHtmlBody.innerHTML = objXmlHTTP.responseText
Set objTbl = objHtmlBody.Document.getElementbyid("bizInfoBody")
MsgBox objTbl.innerText
End If
End With
End Sub

Resources