Get complete data from a dropdown on a website using a macro - Excel

I normally use an Excel sheet to upload my data to a website. Over time, however, new entries can be added to a dropdown on the site, and I have no way of knowing whether anything has been added. So I want to add a refresh button to my Excel sheet that reads the current entries from the website's dropdown menu and updates the data in my workbook.
Below is the relevant HTML from the website. I cannot share the link because the site is behind a firewall and requires credentials.
<select name="ddfener" id="ddlfener" tabindex="2" class="normalText">
<option value="0">Select a fener....</option>
<option value="81ca032h">ahmet</option>
<option value="345">mehmet</option>
<option value="123">ayse</option>
</select>
I need to download this data in the following form:
81ca032h ahmet
345 mehmet
123 ayse
thanks

You will have to adjust the macro so that the values it reads end up in the right places in your Excel table. Everything else is explained in the comments of the macro:
Sub ReadDropdownValues()
    Dim browser As Object
    Dim url As String
    Dim nodeDropdown As Object
    Dim nodesOption As Object
    Dim optionTagNo As Long
    'Only for this demo:
    'in your own macro, write each value that is read
    'to your Excel table instead
    Dim result As String

    'Place your internal url here
    url = "file:///E:/testDropdown.htm"

    'Initialize Internet Explorer, set visibility,
    'call URL and wait until page is fully loaded
    '
    'This could be problematic on the intranet due to security guidelines
    'Set browser = CreateObject("InternetExplorer.Application")
    '
    'Try this instead to initialize the IE
    Set browser = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
    browser.Visible = False 'Set to 'True' to see the IE
    browser.navigate url
    Do Until browser.ReadyState = 4: DoEvents: Loop

    'Get dropdown html structure
    On Error Resume Next
    Set nodeDropdown = browser.document.getElementByID("ddlfener")
    On Error GoTo 0

    'Check whether object 'nodeDropdown' was built
    If Not nodeDropdown Is Nothing Then
        'Create node collection of option tags from object 'nodeDropdown'
        Set nodesOption = nodeDropdown.getElementsByTagName("option")

        'Loop through all option tags, starting from the second one
        '(The first one is only the placeholder 'Select a fener....')
        For optionTagNo = 1 To nodesOption.Length - 1
            'Get the value of the attribute 'value'
            result = result & Trim(nodesOption(optionTagNo).getAttribute("value"))
            'Insert tab only for demo string
            result = result & Chr(9)
            'Get the displayed dropdown text
            result = result & Trim(nodesOption(optionTagNo).innertext)
            'Insert new line only for demo string
            result = result & Chr(13)
        Next optionTagNo
    Else
        'If object 'nodeDropdown' wasn't built
        result = "Dropdown not found"
    End If

    'Clean up
    browser.Quit
    Set browser = Nothing
    Set nodeDropdown = Nothing
    Set nodesOption = Nothing

    'Show demo result
    MsgBox result
End Sub
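To get the pairs into worksheet cells rather than a message box, a minimal sketch could look like the following. The names OptionsToArray and WriteOptions, and the choice to store both columns as text, are assumptions for illustration and not part of the macro above.

```vba
Option Explicit

' Hypothetical helper: returns a 1-based 2-D array with one row per
' option tag, skipping the first option (the placeholder).
' Column 1 = the "value" attribute, column 2 = the displayed text.
' Returns Empty when there is nothing besides the placeholder.
Function OptionsToArray(ByVal nodeDropdown As Object) As Variant
    Dim nodesOption As Object
    Dim data() As Variant
    Dim i As Long

    Set nodesOption = nodeDropdown.getElementsByTagName("option")
    If nodesOption.Length < 2 Then Exit Function

    ReDim data(1 To nodesOption.Length - 1, 1 To 2)
    For i = 1 To nodesOption.Length - 1
        data(i, 1) = Trim(nodesOption(i).getAttribute("value"))
        data(i, 2) = Trim(nodesOption(i).innerText)
    Next i
    OptionsToArray = data
End Function

' Hypothetical caller: writes the array to the sheet, starting at topLeft.
Sub WriteOptions(ByVal nodeDropdown As Object, ByVal topLeft As Range)
    Dim data As Variant

    data = OptionsToArray(nodeDropdown)
    If IsEmpty(data) Then Exit Sub

    With topLeft.Resize(UBound(data, 1), 2)
        .NumberFormat = "@"   ' keep values such as "345" as text
        .Value = data
    End With
End Sub
```

Inside the macro, the For loop that builds the demo string could then be replaced by a call such as WriteOptions nodeDropdown, ActiveSheet.Range("A2"), made before browser.Quit.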

Related

Excel VBA IE Object and using dropdown list

I am experimenting with web automation and struggling a bit trying to utilize a drop down list.
My code works up to the point of searching for a company name and hitting "go". On the new page I can't seem to find the right code that selects the group of elements that represents the drop down list. I then want to select "100" entries, but I can't even grab the nodes that represent this list.
I have browsed several Stack Overflow pages that discuss CSS selectors and looked at tutorials, but that hasn't helped. I either end up grabbing nothing, or whatever I grab can't use the getElementsByTagName method, which I ultimately need in order to drill down into the td and select nodes. I'm not sure what to do with those yet, but I can't even grab them. Thoughts?
(note stopline is just a line that I use a breakpoint on to stop my code)
CSS helper website: https://www.w3schools.com/cssref/trysel.asp
Code:
Option Explicit
Sub test()
On Error GoTo ErrHandle
Dim ie As New InternetExplorer
Dim doc As New HTMLDocument
Dim ws As Worksheet
Dim stopLine As Integer
Dim oSearch As Object, oSearchButton As Object
Dim oForm As Object
Dim oSelect As Object
Dim list As Object
Set ws = ThisWorkbook.Worksheets("Sheet1")
ie.Visible = True
ie.navigate "https://www.sec.gov/edgar/searchedgar/companysearch.html"
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.Document
Set oSearch = doc.getElementById("companysearchform")
Set oSearchButton = oSearch.getElementsByTagName("input")(1)
Set oSearch = oSearch.getElementsByTagName("input")(0)
oSearch.Value = "Summit Midstream Partners, LP"
oSearchButton.Click
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.Document
Set list = doc.querySelectorAll("td select")
stopLine = 1
Exit Sub
ErrHandle:
MsgBox Err.Number & " - " & Err.Description, vbCritical
Exit Sub
End Sub
td select will return a single node so you only need querySelector. The node has an id so you might as well use the quicker querySelector("#count") to target the parent select. To change the option you can then use SelectedIndex on the parent select, or, target the child option by its value attribute querySelector("[value='100']").Selected = True. You may then need to attach and trigger change/onchange htmlevent to the parent select to register the change.
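As a rough sketch of the SelectedIndex route plus the change event, something like the following might work. The helper names SelectOptionByValue and FireChange are made up for illustration, and FireChange assumes the page renders in a document mode that supports createEvent (IE9 or later).

```vba
Option Explicit

' Hypothetical helper: selects the option whose value equals wantedValue
' by setting selectedIndex on the parent select element.
' Returns True when a matching option was found.
Function SelectOptionByValue(ByVal sel As Object, ByVal wantedValue As String) As Boolean
    Dim i As Long

    For i = 0 To sel.Options.Length - 1
        If sel.Options.Item(i).Value = wantedValue Then
            sel.selectedIndex = i
            SelectOptionByValue = True
            Exit Function
        End If
    Next i
End Function

' Hypothetical helper: dispatches a change event on the select so that
' page scripts listening for it are notified.
' Assumption: the document mode supports createEvent (IE9+).
Sub FireChange(ByVal doc As Object, ByVal sel As Object)
    Dim evt As Object

    Set evt = doc.createEvent("HTMLEvents")
    evt.initEvent "change", True, False
    sel.dispatchEvent evt
End Sub
```

With the select obtained via ie.document.querySelector("#count"), the two could be combined as: If SelectOptionByValue(sel, "100") Then FireChange ie.document, sel.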
However, I would simply extract the company CIK from current page then concatenate the count=100 param into the url and .Navigate2 that using following format:
https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001549922&type=&dateb=&owner=include&count=100&search_text=
You can extract CIK, after initial search company click and wait for page load, with:
Dim cik As String
cik = ie.document.querySelector("[name=CIK]").value
ie.Navigate2 "https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=" & cik & "&type=&dateb=&owner=include&count=100&search_text="
Given several params are left blank you can likely shorten to:
"https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=" & cik & "&owner=include&count=100"
If you are unable to get the initial parent select you probably need a timed loop waiting for that element to be present after clicking the search button. An example is shown here in a StackOverflow answer.
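A timed wait of that kind could be sketched roughly as below. The function name WaitForElementById and the default of 10 seconds are assumptions; the host argument is expected to be anything that exposes a document property, such as the InternetExplorer object, so that the current document is re-read on every pass.

```vba
Option Explicit

' Hypothetical helper: polls host.document for an element id until the
' element exists or maxSeconds have elapsed. Returns Nothing on timeout.
Function WaitForElementById(ByVal host As Object, ByVal elementId As String, _
                            Optional ByVal maxSeconds As Long = 10) As Object
    Dim deadline As Date
    Dim found As Object

    deadline = DateAdd("s", maxSeconds, Now)
    Do
        DoEvents
        ' The document may be unavailable while the page is navigating,
        ' so errors are ignored and the lookup is simply retried
        On Error Resume Next
        Set found = host.document.getElementById(elementId)
        On Error GoTo 0
        If Not found Is Nothing Then Exit Do
    Loop While Now < deadline

    Set WaitForElementById = found
End Function
```

After oSearchButton.Click, a call such as Set list = WaitForElementById(ie, "count") would then either return the select or Nothing, which the caller should check before using it.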

Get hyperlinks from web

I am trying to get the data with hyperlinks from the web. I copied the data from the web and pasted it in excel. Whole data has been pasted in single-cell and no hyperlink carried when I separated the data with text to columns.
Source link: https://www.sec.gov/cgi-bin/current?q1=3&q2=6&q3=
I also tried to dump the data in Excel using the "From Web" option. Unfortunately, no hyperlink carried. Could you help provide suggestions?
Thanks
The macro grabs only the links (second and third columns) from the listing, which looks like a table but is actually text inside a pre element. It takes a moment; wait until IE closes. Please read the comments in the code:
Sub LinkList()
    Dim url As String
    Dim browser As Object
    Dim nodeContainer As Object
    Dim nodeAllLinks As Object
    Dim nodeOneLink As Object
    Dim currentRow As Long
    Dim controlCounter As Long

    ActiveSheet.Columns("B:B").NumberFormat = "#"
    ActiveSheet.Columns("D:D").NumberFormat = "#"
    currentRow = 2
    url = "https://www.sec.gov/cgi-bin/current?q1=3&q2=6&q3="

    'Initialize Internet Explorer, set visibility,
    'call URL and wait until page is fully loaded
    Set browser = CreateObject("internetexplorer.application")
    browser.Visible = True 'You can set this to False to make the IE invisible
    browser.navigate url
    Do Until browser.ReadyState = 4: DoEvents: Loop

    'Get the container with all links inside
    Set nodeContainer = browser.document.getElementsByTagName("pre")(0)
    'Get all links in a node collection
    Set nodeAllLinks = nodeContainer.getElementsByTagName("a")

    'Get each link
    For Each nodeOneLink In nodeAllLinks
        'Every second link belongs in the same row as the first link of a listing row
        If controlCounter Mod 2 = 0 Then
            With ActiveSheet
                'Insert the link as a hyperlink
                .Hyperlinks.Add Anchor:=.Cells(currentRow, 1), Address:=nodeOneLink.href, TextToDisplay:=nodeOneLink.href
                'Write the link text from the page to the column after the link in Excel
                .Cells(currentRow, 2).Value = nodeOneLink.innertext
            End With
        Else
            With ActiveSheet
                .Hyperlinks.Add Anchor:=.Cells(currentRow, 3), Address:=nodeOneLink.href, TextToDisplay:=nodeOneLink.href
                .Cells(currentRow, 4).Value = nodeOneLink.innertext
            End With
            currentRow = currentRow + 1
        End If
        'Increment the control variable to distinguish between first and second link
        controlCounter = controlCounter + 1
    Next nodeOneLink

    'Clean up
    browser.Quit
    Set browser = Nothing
    Set nodeContainer = Nothing
    Set nodeAllLinks = Nothing
    Set nodeOneLink = Nothing
    ActiveSheet.Columns("A:D").EntireColumn.AutoFit
End Sub

Excel macro to search a website with excel data and extract specific results and then loop for next value for another website

I have replicated the code from "Excel macro to search a website with excel data and extract specific results and then loop for next value", but I get an error on the line URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2 stating "object variable or with block variable not set".
So I am just trying to replicate the code for another website.
This code searches for a certain text and returns a value from the website.
So I would like to enter in MFR SKU in sheet 1 as such:
Name // SKU // Price
WaterSaverFaucet // SS902BC
After I have created a macro button on sheet 2 and clicking it
Then have it spit out the price.
So that it ends up like this below:
Name // SKU // Price
WaterSaverFaucet // SS902BC // 979.08
I would need this in order to look up multiple items on a website.
Sub LoopThroughBusinesses1()
Dim i As Integer
Dim SKU As String
For i = 2 To Sheet1.UsedRange.Rows.Count
SKU = Sheet1.Cells(i, 2)
Sheet1.Cells(i, 3) = URL_Get_SKU_Query1(SKU)
Next i
End Sub
Function URL_Get_SKU_Query1(strSearch As String) As String ' Change it from a Sub to a Function that returns the desired string
' strSearch = Range("a1") ' This is now passed as a parameter into the Function
Dim entityRange As Range
With Sheet2.QueryTables.Add( _
Connection:="URL;https://www.neobits.com/SearchBySKU.aspx?SearchText=" & strSearch & "&safe=active", _
Destination:=Sheet2.Range("A1")) ' Change this destination to Sheet2
.BackgroundQuery = True
.TablesOnlyFromHTML = True
.Refresh BackgroundQuery:=False
.SaveData = True
End With
' Find the Range that has "Price"
Set entityRange = Sheet2.UsedRange.Find("Price")
' Then return the value of the cell to its' right
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
' Clear Sheet2 for the next run
Sheet2.UsedRange.Delete
End Function
Your logic is flawed, unfortunately. You cannot simply take the mechanism from one webpage and assume it works for the next. In this case the solution you are trying will not work. When you enter a SKU into the search, what actually happens is a page redirect (302), not the construction of a URL as you have tried. The error you see is primarily due to hitting a page-not-found response; it surfaces in your code because the element you search for does not exist on the 404 page.
Instead, you can use the URL construct the page itself uses for the initial request, and then use XMLHTTP, which will follow the redirect, as follows:
VBA:
Option Explicit
Public Sub GetPrices()
    Dim xhr As XMLHTTP60, html As HTMLDocument, ws As Worksheet, i As Long
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set xhr = New XMLHTTP60
    Set html = New HTMLDocument
    Dim allData()
    allData = ws.UsedRange.Value
    With xhr
        For i = 2 To UBound(allData, 1)
            .Open "GET", "https://www.neobits.com/search?keywords=" & allData(i, 2), False
            .send
            Dim price As Object
            html.body.innerHTML = .responseText
            Set price = html.querySelector("#main_price")
            If Not price Is Nothing Then
                allData(i, 3) = price.innerText
            Else
                allData(i, 3) = "No price found"
            End If
            Set price = Nothing
        Next
    End With
    ws.Cells(1, 1).Resize(UBound(allData, 1), UBound(allData, 2)) = allData
End Sub
I assume Sheet1 is laid out as in your question: headers in row 1, with Name in column A, SKU in column B, and Price in column C.
Required project references:
Two references are required: Microsoft HTML Object Library (for HTMLDocument) and Microsoft XML, v6.0 (for XMLHTTP60). Press Alt+F11 to open the VBE, then go to Tools > References and add them. You may have a different version number for the XML library, in which case the reference will need changing, as will the code references of
Dim xhr As XMLHTTP60
and
New XMLHTTP60
To run this code:
Press Alt+F11 to open the VBE > right-click in the Project Explorer > add a standard module. Paste the code into that standard module > select anywhere inside the code and press F5, or hit the green Run arrow in the VBE toolbar.
You could further develop, for example, to handle non 200 status codes:
Option Explicit
Public Sub GetPrices()
    Dim xhr As XMLHTTP60, html As HTMLDocument, ws As Worksheet, i As Long
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set xhr = New XMLHTTP60
    Set html = New HTMLDocument
    Dim allData(), price As Object
    allData = ws.UsedRange.Value
    With xhr
        For i = 2 To UBound(allData, 1)
            .Open "GET", "https://www.neobits.com/search?keywords=" & allData(i, 2), False
            .send
            If .Status <> 200 Then
                allData(i, 3) = "Status not succeeded" '<== Little bit loose but you get the idea.
            Else
                html.body.innerHTML = .responseText
                Set price = html.querySelector("#main_price")
                If Not price Is Nothing Then
                    allData(i, 3) = price.innerText
                Else
                    allData(i, 3) = "No price found"
                End If
                Set price = Nothing
            End If
        Next
    End With
    ws.Cells(1, 1).Resize(UBound(allData, 1), UBound(allData, 2)) = allData
End Sub
' Find the Range that has "Entity Type:"
Set entityRange = Sheet2.UsedRange.Find("Lists At:")
' Then return the value of the cell to its' right
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
The problem is that Range.Find may not find what you're looking for, for various reasons. Always specify the optional parameters to that function, since it otherwise "conveniently remembers" the values from the last time it was invoked - either from other VBA code, or through the Excel UI (IOW there's no way to be 100% sure of what values it's going to be running with if you don't specify them). But even then, if Range.Find doesn't find what it's looking for, it will return Nothing - and you can't just assume that will never happen!
But, reading closer...
' Find the Range that has "Entity Type:"
Set entityRange = Sheet2.UsedRange.Find("Lists At:")
Someone's lying. Read the comment. Now read the code. Who's telling the truth? Don't write comments that say "what" - have comments say "why", and let the code say "what". Otherwise you have situations like that, where it's impossible to tell whether the comment is outdated or the code isn't right, at least not without looking at the worksheet.
In any case, you need to make sure entityRange isn't Nothing before you try to make a member call against it:
If Not entityRange Is Nothing Then
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
End If
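Putting both points together, a lookup with the Find settings spelled out and a Nothing check might be sketched as follows. The function name ValueRightOf, the fallback parameter, and the particular settings chosen (xlValues, xlPart, case-insensitive) are assumptions for illustration; pick the ones that match your data.

```vba
Option Explicit

' Hypothetical helper: searches searchRange for labelText with the
' optional Find arguments set explicitly, so the result does not depend
' on whatever the previous Find call or the Excel UI left behind.
' Returns the value of the cell to the right of the hit as a string,
' or fallback when the label is not found.
Function ValueRightOf(ByVal searchRange As Range, ByVal labelText As String, _
                      Optional ByVal fallback As String = "") As String
    Dim hit As Range

    Set hit = searchRange.Find(What:=labelText, _
                               LookIn:=xlValues, _
                               LookAt:=xlPart, _
                               SearchOrder:=xlByRows, _
                               SearchDirection:=xlNext, _
                               MatchCase:=False)

    If hit Is Nothing Then
        ValueRightOf = fallback
    Else
        ValueRightOf = CStr(hit.Offset(0, 1).Value2)
    End If
End Function
```

In the function from the question, the last lines could then become URL_Get_SKU_Query1 = ValueRightOf(Sheet2.UsedRange, "Price", "Not found"), which makes a missing label visible in the output instead of raising error 91.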

How to input values into dropdown box of web page using Excel VBA

I'm trying to operate a website to display desired option chain data with an Excel VBA macro. The website -- CBOE.com -- has an input field for the ticker symbol of the desired option chains. My code has been able to drive that part of the webpage and a default option chain is displayed. It defaults to the most current month that options expire (May 2018 as of this note). From there the user can input other expiration dates for which to have other option chains (for the same symbol) to be retrieved and displayed. This is where my code seems to be breaking down.
Just above the default option chain display is a dropdown input box labeled "Expiration:" where a list of other expiration months can be selected. Once selected, a green Submit button must be clicked to get the specified option chain for the selected expiration month. Alternatively, below the default option chain are explicit filter buttons for expiration months also.
As said, my code gets to the point of specifying the symbol and getting default option chains displayed, but I can't seem to get the dropdown input field for other expiration months to work.
If anyone can see where and how my code is deficient, I'd really appreciate that insight.
Many thanks.
--Mark.
Here is my core code in question:
Sub getmarketdata_V3()
Dim mybrowser As Object, myhtml As String
Dim htmltables As Object, htmltable As Object
Dim htmlrows As Object, htmlrow As Object
Dim htmlcells As Object, htmlcell As Object
Dim xlrow As Long, xlcol As Integer
Dim exitat As Date, symbol As String
Dim flag As Integer
On Error GoTo errhdl
Const myurl = "http://www.cboe.com/delayedquote/quote-table"
symbol = UCase(Trim(Range("ticker").Text))
With Range("ticker").Worksheet
Range(Range("ticker").Offset(1, 0), Cells(Rows.Count, Range("ticker").Column + 13)).ClearContents
End With
Set mybrowser = CreateObject("internetexplorer.application")
mybrowser.Visible = True
mybrowser.navigate myurl
While mybrowser.busy Or mybrowser.readyState <> 4
DoEvents
Wend
With mybrowser.document.all
exitat = Now + TimeValue("00:00:05")
Do
.Item("ctl00$ContentTop$C002$txtSymbol").Value = symbol
.Item("ctl00$ContentTop$C002$btnSubmit").Value = "Submit"
.Item("ctl00$ContentTop$C002$btnSubmit").Click
If Err.Number = 0 Then Exit Do
Err.Clear
DoEvents
If Now > exitat Then Exit Do
Loop
End With
'This With statement is to refresh the mybrowser.document since the prior With statement pulls up a partially new webpage
With mybrowser.document.all
On Error Resume Next
exitat = Now + TimeValue("00:00:05")
'Tried using "ID" label to select desired month--in this case 2018 July is a dropdown option:
'Using this label seems to blank out the value displayed in the dropdown input box, but does not cause
'any of the options to display nor implant "2018 July" in it either. It just remains blank and no new option
'chain is retrieved.
.Item("ContentTop_C002_ddlMonth").Select
.Item("ContentTop_C002_ddlMonth").Value = "2018 July"
.Item("ContentTop_C002_ddlMonth").Click
'Then tried using "Name" label to select desired month--in this case 2018 July is an option:
' .Item("ctl00$ContentTop$C002$ddlMonth").Value = "2018 July"
' .Item("ctl00$ContentTop$C002$ddlMonth").Click
' .Item("ctl00$ContentTop$C002$btnFilter").Value = "View Chain"
' .Item("ctl00$ContentTop$C002$btnFilter").Click
End With
While mybrowser.busy Or mybrowser.readyState <> 4
DoEvents
Wend
'Remaining logic, except for this error trap logic deals with the option chain results once it has been successfully retrieved.
'For purposes of focus on the issue of not being able to successfully have such a table displayed, that remaining process logic is not
'included here.
errhdl:
If Err.Number Then MsgBox Err.Description, vbCritical, "Get data"
On Error Resume Next
mybrowser.Quit
Set mybrowser = Nothing
Set htmltables = Nothing
End Sub
For your code:
These 2 lines change the month and click the view chain (I tested with symbol FLWS). Make sure you have sufficient delays for page to actually have loaded.
mybrowser.document.querySelector("#ContentTop_C002_ddlMonth").Value = "201809"
mybrowser.document.querySelector("#ContentTop_C002_btnFilter").Click
I found the above sketchy for timings when added into your code so I had a quick play with Selenium basic as well. Here is an example with selenium:
Option Explicit
'Tools > references > selenium type library
Public Sub GetMarketData()
    Const URL As String = "http://www.cboe.com/delayedquote/quote-table"
    Dim d As ChromeDriver, symbol As String
    symbol = "FLWS"
    Set d = New ChromeDriver
    With d
        .Start
        .Get URL
        Dim b As Object, c As Object, keys As New keys
        Set b = .FindElementById("ContentTop_C002_txtSymbol")
        b.SendKeys symbol
        .FindElementById("ContentTop_C002_btnSubmit").Click
        Set c = .FindElementById("ContentTop_C002_ddlMonth")
        c.Click
        c.SendKeys keys.Down 'move one month down
        .FindElementById("ContentTop_C002_btnFilter").Click
        Stop '<<delete me later
        .Quit
    End With
End Sub
Try the below approach, in case you wanna stick to IE. I tried to kick out hardcoded delay from the script. It should get you there. Make sure to fill in the text field with the appropriate ticker from the below script before execution.
There you go:
Sub HandleDropDown()
    Const url As String = "http://www.cboe.com/delayedquote/quote-table"
    Dim IE As New InternetExplorer, Html As HTMLDocument, post As Object, elem As Object

    With IE
        .Visible = True
        .navigate url
        While .Busy Or .readyState <> 4: DoEvents: Wend
        Set Html = .document
    End With

    Do: Set post = Html.getElementById("ContentTop_C002_txtSymbol"): DoEvents: Loop While post Is Nothing
    post.Value = "tickername" ''make sure to fill in this box with the appropriate symbol
    Html.getElementById("ContentTop_C002_btnSubmit").Click

    Do: Set elem = Html.getElementById("ContentTop_C002_ddlMonth"): DoEvents: Loop While elem Is Nothing
    elem.selectedIndex = 2 ''just select the month using its position in the dropdown
    Html.getElementById("ContentTop_C002_btnFilter").Click
End Sub
References to add to the project:
Microsoft Internet Controls
Microsoft HTML Object Library

How do I keep the new structure of web imported data after refresh?

I am creating an Excel database. I would like to import names, emails and job positions of all employees of a firm from the firm website.
I choose Data->From Web and select the whole page, as it is the only possibility.
The page shows no table with data, just a long list of photos of employees with names, emails and job positions next to them.
I import the data into my Excel spreadsheet, but the format is very bad. So I then cut and paste, creating one column for "names", one for "email" and one for "job position". All other information is deleted manually.
I would like to refresh the data while keeping this new format. Unfortunately, every time I refresh the imported data using the "Refresh All" button, it returns to the original format.
How can I keep the new format of my web imported data, after refresh?
I thank you all for your support!
Kr,
A
I've put together an example that will extract the name and title from that page you specified and put them into sheet 1.
The code will only work providing the layout of the underlying html remains the same. It does not support updating of an existing list (anything on sheet 1 is removed prior to reading the list again)
To use this code you must place it in a new code module (not the worksheet or workbook sections) and you can run it either from the code editor or via the macros menu in the main excel window.
' Note: This code requires the following references to be loaded.
' Microsoft HTML Object Library (mshtml.tlb)
' Microsoft Internet Controls (ieframe.dll)
' To add a reference
' In the VBA Code Editor, in the Tools Menu click the References item
' Scroll through the list and ensure that the references are selected
' Press OK and you're done.
Sub Scrape()
    Dim Browser As InternetExplorer
    Dim Document As HTMLDocument
    Dim Element As IHTMLElement
    Dim Elements As IHTMLElementCollection
    Dim empName As String
    Dim empTitle As String
    Dim Sheet As Worksheet

    Set Sheet = ThisWorkbook.ActiveSheet
    Sheet.UsedRange.ClearContents ' Nuke the old list

    Set Browser = New InternetExplorer
    Browser.navigate "http://www.hsbc.com/about-hsbc/leadership"

    ' Wait while the browser is busy OR the page is not yet complete
    Do While Browser.Busy Or Browser.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Set Document = Browser.Document
    Set Elements = Document.getElementsByClassName("profile-col1")

    For Each Element In Elements
        empName = Trim(Element.Children(1).Children(0).innerText)
        empTitle = Trim(Element.Children(1).Children(1).innerText)
        Sheet.Range("A1:B1").Insert xlShiftDown
        Sheet.Cells(1, 1).Value = empName
        Sheet.Cells(1, 2).Value = empTitle
        'Debug.Print "[ name] " & empName
        'Debug.Print "[ title] " & empTitle
    Next Element

    ' Close the (invisible) browser so no IE process is left running
    Browser.Quit
    Set Browser = Nothing
    Set Elements = Nothing
End Sub
