Error 1004 on Resizing range and Transposing array - excel

I will not post full code since it's quite huge - I will focus on a part that is causing an error.
The macro is supposed to copy URLs generated in Excel, open them in IE, copy the source code to another sheet, look for something in this code, save the results in a specific cell, remove the sheet and go to the next URL. It works quite well and copies the source code for many URLs, but for some URLs it just fails. When I open those URLs manually they work perfectly, but somehow Excel throws an error for them.
Could you check the code below to help me understand where the problem is?
Here are two sample links:
This one works fine - link1
This one throws error 1004 - link2
And here is the code:
Sub CC_Check()
Dim ie As InternetExplorer
Dim html As HTMLDocument
Dim URL As Range
Dim Rng As Range
Dim ws1 As Worksheet
Set ws1 = Worksheets("One Code")
Set ie = New InternetExplorer
Set Rng = ws1.Range("A3:A18")
For Each URL In Rng
ThisWorkbook.Sheets.Add After:=Sheets(Sheets.Count)
ActiveSheet.Name = ws1.Cells(URL.Row, 2).Value & "_" & ws1.Cells(6, 7).Value
ie.Visible = False
ie.navigate URL.Value
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set html = ie.document
Range("A1").Value = html.DocumentElement.outerHTML
Dim arr
arr = Split(html.DocumentElement.outerHTML, vbLf)
Range("A1").Resize(UBound(arr) + 1, 1).Value = Application.Transpose(arr) '<-- this line causing error 1004

Application.Transpose has a number of problems. It fails when:
The array has only one member (for the zero-based array returned by Split, UBound(arr) = 0)
One of the strings is longer than 32K characters (but I have seen other cases where it already failed when a string had more than 255 characters)
The array has more than 64K elements (however, in Excel 2016 this does not cause a runtime error but silently returns a truncated array)
So the best bet is to do the transpose by hand, which is rather easy. You should, by the way, use a Worksheet variable for the sheets you add - never rely on ActiveSheet. The following code creates the new sheet only if it doesn't exist (otherwise it clears its content, so you can run the code several times):
Dim newWs As Worksheet
Dim wsName As String
wsName = ws1.Cells(URL.Row, 2).Value & "_" & ws1.Cells(6, 7).Value
Set newWs = Nothing
On Error Resume Next ' the lookup raises an error when the sheet is missing
Set newWs = ThisWorkbook.Sheets(wsName)
On Error GoTo 0
If newWs Is Nothing Then
' Sheet doesn't exist, create a new one and name it
Set newWs = ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count))
newWs.Name = wsName
Else
' Sheet already there, clear its content
newWs.UsedRange.ClearContents
End If
(..Load HTML and split..)
' Do your own transpose into a 2nd array and dump that into the sheet
Dim brr
ReDim brr(LBound(arr) To UBound(arr), 1 To 1) ' Make it 2-dimensional
Dim i As Long
For i = LBound(arr) To UBound(arr)
brr(i, 1) = arr(i)
Next i
newWs.Range("A1").Resize(UBound(arr) + 1, 1).Value = brr ' write to the new sheet, not the active one
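The manual transpose above can be wrapped in a small reusable helper. This is only a minimal sketch: the name ToColumn is hypothetical and not part of the original answer. It copies a one-dimensional array (zero-based or one-based) into a one-based, single-column 2-D array that can be assigned to a range without Application.Transpose.

```vba
Option Explicit

' Hypothetical helper: copies a 1-D array into a 2-D array with
' n rows and 1 column, indexed from 1.
Public Function ToColumn(ByVal source As Variant) As Variant
    Dim result() As Variant
    Dim rowCount As Long
    Dim i As Long

    rowCount = UBound(source) - LBound(source) + 1

    ' Split("") returns an empty array (UBound = -1); return a single
    ' empty cell in that case so the ReDim below cannot fail.
    If rowCount < 1 Then
        ReDim result(1 To 1, 1 To 1)
        ToColumn = result
        Exit Function
    End If

    ReDim result(1 To rowCount, 1 To 1)
    For i = 1 To rowCount
        result(i, 1) = source(LBound(source) + i - 1)
    Next i
    ToColumn = result
End Function
```

Assuming col holds the result of ToColumn(arr), the write then becomes newWs.Range("A1").Resize(UBound(col, 1), 1).Value = col.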

Related

Get hyperlinks from web

I am trying to get data, including hyperlinks, from the web. I copied the data from the web page and pasted it into Excel. All of the data was pasted into a single cell, and no hyperlinks were carried over when I separated the data with Text to Columns.
Source link: https://www.sec.gov/cgi-bin/current?q1=3&q2=6&q3=
I also tried to load the data into Excel using the "From Web" option. Unfortunately, no hyperlinks were carried over. Could you suggest a solution?
Thanks
The macro grabs only the links (second and third columns) from the listing, which looks like a table but is really a <pre> block. It takes a moment, so wait until IE closes. Please read the comments in the code:
Sub LinkList()
Dim url As String
Dim browser As Object
Dim nodeContainer As Object
Dim nodeAllLinks As Object
Dim nodeOneLink As Object
Dim currentRow As Long
Dim controlCounter As Long
ActiveSheet.Columns("B:B").NumberFormat = "#"
ActiveSheet.Columns("D:D").NumberFormat = "#"
currentRow = 2
url = "https://www.sec.gov/cgi-bin/current?q1=3&q2=6&q3="
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set browser = CreateObject("internetexplorer.application")
browser.Visible = True 'You can set this to False to make the IE invisible
browser.navigate url
Do Until browser.ReadyState = 4: DoEvents: Loop
'Get the container with all links inside
Set nodeContainer = browser.document.getElementsByTagName("pre")(0)
'Get all links in a node collection
Set nodeAllLinks = nodeContainer.getElementsByTagName("a")
'Get each link
For Each nodeOneLink In nodeAllLinks
'Every second link should be in the same row as the first link of an HTML table row
If controlCounter Mod 2 = 0 Then
With ActiveSheet
'Set link as link
.Hyperlinks.Add Anchor:=.Cells(currentRow, 1), Address:=nodeOneLink.href, TextToDisplay:=nodeOneLink.href
'Write the text of the link from the page to the column after the link in Excel
.Cells(currentRow, 2).Value = nodeOneLink.innertext
End With
Else
With ActiveSheet
.Hyperlinks.Add Anchor:=.Cells(currentRow, 3), Address:=nodeOneLink.href, TextToDisplay:=nodeOneLink.href
.Cells(currentRow, 4).Value = nodeOneLink.innertext
End With
currentRow = currentRow + 1
End If
'Increment the control variable to distinguish between first and second link
controlCounter = controlCounter + 1
Next nodeOneLink
'Clean up
browser.Quit
Set browser = Nothing
Set nodeContainer = Nothing
Set nodeAllLinks = Nothing
Set nodeOneLink = Nothing
ActiveSheet.Columns("A:D").EntireColumn.AutoFit
End Sub

Excel macro to search a website with excel data and extract specific results and then loop for next value for another website

I have replicated the code from "Excel macro to search a website with excel data and extract specific results and then loop for next value", but I get an error on the line URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2 stating "object variable or with block variable not set".
So I am just trying to replicate the code for another website.
This code looks up a certain text and returns a value from the website.
So I would like to enter in MFR SKU in sheet 1 as such:
Name // SKU // Price
WaterSaverFaucet // SS902BC
After I click a macro button that I have created on sheet 2, it should output the price, so that the sheet ends up like this:
Name // SKU // Price
WaterSaverFaucet // SS902BC // 979.08
I would need this in order to look up multiple items on a website.
Sub LoopThroughBusinesses1()
Dim i As Integer
Dim SKU As String
For i = 2 To Sheet1.UsedRange.Rows.Count
SKU = Sheet1.Cells(i, 2)
Sheet1.Cells(i, 3) = URL_Get_SKU_Query1(SKU)
Next i
End Sub
Function URL_Get_SKU_Query1(strSearch As String) As String ' Change it from a Sub to a Function that returns the desired string
' strSearch = Range("a1") ' This is now passed as a parameter into the Function
Dim entityRange As Range
With Sheet2.QueryTables.Add( _
Connection:="URL;https://www.neobits.com/SearchBySKU.aspx?SearchText=" & strSearch & "&safe=active", _
Destination:=Sheet2.Range("A1")) ' Change this destination to Sheet2
.BackgroundQuery = True
.TablesOnlyFromHTML = True
.Refresh BackgroundQuery:=False
.SaveData = True
End With
' Find the Range that has "Price"
Set entityRange = Sheet2.UsedRange.Find("Price")
' Then return the value of the cell to its' right
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
' Clear Sheet2 for the next run
Sheet2.UsedRange.Delete
End Function
Unfortunately, your logic is flawed. You cannot simply take the mechanism from one web page and assume it works for the next, and in this case the approach you are trying will not work. When you enter a SKU into the search box, what actually happens is a page redirect (302), not the construction of a URL as you have tried. The error you see arises primarily because you hit a page-not-found (404) page: the text you search for is not on that page, so the lookup returns nothing.
Instead, you can use the URL construct that the page itself uses for the initial request, and then use xmlhttp, which will follow the redirect, as follows:
VBA:
Option Explicit
Public Sub GetPrices()
Dim xhr As XMLHTTP60, html As HTMLDocument, ws As Worksheet, i As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set xhr = New XMLHTTP60
Set html = New HTMLDocument
Dim allData()
allData = ws.UsedRange.Value
With xhr
For i = 2 To UBound(allData, 1)
.Open "GET", "https://www.neobits.com/search?keywords=" & allData(i, 2), False
.send
Dim price As Object
html.body.innerHTML = .responseText
Set price = html.querySelector("#main_price")
If Not price Is Nothing Then
allData(i, 3) = price.innerText
Else
allData(i, 3) = "No price found"
End If
Set price = Nothing
Next
End With
ws.Cells(1, 1).Resize(UBound(allData, 1), UBound(allData, 2)) = allData
End Sub
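I assume your page set-up, in Sheet1, is as in the question: headers in row 1 and data from row 2, with Name in column A, SKU in column B and Price in column C.
Required project references:
Two references are required: Microsoft HTML Object Library (for HTMLDocument) and Microsoft XML, v6.0 (for XMLHTTP60). Press Alt+F11 to open the VBE, then go to Tools > References and add them. You may have a different version number for the XML library, in which case the reference will need changing, as will the code references of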
Dim xhr As XMLHTTP60
and
New XMLHTTP60
To run this code:
Press Alt+F11 to open the VBE > Right click in project explorer > Add standard module. Paste the code into that standard module > Select anywhere inside the code and press F5, or hit the green Run arrow in the toolbar.
You could further develop, for example, to handle non 200 status codes:
Option Explicit
Public Sub GetPrices()
Dim xhr As XMLHTTP60, html As HTMLDocument, ws As Worksheet, i As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set xhr = New XMLHTTP60
Set html = New HTMLDocument
Dim allData(), price As Object
allData = ws.UsedRange.Value
With xhr
For i = 2 To UBound(allData, 1)
.Open "GET", "https://www.neobits.com/search?keywords=" & allData(i, 2), False
.send
If .Status <> 200 Then
allData(i, 3) = "Status not succeeded" '<== Little bit loose but you get the idea.
Else
html.body.innerHTML = .responseText
Set price = html.querySelector("#main_price")
If Not price Is Nothing Then
allData(i, 3) = price.innerText
Else
allData(i, 3) = "No price found"
End If
Set price = Nothing
End If
Next
End With
ws.Cells(1, 1).Resize(UBound(allData, 1), UBound(allData, 2)) = allData
End Sub
' Find the Range that has "Entity Type:"
Set entityRange = Sheet2.UsedRange.Find("Lists At:")
' Then return the value of the cell to its' right
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
The problem is that Range.Find may not find what you're looking for, for various reasons. Always specify the optional parameters to that function, since it otherwise "conveniently remembers" the values from the last time it was invoked - either from other VBA code, or through the Excel UI (in other words, there's no way to be 100% sure of what values it's going to be running with if you don't specify them). But even then, if Range.Find doesn't find what it's looking for, it will return Nothing - and you can't just assume that will never happen!
But, reading closer...
' Find the Range that has "Entity Type:"
Set entityRange = Sheet2.UsedRange.Find("Lists At:")
Someone's lying. Read the comment. Now read the code. Who's telling the truth? Don't write comments that say "what" - have comments say "why", and let the code say "what". Otherwise you have situations like that, where it's impossible to tell whether the comment is outdated or the code isn't right, at least not without looking at the worksheet.
In any case, you need to make sure entityRange isn't Nothing before you try to make a member call against it:
If Not entityRange Is Nothing Then
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
End If
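Both pieces of advice (spell out the optional Find arguments, and test for Nothing) can be combined in one helper. The following is a minimal sketch rather than the asker's actual code: the function name ValueRightOf and its fallback parameter are hypothetical, and the chosen search settings (values, partial match, case-insensitive) are assumptions that should be adjusted to the data.

```vba
Option Explicit

' Hypothetical helper: returns the value of the cell to the right of the
' first cell whose value contains `label`, or `fallback` if there is none.
Public Function ValueRightOf(ByVal ws As Worksheet, ByVal label As String, _
                             Optional ByVal fallback As String = "Not found") As String
    Dim hit As Range

    ' Spell out the optional arguments so the search does not depend on
    ' whatever the previous Find call (or the Find dialog) left behind.
    Set hit = ws.UsedRange.Find(What:=label, _
                                LookIn:=xlValues, _
                                LookAt:=xlPart, _
                                SearchOrder:=xlByRows, _
                                SearchDirection:=xlNext, _
                                MatchCase:=False)

    ' Find returns Nothing when there is no match, so guard the member call.
    If hit Is Nothing Then
        ValueRightOf = fallback
    Else
        ValueRightOf = CStr(hit.Offset(0, 1).Value2)
    End If
End Function
```

With this in place, the assignment in the question could read URL_Get_SKU_Query1 = ValueRightOf(Sheet2, "Price"), and a page without the label yields "Not found" instead of a runtime error.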

How to scrape data from Bloomberg's website with VBA

Background
Disclaimer: I am a beginner, please bear with my - most plausibly wrong - code.
I want to update currency pairs' value (PREV CLOSE) with a button-enabled-VBA macro. My Excel worksheet contains FX pairs (e.g. USDGBP) on column G:G which are then used to run a FOR loop for every pair in the column.
The value would then be stored in column I:I
Right now, the problem according to the Debugger lies in one line of code that I will highlight below
Sources
I got some inspiration from https://www.youtube.com/watch?v=JxmRjh-S2Ms&t=1050s - notably 17:34 onwards - but I want my code to work for multiple websites at the press of a button.
I have tried the following code
Public Sub Auto_FX_update_BMG()
Application.ScreenUpdating = False 'My computer is not very fast, thus I use this line of
'code to save some computing power and time
Dim internet_object As InternetExplorer
Dim i As Integer
For i = 3 To Sheets(1).Cells(3, 7).End(xlDown).Row
FX_Pair = Sheets(1).Cells(i, 7)
Set internet_object = New InternetExplorer
internet_object.Visible = True
internet_object.navigate "https://www.bloomberg.com/quote/" & FX_Pair & ":CUR"
Application.Wait Now + TimeValue("00:00:05")
internet_object.document.getElementsByClassName("class")(0).getElementsByTagName ("value__b93f12ea") '--> DEBUGGER PROBLEM
'My goal here is to "grab" the PREV CLOSE
'value from the website
With ActiveSheet
.Range(Cells(i, 9)).Value = HTML_element.Children(0).textContent
End With
Sheets(1).Range(Cells(i, 9)).Copy 'Not sure if these 2 lines are unnecesary
ActiveSheet.Paste
Next i
Application.ScreenUpdating = True
End Sub
Expected Result
When I enter a pair such as "EURGBP" in a cell in column G:G, the macro would go to https://www.bloomberg.com/quote/EURGBP:CUR, "grab" the PREV CLOSE value of 0.8732 (using today's value) and insert it in the respective row of column I:I
As of now, I am just facing the debugger without much idea on how to solve the problem.
You can use class selectors in a loop. The pattern
.previousclosingpriceonetradingdayago .value__b93f12ea
selects elements with class value__b93f12ea that are descendants of an element with class previousclosingpriceonetradingdayago. The "." in front of each name is a CSS class selector; CSS selectors are generally a fast way of selecting because modern browsers are optimized for CSS. The space between the two classes is a descendant combinator. querySelector returns the first match for this pattern in the page's HTML document.
You can see the parent-child relationship and the classes in the page's HTML here:
<section class="dataBox previousclosingpriceonetradingdayago numeric">
<header class="title__49417cb9"><span>Prev Close</span></header>
<div class="value__b93f12ea">0.8732</div>
</section>
N.B. If you are a Bloomberg customer look into their APIs. Additionally, it is very likely you can get this same info from other dedicated APIs which will allow for much faster and more reliable xhr requests.
VBA (Internet Explorer):
Option Explicit
Public Sub test()
Dim pairs(), ws As Worksheet, i As Long, ie As Object
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set ie = CreateObject("InternetExplorer.Application")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With ie
.Visible = True
For i = LBound(pairs) To UBound(pairs)
.Navigate2 "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
While .Busy Or .readyState < 4: DoEvents: Wend
results(i) = .document.querySelector(".previousclosingpriceonetradingdayago .value__b93f12ea").innerText
Next
.Quit
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
For a very limited number of requests (too many will get you blocked), you could use an xhr request and regex out the value. I assume the pairs are in Sheet1 and start from G2. I also assume there are no empty cells or invalid pairs in column G up to and including the last pair to search for. Otherwise, you will need to develop the code to handle this.
Try the regex here: https://regex101.com/r/OAyq30/1
Option Explicit
Public Sub test()
Dim re As Object, pairs(), ws As Worksheet, i As Long, s As String
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set re = CreateObject("VBScript.RegExp")
With ws
pairs = Application.Transpose(.Range("G2:G" & .Cells(.rows.Count, "G").End(xlUp).Row).Value) ' assumes pairs start in row 2
End With
Dim results()
ReDim results(1 To UBound(pairs))
With CreateObject("MSXML2.XMLHTTP")
For i = LBound(pairs) To UBound(pairs)
.Open "GET", "https://www.bloomberg.com/quote/" & pairs(i) & ":CUR", False
.send
s = .responseText
results(i) = GetCloseValue(re, s, "previousClosingPriceOneTradingDayAgo%22%3A(.*?)%2")
Next
End With
ws.Cells(2, "I").Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
Public Function GetCloseValue(ByVal re As Object, inputString As String, ByVal pattern As String) As String 'https://regex101.com/r/OAyq30/1
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = pattern
If .test(inputString) Then
GetCloseValue = .Execute(inputString)(0).SubMatches(0)
Else
GetCloseValue = "Not found"
End If
End With
End Function
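To see what the pattern captures without making a request, it can be run against a short string. This is a sketch only: the sample text below is made up to mimic URL-encoded JSON (%22 is a double quote, %3A a colon, %2C a comma), which is what the pattern suggests the response contains, and the function name ExtractPrevClose is hypothetical.

```vba
Option Explicit

' Hypothetical wrapper around the pattern used in the answer above.
Public Function ExtractPrevClose(ByVal text As String) As String
    Dim re As Object

    Set re = CreateObject("VBScript.RegExp")
    ' The lazy group (.*?) stops at the first "%2" after the key,
    ' i.e. at the encoded comma that ends the value.
    re.Pattern = "previousClosingPriceOneTradingDayAgo%22%3A(.*?)%2"

    If re.Test(text) Then
        ExtractPrevClose = re.Execute(text)(0).SubMatches(0)
    Else
        ExtractPrevClose = "Not found"
    End If
End Function

Public Sub DemoExtractPrevClose()
    Dim sample As String

    ' Made-up fragment, not real Bloomberg output.
    sample = "%22previousClosingPriceOneTradingDayAgo%22%3A0.8732%2C%22next%22"
    Debug.Print ExtractPrevClose(sample)   ' prints 0.8732
End Sub
```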
Try the code below.
Before running it, add two references via Tools > References: Microsoft HTML Object Library and Microsoft Internet Controls.
This code works with your example.
Sub getPrevCloseValue()
Dim ie As Object
Dim mySh As Worksheet
Set mySh = ThisWorkbook.Sheets("Sheet1")
Dim colG_Value As String
Dim prev_value As String
For a = 3 To mySh.Range("G" & Rows.Count).End(xlUp).Row
colG_Value = mySh.Range("G" & a).Value
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate "https://www.bloomberg.com/quote/" & colG_Value & ":CUR"
Do While ie.Busy: DoEvents: Loop
Do Until ie.readyState = 4: DoEvents: Loop
'Application.Wait (Now + TimeValue("00:00:03")) 'activate if having problem with delay
For Each sect In ie.document.getElementsByTagName("section")
If sect.className = "dataBox previousclosingpriceonetradingdayago numeric" Then
prev_value = sect.getElementsByTagName("div")(0).innerText
mySh.Range("I" & a).Value = prev_value
Exit For
End If
Next sect
ie.Quit ' close this IE instance before the next iteration opens a new one
Next a
End Sub
I have a video tutorial for basic web automation using vba which include web data scraping and other commands, please check the link below:
https://www.youtube.com/watch?v=jejwXID4OH4&t=700s

Scraping data from a website with a dynamic array function - Excel VBA

I want to eventually create a function where I can specify a web page element and URL and populate all instances of that element down a column. But am currently only experiencing limited success with this function:
Sub GrabAnchorTags() '(URL As String) As Variant'
Dim objIE As InternetExplorer
Dim elem As Object
Set objIE = New InternetExplorer
objIE.Visible = False
objIE.navigate "http://example.com/"
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Dim aRange As Range
Debug.Print objIE.document.getElementsByTagName("a").Length
For Each elem In objIE.document.getElementsByTagName("a")
Debug.Print elem
ActiveCell.Offset(x, y).Value = elem
ActiveCell.Offset(x, y + 1).Value = elem.textContent
x = x + 1
Next
objIE.Quit
Set objIE = Nothing
End Sub
I would like to be able to turn this successfully from a macro to a function.
Currently, it uses a for loop to populate the cells and I wonder if it's possible to accomplish the same thing using evaluate or something similar because the for loop is inefficient.
This function would need to live in a cell, reference a URL in another cell, and populate the cells below it with all elements of a type found on the page. I am currently working on the anchor tag.
Many other solutions I referenced used macros:
Scraping data from website using excel vba
Getting links url from a webpage excel vba
VBA – Web scraping with getElementsByTagName()
Generally speaking, whenever you have many cells to write to, you should enter the data into an internal array, and then write the entire array to the worksheet in one hit. However you seem to not want a macro/sub in your case.
If you wish to take the worksheet formula approach for usability reasons, then the best way is to use a very powerful, but underused technique in Excel development.
A NAMED RANGE
Named ranges are Excel's closest thing to an in-memory block of data; other, simpler formulas can then use the named range to get information from it.
A Named Range doesn't have to be a simple block of cells on a sheet. You can write your VBA function as a Public function, and then reference it in the Named Range.
Function getElems(url As String, tagName As String) As String()
Dim browser As New MSXML2.XMLHTTP60
Dim doc As MSHTML.HTMLDocument
With browser
.Open "GET", url, False
.send
If .readyState = 4 And .Status = 200 Then
Set doc = New MSHTML.HTMLDocument
doc.body.innerHTML = .responseText
Else
MsgBox "Error" & vbNewLine & "Ready state: " & .readyState & _
vbNewLine & "HTTP request status: " & .Status
Exit Function ' doc was never set, so there is nothing to parse
End If
End With
Dim tag As Object ' late-bound, so that tag.href resolves for anchor elements
Dim tags As MSHTML.IHTMLElementCollection
Set tags = doc.getElementsByTagName(tagName)
Dim arr() As String
Dim arrCounter As Long: arrCounter = 1
ReDim arr(1 To tags.Length, 1 To 2)
For Each tag In tags
arr(arrCounter, 1) = tag.innerText
'Change the below if block to suit
If tagName = "a" Then
arr(arrCounter, 2) = tag.href
Else
arr(arrCounter, 2) = tag.innerText
End If
arrCounter = arrCounter + 1
Next tag
Set doc = Nothing
Set browser = Nothing
getElems = arr
End Function
Now set a Named Range in Excel such as:
elementData
=getElems(Sheet1!$A$1, Sheet1!$B$1)
In A1, put the URL, and in B1 put the tag Name such as "a"
Then in your cells you can say
=INDEX(elementData, ROW(1:1), 1) and in adjacent cell put =INDEX(elementData, ROW(1:1), 2) (or use ROWS formula technique)
and drag down.
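For the macro route, the advice at the start of this answer (fill an internal array, then write it to the worksheet in one hit) can be sketched as follows. The names WriteBlock and DemoWriteBlock and the placeholder values are hypothetical; in real use the array would be filled from the scraped elements.

```vba
Option Explicit

' Hypothetical helper: writes a 2-D array to the sheet in a single
' assignment, starting at the cell topLeft.
Public Sub WriteBlock(ByVal topLeft As Range, ByRef data As Variant)
    topLeft.Resize(UBound(data, 1) - LBound(data, 1) + 1, _
                   UBound(data, 2) - LBound(data, 2) + 1).Value = data
End Sub

Public Sub DemoWriteBlock()
    Dim data(1 To 3, 1 To 2) As Variant
    Dim i As Long

    ' Fill the array in memory first (placeholder values)...
    For i = 1 To 3
        data(i, 1) = "Link text " & i
        data(i, 2) = "https://example.com/" & i
    Next i

    ' ...then touch the worksheet only once.
    WriteBlock ActiveSheet.Range("A1"), data
End Sub
```

One assignment of the whole block replaces one write per cell, which is where the loop in the question spends most of its time on larger pages.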

Permission denied when trying to draw data from a table in IE

I have just recently started looking at applications of VBA in Excel accessing web pages through IE, and have no experience with html coding, so the solution to this might be really simple...
I have a section of code (below) that is supposed to navigate to a website, access a table and pull out the data into excel. However, at seemingly random times, for no reason that I can determine, the Object Variable 'TDelement' becomes locked somehow, and Excel throws up an Error 70: Permission Denied when I try to access the next cell through the loop. It doesn't happen all the time, and it doesn't happen on the same table cell.
Dim IE As Object
Dim TDElements As Object
Dim TDelement As Object
Dim Web_Address As String
Dim DteTm As Date
Web_Address = "http://www.bom.gov.au/fwo/IDQ65388/IDQ65388.040762.tbl.shtml"
' Access the Webpage
IE.Navigate Web_Address
' Wait while IE loading...
Do While IE.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
' Find and Set Data Table Cells/object within webpage
Set TDElements = IE.document.GetElementsByTagName("td")
' Pull each TDElement (table cell) from TDElements
Rw = 1
Col = 2
For Each TDelement In TDElements
If Col = 1 Then
Col = 2
ElseIf Col = 2 Then
Col = 1
End If
If Col = 1 Then
DteTm = TDelement.innerText
Worksheets(1).Cells(Rw, Col).Value = DteTm
ElseIf Col = 2 Then
Worksheets(1).Cells(Rw, Col).Value = TDelement.innerText
End If
If Col = 2 Then
Rw = Rw + 1
End If
Next
If the error is going to occur within a cycle of the loop, it occurs on either
DteTm = TDelement.innerText or
Worksheets(1).Cells(Rw, Col).Value = TDelement.innerText,
depending on the outcome of the If...Then statement, obviously.
After a bit of googling, the general consensus seemed to be that error 70 is related to naming conflicts with variables (i.e. trying to use the same variable name twice). Because of this I tried adding Set TDelement = Nothing before Next to clear the variable at the end of each loop, but it didn't resolve the issue (not all that surprising; I have never had an issue with variables in loops like this before).
Could it have something to do with .innerText? Even though it is mentioned on just about every forum post that I have seen with regards to pulling data from IE, it isn't mentioned in the Excel help files at all...
Any help on this would be greatly appreciated.
Try the code below:
Sub sample()
Dim IE As Object
Dim Web_Address As String
Dim tblTR As Object
Dim tblTD As Object
Set IE = CreateObject("internetexplorer.application")
Web_Address = "http://www.bom.gov.au/fwo/IDQ65388/IDQ65388.040762.tbl.shtml"
' Access the Webpage
IE.Navigate Web_Address
IE.Visible = True
Start:
' Wait while IE loading...
Do While IE.Busy
Application.Wait DateAdd("s", 5, Now)
Loop
' Find and Set Data Table Cells/object within webpage
Set tblTR = IE.document.GetElementsByTagName("tr")
If tblTR Is Nothing Then GoTo Start
Dim i As Integer
i = 1
For Each tblTD In tblTR
If Not tblTD Is Nothing Then
Worksheets(1).Cells(i, 1).Value = tblTD.all(0).innerText
Worksheets(1).Cells(i, 2).Value = tblTD.all(1).innerText
End If
i = i + 1
Next
End Sub

Resources