using VBA to open webpage and pull specific line of data then place it in cell in excel - excel

What I am trying to do is use VBA code, or really any method, to look up a website and pull a CPU model, found by its element ID I assume. I then need it to take that CPU model and paste it into the corresponding cell in Excel.
Here are my examples.
I need the website https://partsurfer.hpe.com/Search.aspx?SearchText= to be opened with the serial number from the S/N cell appended right after the =.
It then pulls up a page with the hardware information, and line 33 is the CPU model. I need that value populated into the CPU cell corresponding to the serial number.
A serial number to test with: MXQ2040F21

Here's a version that combines the cell iteration and the more robust web fetch from my prior two attempts.
Sub get_computer_data()
' be sure to set these two constants and a variable
Dim SheetName As String
SheetName = "computers" ' name of the sheet that has the computer data
Dim serialNumCol As Byte
serialNumCol = 4 ' the number of the column that has the serial number
Dim r As Long: r = 2 ' the row the computer data starts on
Dim s As Worksheet
Set s = ThisWorkbook.Worksheets(SheetName) 'process a specific sheet
' process all rows of contiguous data
Do Until s.Cells(r, 1).Value = ""
s.Cells(r, serialNumCol + 2).Value = get_processor(s.Cells(r, serialNumCol).Value)
r = r + 1
Loop
End Sub
Function get_processor(serial_number) As String
Dim position As Long
Dim search As String
Dim processor As String
Dim html As String
Const url = "https://partsurfer.hpe.com/Search.aspx?SearchText="
Dim xmlhttp As Object
Set xmlhttp = CreateObject("MSXML2.serverXMLHTTP")
xmlhttp.Open "POST", url & serial_number, False
xmlhttp.Send
html = xmlhttp.responseText
' find the tag that identifies the processor in the html
search = "ctl00_BodyContentPlaceHolder_gridCOMBOM_ctl34_lblpartdesc1"">"
position = InStr(1, html, search)
If position = 0 Then
get_processor = "not found"
Else
position = position + Len(search)
processor = Split(Mid(html, position), "<")(0)
get_processor = processor
End If
End Function

Here's an approach to get the data you are after using a more direct method than the web query in my other answer. This function takes a serial number and returns the processor.
Sub test()
Debug.Print get_processor("MXQ2040F21")
End Sub
Function get_processor(serial_number) As String
Dim position As Long
Dim search As String
Dim processor As String
Dim html As String
Const url = "https://partsurfer.hpe.com/Search.aspx?SearchText="
Dim xmlhttp As Object
Set xmlhttp = CreateObject("MSXML2.serverXMLHTTP")
xmlhttp.Open "POST", url & serial_number, False
xmlhttp.Send
html = xmlhttp.responseText
' find the tag that identifies the processor in the html
search = "ctl00_BodyContentPlaceHolder_gridCOMBOM_ctl34_lblpartdesc1"">"
position = InStr(1, html, search) + Len(search)
processor = Split(Mid(html, position), "<")(0)
get_processor = processor
End Function

The code below uses Excel's built-in "web query" feature to pull the data as you have described. This code brings in the value from the 33rd row of the table that comes back from the web request. However, I'm skeptical that the CPU data will always be on the 33rd row, and I don't see a way, based on the results from the webserver, to infer which line is the right one. Hopefully, this gets you headed in the right direction.
Sub get_computer_data()
' be sure to set these two constants and a variable
Dim SheetName as String
SheetName = "computers" ' name of the sheet that has the computer data
Dim serialNumCol as Byte
serialNumCol = 5 ' the number of the column that has the serial number
Dim r As Long: r = 2 ' the row the computer data starts on
Dim url as String
url = "https://partsurfer.hpe.com/Search.aspx?SearchText="
Dim s As Worksheet ' a reference
Dim query As Worksheet ' a variable to refer to the sheet created by the web query
Dim cell As Range ' a range object used to find data in the query result
Set s = ThisWorkbook.Worksheets(SheetName) 'process a specific sheet
' process all rows of contiguous data
Do Until s.Cells(r, 1).Value = ""
'perform a web query for the current serial number
Set query = CreateWebQuery(url & s.Cells(r, serialNumCol).Value, xlAllTables)
' find the data on the result page
Set cell = query.Cells.Find("Part Description", , , xlWhole)
If cell Is Nothing Then
s.Cells(r, serialNumCol + 2).Value = "No Data"
Else
s.Cells(r, serialNumCol + 2).Value = cell.Offset(33)
End If
r = r + 1
Loop
End Sub
Function CreateWebQuery(url As String, Optional WebSelectionType As XlWebSelectionType = xlEntirePage, Optional SaveQuery As Boolean, Optional PlainText As Boolean = True, Optional SheetName As String = "webQuery") As Worksheet
'*********************************************************************************'
' Builds a web-query object to retrieve information from a web server and
' returns a reference to a worksheet containing the data
'
' Parameters:
'
'
' URL
' The webpage to get. Should start with "http"
'
' WebSelectionType (xlEntirePage or xlAllTables)
' what part of the page should be brought back to Excel.
'
' SaveQuery (True or False)
' Indicates if the query object remains in the workbook after running
'
' PlainText (True or False)
' Indicates if the query results should be plain or include formatting
'
' SheetName
' Indicates the name of the sheet to create or use
'
'*********************************************************************************'
Dim outsheet As Worksheet
Dim s As Worksheet
Set s = ActiveSheet
On Error Resume Next
Set outsheet = ThisWorkbook.Worksheets(SheetName)
If Err.Number = 0 Then
outsheet.Cells.Clear
Else
Set outsheet = ThisWorkbook.Worksheets.Add
outsheet.Name = SheetName
End If
On Error GoTo 0
s.Activate
With outsheet.QueryTables.Add(Connection:="URL;" & url, Destination:=outsheet.Range("a1"))
.Name = "WebQuery"
.RefreshStyle = xlOverwriteCells
.WebSelectionType = WebSelectionType
.PreserveFormatting = PlainText
.BackgroundQuery = False
.Refresh
If Not SaveQuery Then .Delete
End With
Set CreateWebQuery = outsheet
End Function
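For a quick standalone check, a small test sub along these lines (a sketch only, using the serial number from the question) shows how CreateWebQuery can be called on its own:
Sub test_web_query()
' Sketch: fetch the PartSurfer page for the sample serial number and print the value
' 33 rows below the "Part Description" header (the same offset used above).
Dim query As Worksheet
Dim cell As Range
Set query = CreateWebQuery("https://partsurfer.hpe.com/Search.aspx?SearchText=MXQ2040F21", xlAllTables)
Set cell = query.Cells.Find("Part Description", , , xlWhole)
If cell Is Nothing Then
Debug.Print "No Data"
Else
Debug.Print cell.Offset(33).Value
End If
End Sub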

Related

VBA for each loop with 64K+ ListRows (out of memory)

I'm running a VBA For Each loop through an Excel table (ListObject) which checks if a file exists based on a given path. My table has expanded though and now has 68K ListRows. After launching the code, it quickly gives Run-time error '7': Out of memory.
It runs OK with 63K lines (done within 5 minutes), and based on googling there appears to be something called a "64K segment boundary". Is this what's affecting my code? It really feels like it buffers the row count at first and then bounces back without actually starting to run anything. Is there an easy workaround for this without the need to split up my dataset into multiple batches? Frankly, I was quite surprised that 64K limits would still be a thing in Excel in 2021.
Running it on 64bit Excel 2019, but no luck with Office365 either.
Sub CheckFiles()
Dim Headers As ListObject
Dim lstrw As ListRow
Dim strFileName As String
Dim strFileExists As String
Application.ScreenUpdating = False
Set ws = ThisWorkbook.Sheets("Import")
Set Headers = ws.ListObjects("Import")
For Each lstrw In Headers.ListRows
strFileName = lstrw.Range(7)
strFileExists = Dir(strFileName)
If strFileExists = "" Then
lstrw.Range(4) = "not found"
Else
lstrw.Range(4) = "exists"
End If
Next lstrw
Set ws = Nothing
Set Headers = Nothing
Application.ScreenUpdating = True
End Sub
Avoid Accessing the Worksheet
Since you cannot avoid looping, you better do it in the computer's memory, i.e. rather through the elements of an array than through the cells of a range.
The code is still slow, about 10s for 200k rows on my machine, but that's because of Dir.
Note how easy (one line only, when the range contains more than one cell) and how fast (a split second) it is to write (copy) a range to an array (Data = rg.Value) and write (copy) the array back to a range (rg.Value = Data).
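In isolation, the idiom looks like this (a minimal generic sketch with placeholder ranges and a dummy in-memory transformation, not part of the answer's code below):
Sub array_round_trip_example()
Dim Data As Variant
Dim r As Long
With ThisWorkbook.Worksheets("Import")
Data = .Range("G2:G100").Value ' copy the range into a 2D array in one operation
For r = 1 To UBound(Data, 1)
Data(r, 1) = Len(CStr(Data(r, 1))) ' work entirely in memory
Next r
.Range("D2:D100").Value = Data ' write the whole array back in one operation
End With
End Sub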
Adjust the values in the constants section.
Option Explicit
Sub CheckFiles()
Const wsName As String = "Import" ' Worksheet Name
Const tblName As String = "Import" ' Table Name
Const cCol As Long = 7 ' Criteria Column
Const dCol As Long = 4 ' Destination Column
Dim wb As Workbook: Set wb = ThisWorkbook
Dim ws As Worksheet: Set ws = wb.Worksheets(wsName)
Dim Headers As ListObject: Set Headers = ws.ListObjects(tblName)
Dim Data As Variant ' Data Array
With Headers.ListColumns(cCol).DataBodyRange
If .Rows.Count = 1 Then
ReDim Data(1 To 1, 1 To 1): Data(1, 1) = .Value
Else
Data = .Value
End If
End With
Dim r As Long ' Array Row Counter (DataBodyRange Row Counter)
Dim FileName As String ' File Name Retrieved by Dir
For r = 1 To UBound(Data, 1)
FileName = Dir(CStr(Data(r, 1)))
If Len(FileName) = 0 Then
Data(r, 1) = "not found"
Else
Data(r, 1) = "exists"
End If
Next r
Headers.ListColumns(dCol).DataBodyRange.Value = Data
End Sub
Thank you all! A few takeaways. While I'm obviously trying to write code that is as efficient as possible, any reasonable performance here is acceptable. With that said, the For Each loop took approx 5 minutes to run with 63K lines, while the code I accepted as the answer by @VBasic2008 was done in about 15 seconds, without capacity problems either.
The only problem I had with this particular code was that it is a somewhat new approach for me, so building on it in the future will take some dedication and a deeper look into it, but it sure looks efficient. I also put together a regular For ... To loop, which also didn't run into problems with 68K lines and steers between rows and columns with the Offset function.
It is clearly faster than For Each, as @Pᴇʜ suggested, but took approx 2x as long as the array method (30 seconds or so).
Sub CheckFiles_2()
Dim strFileName, strFileExists As String
Dim ws As Worksheet
Dim Headers As ListObject
Dim result As String
Dim counter, RowCount As Long
Application.ScreenUpdating = False
Set ws = ThisWorkbook.Sheets("Import")
Set Headers = ws.ListObjects("Import")
RowCount = Headers.ListRows.Count
For counter = 1 To RowCount
strFileName = Range("anchorCell").Offset(counter, 3)
strFileExists = Dir(strFileName)
If strFileExists = "" Then
result = "not found"
Else
result = "exists"
End If
Range("anchorCell").Offset(counter, 0) = result
Next counter
Set ws = Nothing
Set Headers = Nothing
Application.ScreenUpdating = True
End Sub

Bulk Email Extractor. Need to add a Regexp - excel VBA

I am trying to update my first code with parts of a second code, but have been struggling for a few weeks now. Both codes extract emails from a url list.
THE FIRST CODE
This code is fine apart from the fact that it uses mailto: links rather than a regexp. I am trying to replace the mailto: approach in the first code with the regexp from the second code, as the regexp extracts more emails.
How the First code works.
A list of URLs is placed in Sheet2 "Urls" and the results are shown in Sheet1 "Results". This code extracts ALL emails from a site: if there are 10 emails it will extract all 10, if 100 then it will extract all 100. The SECOND code only extracts 1 email per site.
The problem with the FIRST CODE is that the mailto: approach MISSES a lot of emails, whereas the regexp captures more, so I want to update the first code. However, I have been struggling for a few weeks now as I am not super hot at writing code. The first code was originally written by me and then updated by another developer, and his code is way out of my skill depth. I have been trying for weeks to update it but cannot seem to work anything out, so I decided to post.
I have listed the first code below. You can download a sample workbook from here: Download Link First code
Link to my second code, which I also wrote, and the email regexp I am trying to use: My Post on Stack Overflow
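For reference, here is a minimal, standalone sketch of the regexp approach being asked about: it pulls every distinct email-looking string out of a page's HTML with VBScript.RegExp (the function name and pattern are illustrative, not taken from either code):
Function ExtractEmails(ByVal html As String) As Collection
' Returns every distinct email-looking string found in the HTML
Dim regEx As Object
Dim matches As Object
Dim m As Object
Dim found As New Collection
Set regEx = CreateObject("VBScript.RegExp")
regEx.Pattern = "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
regEx.Global = True
Set matches = regEx.Execute(html)
For Each m In matches
On Error Resume Next ' duplicate addresses are skipped because the key already exists
found.Add m.Value, Key:=m.Value
On Error GoTo 0
Next m
Set ExtractEmails = found
End Function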
FIRST CODE, Bulk Email Extractor. ONLY THE EMAIL PART NEEDS UPDATING WITH THE REGEXP.
Sub ScrapeSoMeAndMailAddresses()
'Columns for both tables
Const colUrl As Long = 1 'Must always be the first column
Const colMail As Long = 2 'Must always be the first column before Some platforms
Const colFacebook As Long = 3 'Must always be the last column of Some platforms
Const colError As Long = 4 'Must always be the last column
Dim url As String
Dim http As Object
Dim htmlDoc As Object
Dim nodeAllLinks As Object
Dim nodeOneLink As Object
Dim pageLoadSuccessful As Boolean
Dim tableUrlsOneAddressLeft As String
Dim tableAllAddresses As String
Dim currentRowTableUrls As Long
Dim lastRowTableUrls As Long
Dim currentRowsTableAll(colUrl To colFacebook) As Long
Dim lastRowTableAll As Long
Dim addressCounters(colMail To colFacebook) As Long
Dim checkCounters As Long
'Initialize variables
tableUrlsOneAddressLeft = "Urls" ''Name of Sheet
currentRowTableUrls = 2 'First row for content
tableAllAddresses = "Results" ''Name of Sheet
For checkCounters = colUrl To colFacebook
currentRowsTableAll(checkCounters) = 2 'First rows for content
Next checkCounters
Set htmlDoc = CreateObject("htmlfile")
Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
'Clear all contents and comments in the URL source sheet from email column to error column
With Sheets(tableUrlsOneAddressLeft)
lastRowTableUrls = .Cells(Rows.Count, colUrl).End(xlUp).Row
.Range(.Cells(currentRowTableUrls, colMail), .Cells(lastRowTableUrls, colError)).ClearContents
.Range(.Cells(currentRowTableUrls, colMail), .Cells(lastRowTableUrls, colError)).ClearComments
End With
'Delete all rows except headline in the sheet with all addresses
lastRowTableAll = Sheets(tableAllAddresses).Cells(Rows.Count, colUrl).End(xlUp).Row
Sheets(tableAllAddresses).Rows(currentRowsTableAll(colUrl) & ":" & lastRowTableAll).Delete Shift:=xlUp
'Loop over all URLs in column A in the URL source sheet
Do While Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, 1).Value <> ""
'Scroll for visual monitoring, if the sheet with the URLs is the active one
If ActiveSheet.Name = tableUrlsOneAddressLeft Then
If currentRowTableUrls > 14 Then
ActiveWindow.SmallScroll down:=1
End If
Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, 1).Select
End If
'Get next url from the URL source sheet
url = Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, colUrl).Value
'Try to load the page; temporarily disable error handling in case there is a timeout or another error
On Error Resume Next
http.Open "GET", url, False
http.send
'Check if page loading was successful
If Err.Number = 0 Then
pageLoadSuccessful = True
End If
On Error GoTo 0
If pageLoadSuccessful Then
'Build html document for DOM operations
htmlDoc.body.innerHtml = http.responseText
'Create node list from all links of the page
Set nodeAllLinks = htmlDoc.getElementsByTagName("a")
'Walk through all links of the node list
For Each nodeOneLink In nodeAllLinks
'''#####################################################################################################
'''################################### THIS IS THE START OF THE EMAIL SECTION ##########################
'''#####################################################################################################
'Check for mail address
If InStr(1, nodeOneLink.href, "mailto:") Then
'Write mail address to both tables
Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, colMail).Value = Right(nodeOneLink.href, Len(nodeOneLink.href) - InStr(nodeOneLink.href, ":"))
Sheets(tableAllAddresses).Cells(currentRowsTableAll(colMail), colMail).Value = Right(nodeOneLink.href, Len(nodeOneLink.href) - InStr(nodeOneLink.href, ":"))
'Check if it is a new line in the sheet with all addresses
If currentRowsTableAll(colMail) >= currentRowsTableAll(colUrl) Then
'Write URL in the new line of the sheet with all addresses
Sheets(tableAllAddresses).Cells(currentRowsTableAll(colUrl), colUrl).Value = url
'Increment url counter
currentRowsTableAll(colUrl) = currentRowsTableAll(colUrl) + 1
End If
'Increment mail counters
currentRowsTableAll(colMail) = currentRowsTableAll(colMail) + 1
addressCounters(colMail) = addressCounters(colMail) + 1
End If
'''#####################################################################################################
'''################################### END OF THE EMAIL SECTION ########################################
'''#####################################################################################################
'Check for Facebook address
If InStr(1, UCase(nodeOneLink.href), "FACEBOOK") Then
'Write Facebook address to both tables
Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, colFacebook).Value = nodeOneLink.href
Sheets(tableAllAddresses).Cells(currentRowsTableAll(colFacebook), colFacebook).Value = nodeOneLink.href
'Check if it is a new line in the sheet with all addresses
If currentRowsTableAll(colFacebook) >= currentRowsTableAll(colUrl) Then
'Write URL in the new line of the sheet with all addresses
Sheets(tableAllAddresses).Cells(currentRowsTableAll(colUrl), colUrl).Value = url
'Increment url counter
currentRowsTableAll(colUrl) = currentRowsTableAll(colUrl) + 1
End If
'Increment Facebook counters
currentRowsTableAll(colFacebook) = currentRowsTableAll(colFacebook) + 1
addressCounters(colFacebook) = addressCounters(colFacebook) + 1
End If
Next nodeOneLink
'Check address counters
For checkCounters = colMail To colFacebook
'Set comment if more than 1 link were found
If addressCounters(checkCounters) > 1 Then
Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, checkCounters).AddComment Text:=CStr(addressCounters(checkCounters))
Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, checkCounters).Comment.Shape.TextFrame.AutoSize = True
End If
Next checkCounters
Else
'Page not loaded
'Write message URL table
Sheets(tableUrlsOneAddressLeft).Cells(currentRowTableUrls, colError).Value = "Error with URL or timeout"
End If
'Prepare for next page
pageLoadSuccessful = False
Erase addressCounters
lastRowTableAll = Sheets(tableAllAddresses).Cells(Rows.Count, colUrl).End(xlUp).Row
For checkCounters = colUrl To colFacebook
currentRowsTableAll(checkCounters) = lastRowTableAll + 1 'First rows for next page content
Next checkCounters
currentRowTableUrls = currentRowTableUrls + 1
Loop
'Clean up
Set http = Nothing
Set htmlDoc = Nothing
Set nodeAllLinks = Nothing
Set nodeOneLink = Nothing
End Sub
As always thanks in advance.
I recommend you save a copy of your workbook before running the code in my answer. Hopefully it works and does what you need it to; I wasn't able to test it fully.
The GetEmailAddressesFromHtml function in the code below extracts email addresses using the regular expression you've included in your answer.
I think one thing the code below doesn't do (which your code did) is add comments to the Urls worksheet. But you could set up an Excel formula (e.g. COUNTIFS) to do that instead.
I only add unique Facebook URLs and email addresses, so you shouldn't see any duplicates on the Results sheet.
Option Explicit
Option Private Module 'This option means that Subs of this module are not displayed in the macros of the Excel GUI
' https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/raise-method
Private Const ERR_REQUEST_FAILED As Long = 513
Private Const ERR_INVALID_HTML As Long = 514
Private Sub PrepareSourceSheet( _
ByVal someSheet As Worksheet, ByVal firstRowToClear As Long, ByVal lastRowToClear As Long, _
ByVal firstColumnToClear As Long, ByVal lastColumnToClear As Long)
' Should clear all contents and comments in the source sheet from email column to error column
With someSheet
With .Range(.Cells(firstRowToClear, firstColumnToClear), .Cells(lastRowToClear, lastColumnToClear))
Debug.Assert Intersect(.Cells, .Parent.Columns(1)) Is Nothing
.ClearContents
.ClearComments
End With
End With
End Sub
Private Sub PrepareDestinationSheet(ByVal someSheet As Worksheet, ByVal firstRowToDelete As Long)
'Should delete all rows starting from and including "firstRowToDelete".
With someSheet
.rows(firstRowToDelete & ":" & .rows.CountLarge).Delete Shift:=xlUp
End With
End Sub
Private Sub ScrapeSomeData()
'Columns for both tables
Const COLUMN_URL As Long = 1 'Must always be the first column
Const COLUMN_EMAIL As Long = 2 'Must always be the first column before Some platforms
Const COLUMN_FACEBOOK As Long = 3 'Must always be the last column of Some platforms
Const COLUMN_ERROR As Long = 4 'Must always be the last column
Const FIRST_SOURCE_ROW As Long = 2 ' Skip headers
Const FIRST_DESTINATION_ROW As Long = 2 ' Skip headers
Dim sourceSheet As Worksheet
Set sourceSheet = ThisWorkbook.Worksheets("Urls")
Dim destinationSheet As Worksheet
Set destinationSheet = ThisWorkbook.Worksheets("Results")
Dim lastSourceRow As Long
lastSourceRow = sourceSheet.Cells(sourceSheet.rows.Count, COLUMN_URL).End(xlUp).Row
PrepareSourceSheet someSheet:=sourceSheet, firstRowToClear:=FIRST_SOURCE_ROW, _
lastRowToClear:=lastSourceRow, firstColumnToClear:=COLUMN_EMAIL, lastColumnToClear:=COLUMN_ERROR
PrepareDestinationSheet someSheet:=destinationSheet, firstRowToDelete:=FIRST_DESTINATION_ROW
Dim destinationRowIndex As Long
destinationRowIndex = FIRST_DESTINATION_ROW
Dim sourceRowIndex As Long
For sourceRowIndex = FIRST_SOURCE_ROW To lastSourceRow
Dim data As Collection
Set data = GetDataForUrl(sourceSheet.Cells(sourceRowIndex, COLUMN_URL))
With destinationSheet
Dim currentRowData As Variant
For Each currentRowData In data
.Cells(destinationRowIndex, COLUMN_URL).Value = currentRowData("url")
.Cells(destinationRowIndex, COLUMN_EMAIL).Value = currentRowData("emailAddress")
.Cells(destinationRowIndex, COLUMN_FACEBOOK).Value = currentRowData("facebookUrl")
.Cells(destinationRowIndex, COLUMN_ERROR).Value = currentRowData("errorMessage")
destinationRowIndex = destinationRowIndex + 1
Next currentRowData
End With
With sourceSheet
.Cells(sourceRowIndex, COLUMN_EMAIL).Value = data(1)("emailAddress")
.Cells(sourceRowIndex, COLUMN_FACEBOOK).Value = data(1)("facebookUrl")
.Cells(sourceRowIndex, COLUMN_ERROR).Value = data(1)("errorMessage")
End With
DoEvents
Next sourceRowIndex
End Sub
Private Function GetHtmlFromUrl(ByVal someUrl As String) As Object
' Should return a HTML document. Raises an error if URL is unavailable
' (at the time of requesting) or if HTML could not be assigned.
Dim httpClient As Object
Set httpClient = CreateObject("MSXML2.ServerXMLHTTP.6.0")
Dim succeeded As Boolean
httpClient.Open "GET", someUrl, False
On Error Resume Next
httpClient.Send
succeeded = (0 = Err.Number)
On Error GoTo 0
If Not succeeded Then Err.Raise ERR_REQUEST_FAILED, , "Error with URL or timeout"
Dim htmlDocument As Object
Set htmlDocument = CreateObject("htmlfile")
On Error Resume Next
htmlDocument.body.innerHTML = httpClient.responseText
succeeded = (0 = Err.Number)
On Error GoTo 0
If Not succeeded Then Err.Raise ERR_INVALID_HTML, , "Error whilst assigning HTML"
Set GetHtmlFromUrl = htmlDocument
End Function
Private Function GetFacebookUrlsFromHtml(ByVal htmlDocument As Object) As Collection
' Should return a collection of strings that are Facebook URLs detected.
' This function only looks within anchor tags.
Dim outputCollection As Collection
Set outputCollection = New Collection
Dim allAnchorTags As Object
Set allAnchorTags = htmlDocument.getElementsByTagName("a")
Dim anchorTag As Object
For Each anchorTag In allAnchorTags
If InStr(1, UCase$(anchorTag.href), "FACEBOOK", vbBinaryCompare) > 0 Then
On Error Resume Next
outputCollection.Add anchorTag.href, Key:=anchorTag.href ' De-duplicate here
On Error GoTo 0
End If
Next anchorTag
Set GetFacebookUrlsFromHtml = outputCollection
End Function
Private Function GetEmailAddressesFromHtml(ByVal htmlDocument As Object) As Collection
' Should return a collection of strings representing email addresses detected
' in the HTML document.
Dim outputCollection As Collection
Set outputCollection = New Collection
Dim regEx As Object
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.Pattern = "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
.Global = True
Dim emailMatches As Object
Set emailMatches = .Execute(htmlDocument.body.innerHTML)
End With
Dim matchFound As Object
For Each matchFound In emailMatches
On Error Resume Next ' De-duplicate here.
outputCollection.Add matchFound.Value, Key:=matchFound.Value
On Error GoTo 0
Next matchFound
Set GetEmailAddressesFromHtml = outputCollection
End Function
Private Function CreateRowDataForError(ByVal wasSuccess As Boolean, ByVal errorMessage As String, ByVal someUrl As String) As Collection
' Context: An error has occurred and we don't have any data (Facebook URLs, email addresses).
' So can only return the URL attempted and the error message.
Dim nestedCollection As Collection
Set nestedCollection = New Collection
nestedCollection.Add wasSuccess, "wasSuccess"
nestedCollection.Add errorMessage, "errorMessage"
nestedCollection.Add someUrl, "url"
nestedCollection.Add vbNullString, "facebookUrl"
nestedCollection.Add vbNullString, "emailAddress"
Set CreateRowDataForError = New Collection
CreateRowDataForError.Add nestedCollection
Debug.Assert 1 = CreateRowDataForError.Count
End Function
Private Function CreateRowDataForResults(ByVal wasSuccess As Boolean, ByVal errorMessage As String, _
ByVal someUrl As String, ByVal facebookUrls As Collection, ByVal emailAddresses As Collection) As Collection
' Context: No error occurred. HTML document may or may not contain data,
' but logic below should handle both scenarios.
Dim nestedCollection As Collection
Dim outerCollection As Collection
Set outerCollection = New Collection
Dim i As Long
For i = 1 To Application.Max(1, facebookUrls.Count, emailAddresses.Count)
Set nestedCollection = New Collection
nestedCollection.Add wasSuccess, Key:="wasSuccess"
nestedCollection.Add errorMessage, Key:="errorMessage"
nestedCollection.Add someUrl, Key:="url"
nestedCollection.Add GetCollectionItemOrDefault(facebookUrls, i, vbNullString), Key:="facebookUrl"
nestedCollection.Add GetCollectionItemOrDefault(emailAddresses, i, vbNullString), Key:="emailAddress"
outerCollection.Add nestedCollection
Next i
Debug.Assert outerCollection.Count = Application.Max(1, facebookUrls.Count, emailAddresses.Count)
Set CreateRowDataForResults = outerCollection
End Function
Private Function GetDataForUrl(ByVal someUrl As String) As Collection
' Currently this function misuses Collection class. Should probably instead write a class to return a custom object/data structure.
' Returns a collection of nested collections, where each nested collection is as below:
' • "wasSuccess" = whether data was successfully retrieved
' • "errorMessage" = an error message mentioning what happened
' • "facebookUrl" = a Facebook URL detected
' • "emailAddress" = an email address detected
Dim wasSuccess As Boolean
Dim errorMessage As String
Dim htmlDocument As Object
On Error Resume Next
Set htmlDocument = GetHtmlFromUrl(someUrl)
wasSuccess = (0 = Err.Number)
If Not wasSuccess Then
errorMessage = IIf(ERR_REQUEST_FAILED = Err.Number Or ERR_INVALID_HTML = Err.Number, Err.Description, "Unexpected error occurred")
End If
On Error GoTo 0
If Not wasSuccess Then
Set GetDataForUrl = CreateRowDataForError(wasSuccess, errorMessage, someUrl)
Else
Dim facebookUrls As Collection
Set facebookUrls = GetFacebookUrlsFromHtml(htmlDocument)
Dim emailAddresses As Collection
Set emailAddresses = GetEmailAddressesFromHtml(htmlDocument)
Set GetDataForUrl = CreateRowDataForResults(wasSuccess, errorMessage, someUrl, facebookUrls, emailAddresses)
End If
End Function
Private Function GetCollectionItemOrDefault(ByVal someCollection As Collection, ByVal someKey As Variant, ByVal someDefaultValue As Variant) As Variant
' Assumes item is not an object. This function will return false negatives if item being retrieved is an object.
Dim succeeded As Boolean
On Error Resume Next
GetCollectionItemOrDefault = someCollection(someKey)
succeeded = (0 = Err.Number)
On Error GoTo 0
If Not succeeded Then GetCollectionItemOrDefault = someDefaultValue
End Function
From a maintenance and coding perspective, I think something like Node.js or Python would allow you to get the same work done in fewer lines of code.

Is there a way to slow down a Web Scraper so it will pick up the code?

I wrote a macro to go to WU to get historical data and for the most part, it works. However, I believe that the macro is running too fast for it to pick up the data from the website.
https://www.wunderground.com/history/daily/us/tx/el-paso/KELP/date/2017-1-3
is the website, and the table I want to get is the one with the class tablesaw-sortable.
I have tried the following: DoEvents and Application.Wait (Now + TimeValue("00:00:01")) to try to slow down the process.
Sub BrowseToWU()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim RowAddress As Integer
Dim WebAddress As String
Dim DateSheet As Date
Dim WkDay As Integer
Dim DateSheetName As String
'Application.ScreenUpdating = False
'Application.StatusBar = True
RowAddress = 2
IE.Visible = True
Do Until RowAddress = 60
WebAddress = Range("A" & RowAddress)
DateSheet = Right(WebAddress, 8)
DateSheetName = Right(WebAddress, 8)
WkDay = Weekday(DateSheet, vbSunday)
If WkDay < 3 Then
RowAddress = RowAddress + 1
ElseIf WkDay > 6 Then
RowAddress = RowAddress + 1
Else
IE.Navigate WebAddress
Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop
Set HTMLDoc = IE.Document
DoEvents
Application.Wait (Now + TimeValue("00:00:05"))
DoEvents
ProcessHTMLPage HTMLDoc
DateSheet = Right(WebAddress, 8)
DoEvents
Application.Wait (Now + TimeValue("00:00:01"))
ActiveSheet.Name = DateSheetName
DoEvents
RowAddress = RowAddress + 1
'IE.Quit
Worksheets("Sheet1").Activate
End If
Loop
End Sub
Option Explicit
Sub ProcessHTMLPage(HTMLPage As MSHTML.HTMLDocument)
Dim HTMLTable As MSHTML.IHTMLElement
Dim HTMLTables As MSHTML.IHTMLElementCollection
Dim HTMLRow As MSHTML.IHTMLElement
Dim HTMLCell As MSHTML.IHTMLElement
Dim RowNum As Long, ColNum As Integer
'Dim IE As New SHDocVw.InternetExplorer
'Dim Ws As Worksheet
Set HTMLTables = HTMLPage.getElementsByClassName("tablesaw-sortable")
'DoEvents
For Each HTMLTable In HTMLTables
Worksheets.Add
DoEvents
Range("A1").Value = HTMLTable.className
Range("B1").Value = Now
RowNum = 2
For Each HTMLRow In HTMLTable.getElementsByTagName("tr")
'Debug.Print vbTab & HTMLRow.innerText
ColNum = 1
For Each HTMLCell In HTMLRow.Children
Cells(RowNum, ColNum) = HTMLCell.innerText
ColNum = ColNum + 1
Next HTMLCell
RowNum = RowNum + 1
Next HTMLRow
Next HTMLTable
DoEvents
'IE.Quit
End Sub
The macro is supposed to run through sheet1 picking up the web address to the historical data if it satisfies the criteria of being a certain day of the week.
IE will open and then it will kick over to the next module that will take in the data.
A new worksheet is created and the data pasted into the new worksheet.
The worksheet is renamed to the date of the data.
The web address sheet is activated again and the process starts over again.
The problem is that the data isn't taken from the website, so the For Each loop in ProcessHTMLPage finds nothing; the web address sheet then gets renamed and an error occurs.
One way around this is to call the API that the page is using to get that info.
The API returns json which you can parse with a json parser. I use jsonconverter.bas. After installing the code from that link in a standard module called JsonConverter, go to VBE > Tools > References > Add a reference to Microsoft Scripting Runtime.
Finding the API:
If you press F12 to open developer tools and go to the Network tab and then press F5 to refresh any url of interest you will see the recorded web traffic. You can find the API call there.
See my answer here on how to search the network traffic using a specific observation value you expect to see in the response - this will filter the list of network traffic to those items containing the value of interest. Be sensible in selecting the value - you want something unlikely to occur elsewhere. You can also filter the network traffic to XHR only.
The API response:
The API returns json. More specifically, it returns a dictionary containing 2 keys. The second key, "observations", can be used to return a collection (denoted by []) of dictionaries (denoted by {}).
Each dictionary represents a row of the table (daily observations). You can loop this collection, and then loop the inner dictionaries, to access the table row values and reconstruct the table by populating an array. Explore example json response here.
Explanation of json structure:
(The original answer includes an annotated screenshot of the json structure.)
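As a rough illustration of reading that structure with JsonConverter (a sketch; it assumes the response text has already been parsed with JsonConverter.ParseJson):
Private Sub DumpObservations(ByVal parsed As Object)
' parsed = JsonConverter.ParseJson(responseText) for one API response
Dim observations As Object
Dim reading As Object
Dim key As Variant
Set observations = parsed("observations") ' collection of dictionaries
For Each reading In observations ' each dictionary = one row of the table
For Each key In reading.Keys ' the keys are the column headers
Debug.Print key, reading(key)
Next key
Next reading
End Sub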
Explanation of code:
The code is broken down into a number of helper subs and functions, allocating certain tasks to each, to make the code easier to debug and follow, as well as to better align with object-oriented programming principles.
Overall the process is:
Gather urls for Worksheet("Sheet1"). Helper function GetAllUrls.
Process those urls and only retain the dates which correspond with Tue-Thur. These are kept as strings formatted as "yyyymmdd" so can be passed to API later. This is handled by helper functions GetOnlyQualifyingUrlsDates and IncludeThisDate. IncludeThisDate performs the check for whether to include; GetOnlyQualifyingUrlsDates handles the looping and formatting of results.
Issue xmlhttp requests by looping over qualifying url dates and concatenating those into the url for the API call, then issuing the request. This is performed by the main sub GetTables.
Sheet creation, for output, is handled by helper function CreateWorksheet. This function calls another helper function, SheetExists, to ensure sheets are only created if they don't already exist, otherwise, the existing sheet by that name is used.
The resultant json response, from step 3, is passed to a helper sub WriteOutResults which accepts the json variable and the output sheet object as arguments. It extracts all the info from the json response; essentially reconstructing the table. It adds the table and headers to the appropriate sheet.
It calls helper function Epoch2Date, which handles the unix timestamp to datetime conversion for the two unix fields in the json object.
TODO:
The API key may be time limited. Add a helper function which returns the current valid key.
The API accepts start date and end date parameters in the url construct. It would be far better to issue one request for the entire range if possible, or for chunked ranges e.g. months, to reduce the number of requests made. This would also reduce the likelihood of being blocked. It would mean writing some additional code, before writing out results, to ensure only the dates of interest are written to sheets. Alternatively, you could write everything out and then simply loop over all sheets and delete those that aren't wanted (perfectly doable if we are talking about 365 dates total). Personally, I would handle the date filtering while constructing the table from a single request (if possible) that passes the min and max dates of the entire url list as the start and end date parameters. I would then write a single flat table out to one sheet, as this will be much easier for later data analysis.
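As a rough sketch of that idea (assuming the API accepts a wider startDate/endDate span with the same apiKey and units parameters used in the code below), one request for a whole month would be built like this:
Dim url As String, startDate As String, endDate As String
startDate = "20170101"
endDate = "20170131"
url = "https://api.weather.com/v1/geocode/31.76/-106.49/observations/historical.json" & _
"?apiKey=6532d6454b8aa370768e63d6ba5a832e&startDate=" & startDate & "&endDate=" & endDate & "&units=e"
' issue a single GET for the whole month, then filter the returned observations
' down to the qualifying weekdays before writing them out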
VBA:
Option Explicit
Public Sub GetTables()
'VBE > Tools > References > Microsoft Scripting Runtime
Dim json As Object, qualifyingUrlsDates(), urls(), url As String
Dim ws As Worksheet, wsOutput As Worksheet, i As Long, startDate As String, endDate As String
Set ws = ThisWorkbook.Worksheets("Sheet1")
urls = GetAllUrls(2, ws, "A")
qualifyingUrlsDates = GetOnlyQualifyingUrlsDates(urls)
'API key may not be valid over time so look at obtaining it with a prior request
With CreateObject("MSXML2.XMLHTTP")
'issue an xmlhttp request for each valid date (this would be better done using startDate and endDate
'to specify the entire range, or batches e.g. months within the total range, to cut down on requests)
For i = LBound(qualifyingUrlsDates) To UBound(qualifyingUrlsDates)
startDate = qualifyingUrlsDates(i)
endDate = startDate ' a little verbose but useful for explaining
url = "https://api.weather.com/v1/geocode/31.76/-106.49/observations/historical.json?apiKey=6532d6454b8aa370768e63d6ba5a832e&startDate=" & startDate & "&endDate=" & endDate & "&units=e"
.Open "GET", url, False
.send
Set json = JsonConverter.ParseJson(.responseText)("observations")
Set wsOutput = CreateWorksheet(qualifyingUrlsDates(i))
WriteOutResults wsOutput, json
Next
End With
End Sub
Public Sub WriteOutResults(ByVal wsOutput As Worksheet, ByVal json As Object)
'json is a collection of dictionaries. Each dictionary is a time period reading from the day i.e. one row in output
Dim results(), item As Object, headers(), r As Long, c As Long, key As Variant
headers = json.item(1).keys 'get the headers which are the keys of each dictionary
ReDim results(1 To json.Count, 1 To UBound(headers) + 1)
For Each item In json
r = r + 1: c = 0 'increase row in results array to store results for table row
For Each key In item.keys
c = c + 1 'increase column number in results array for writing out results
Select Case key
Case "valid_time_gmt", "expire_time_gmt" 'convert unix timestamp fields to datetime
results(r, c) = Epoch2Date(item(key))
Case Else
results(r, c) = item(key)
End Select
Next
Next
With wsOutput
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
End Sub
Public Function GetOnlyQualifyingUrlsDates(ByVal urls As Variant) As Variant
Dim i As Long, output(), counter As Long
ReDim output(1 To UBound(urls))
For i = LBound(urls) To UBound(urls)
If IncludeThisDate(urls(i)) Then 'check if weekday is to be included
counter = counter + 1
output(counter) = Format$(Right$(urls(i), 8), "yyyymmdd") 'if to include then add to output array of urls of interest
End If
Next
ReDim Preserve output(1 To counter)
GetOnlyQualifyingUrlsDates = output
End Function
Public Function IncludeThisDate(ByVal url As String) As Boolean
'tue, wed, thurs are valid
IncludeThisDate = Not IsError(Application.Match(Weekday(Right$(url, 8), vbSunday), Array(3, 4, 5)))
End Function
Public Function SheetExists(ByVal sheetName As String) As Boolean '<== function by @Rory
SheetExists = Evaluate("ISREF('" & sheetName & "'!A1)")
End Function
Public Function GetAllUrls(ByVal startRow As Long, ByVal ws As Worksheet, ByVal columnName As String) As Variant
'transpose used based on premise no more than a couple of years of dates
'startRow is start row for urls, ws is sheet where urls found, columnName is string representation of column for urls e.g. "A"
With ws
GetAllUrls = Application.Transpose(.Range(columnName & startRow & ":" & columnName & .Cells(.rows.Count, columnName).End(xlUp).Row).Value)
End With
End Function
Public Function CreateWorksheet(ByVal sheetName As String) As Worksheet
Dim ws As Worksheet
If SheetExists(sheetName) Then
Set ws = ThisWorkbook.Worksheets(sheetName)
'do something.... clear it? Then add new data to it?
Else
Set ws = ThisWorkbook.Worksheets.Add
ws.Name = sheetName
End If
Set CreateWorksheet = ws
End Function
Public Function Epoch2Date(ByVal E As Currency, Optional msFrac) As Date '# Schmidt http://www.vbforums.com/showthread.php?805245-EPOCH-to-Date-and-vice-versa
Const Estart As Double = #1/1/1970#
msFrac = 0
If E > 10000000000# Then E = E * 0.001: msFrac = E - Int(E)
Epoch2Date = Estart + (E - msFrac) / 86400
End Function
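As a quick sanity check of the epoch conversion (1483401600 is midnight UTC on 2017-01-03, the date in the example URL):
Sub test_epoch()
Debug.Print Epoch2Date(1483401600) ' prints 2017-01-03 (shown in your locale's date format)
End Sub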

Inefficient UDF with Internet Explorer

The below UDF opens IE and returns the currency conversion rate from USD to the input (another currency ticker, e.g. EUR, GBP, HKD, etc.). For instance, if the input was ConvertUSD("USD"), the output would be 1 since 1 USD = 1 USD.
Using the function once is fine; the issue I'm having is related to the way I intend to use it. I need to build a table with currency tickers spanning Col A (known values, which will be text). Col B will then show the corresponding row's conversion rate. I intend to set B2 = ConvertUSD(A2) and then drag this down to the bottom row (roughly 48 currencies, so the ending row is B49). When I do this, 48 IE windows will be opened and closed, which is not ideal, but I am unsure how to avoid this.
How to create this table with just one instance of IE being opened?
Public Function ConvertUSD(ConvertWhat As String) As Double
'References
' Microsoft XML, vs.0
' Microsoft Internet Controls
' Microsoft HTML Object Library.
Dim IE As New InternetExplorer
'IE.Visible = True
IE.Navigate "https://www.oanda.com/currency/converter?quote_currency=USD&base_currency=" & ConvertWhat
Do
DoEvents
Loop Until IE.ReadyState = ReadyState_Complete
Dim Doc As HTMLDocument
Set Doc = IE.Document
Dim Ans As String
Ans = Trim(Doc.getElementsByTagName("tbody")(2).innerText)
Dim AnsExtract As Variant
AnsExtract = Split(Ans, " ")
ConvertUSD = AnsExtract(4)
IE.Quit
End Function
I think a more efficient method would be to use one of the sites that provide API access to this kind of data. There are a number of both free and paid sites available. The routine below (which makes use of a free API) will download 170 foreign currency rates and write them to a worksheet in a fraction of a second, and it does not open ANY IE windows. For this download, I have specified USD as the base currency, but you can specify any base.
The output from the website is JSON, so a JSON parser will be of value. I used the free one available at:
VBA-JSON v2.2.3
(c) Tim Hall - https://github.com/VBA-tools/VBA-JSON
but there are others that run in VBA. Or you can write your own.
This also requires a reference to be set to Microsoft winHTTP Services, Version 5.1 (or you could use late binding)
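If you prefer late binding (no reference needed), the only change is how the request object is declared and created; a sketch:
Dim httpRequest As Object
Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5.1")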
Option Explicit
Sub latestForex()
Const app_id As String = "your_own_api_key"
Const sURL1 As String = "https://openexchangerates.org/api/latest.json"
Const sURL2 As String = "?app_id="
Const sURL3 As String = "&base=USD"
Dim sURL As String
Dim vRes As Variant, wsRes As Worksheet, rRes As Range
Dim v, w, i As Long
Dim httpRequest As WinHttpRequest
Dim strJSON As String, JSON As Object
sURL = sURL1 & sURL2 & app_id & sURL3
Set httpRequest = New WinHttpRequest
With httpRequest
.Open "Get", sURL
.send
.WaitForResponse
strJSON = .responseText
End With
Set httpRequest = Nothing
Set JSON = ParseJson(strJSON)
i = 0
ReDim vRes(0 To JSON("rates").Count, 1 To 2)
Set wsRes = Worksheets("sheet3")
Set rRes = wsRes.Cells(1, 1)
vRes(0, 1) = (JSON("timestamp") / 86400) + #1/1/1970# 'UTC time
vRes(0, 2) = JSON("base")
For Each v In JSON("rates")
i = i + 1
vRes(i, 1) = v
vRes(i, 2) = JSON("rates")(v)
Next v
Set rRes = rRes.Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))
With rRes
.EntireColumn.Clear
.Value2 = vRes
.Cells(1, 1).NumberFormat = "dd-mmm-yyyy hh:mm"
.Columns(2).NumberFormat = "$0.0000"
.EntireColumn.AutoFit
End With
End Sub
A screenshot in the original answer shows a portion of the results.
Note that the time stamp is UTC. Obviously you can change that to local time.
Don't use a UDF. Just use a sub/macro to refresh the whole list on demand.
Do it like this:
Sub RefreshCurrencyRates()
' Run this sub as a macro. Use a keyboard shortcut or a button to invoke it.
' You can even add a call to the sub in the Workbook_Open event if you like.
' This sub assumes that the relevant sheet is the active sheet. This will always be the case if you use a
' button placed on the sheet itself. Otherwise, you might want to add further code to specify the sheet.
'
' Best practice:
With Application
.ScreenUpdating = False
.Calculation = xlCalculationManual
.DisplayAlerts = False
.EnableEvents = False
End With
'
' The first thing you need to do is specify the range of rows which contain your currency codes.
' I'm hard-coding this here, but you can change it.
' As a first example, let's assume that you have the following currencies in cells A1-A4:
' A1 = GBP
' A2 = EUR
' A3 = HKD
' A4 = JPY
'
' So with rows 1-4, we'll do the following:
Dim RowNum As Long, CurCode As String
' Set up our Internet Explorer:
Dim IE As InternetExplorer
Set IE = New InternetExplorer
'
For RowNum = 1 To 4
CurCode = Cells(RowNum, 1).Value ' Takes the currency code from column A in each row
Cells(RowNum, 2).Value = ConvertUSD(CurCode, IE) ' Gets the relevant conversion and enters it into column B
Next RowNum
' Cleardown
IE.Quit
Set IE = Nothing
With Application
.ScreenUpdating = True
.Calculation = xlCalculationAutomatic
.DisplayAlerts = True
.EnableEvents = True
End With
End Sub
Public Function ConvertUSD(ByVal ConvertWhat As String, IE As InternetExplorer) As Double
'References
' Microsoft XML, vs.0
' Microsoft Internet Controls
' Microsoft HTML Object Library.
IE.Navigate "https://www.oanda.com/currency/converter?quote_currency=USD&base_currency=" & ConvertWhat
Do
DoEvents
Loop Until IE.ReadyState = ReadyState_Complete
Dim Doc As HTMLDocument
Set Doc = IE.Document
Dim Ans As String
Ans = Trim(Doc.getElementsByTagName("tbody")(2).innerText)
Dim AnsExtract As Variant
AnsExtract = Split(Ans, " ")
ConvertUSD = AnsExtract(4)
End Function

Problems calling Function with Argument (Need to pass File name from one function to another) Excel VBA

I am trying to integrate 2 functions.
I have one sub which loops through all the files one by one.
Once it has identified a file name, it should call the function to run on the opened file.
I cannot seem to find a way to pass this on.
I did some reading on calling functions with arguments, but when I try this I get a "Compile error: Expected: list separator or )".
Can you please point me in the right direction?
I have posted the code below:
Option Explicit
Option Base 1
Public Const DATASHEET As String = "MDFDATA"
Public Const TABLECONVERSIONSHEET As String = "TABLECONVERSION"
Public Const OPTIONSSHEET As String = "OPTIONS"
Public Const FinalSheet As String = "Final Sheet"
Public lByte_Order As Long 'byte order
Public lData_Groups As Long 'number of data groups
Public lChannel_Groups As Long 'number of channel groups
Public lChannels As Long 'number of channels
Public lTable_offset As Long 'row offset for the conversion table sheet
Sub OpenFiles()
Dim MyFolder As String
Dim MyFile As String
Dim sFile_Name As String 'MDF file name
Dim lFile_Number As Long 'file number
MyFolder = "C:\Users\Documents\Test"
sFile_Name = Dir(MyFolder & "\*.dat")
Do While sFile_Name <> ""
lFile_Number = FreeFile
Open sFile_Name For Binary Access Read Shared As lFile_Number
Call PARSE_MDF
Loop
End Sub
'==================================================================================================
' PARSE_MDF
' Main function
' Returns True if successful
'==================================================================================================
Function PARSE_MDF() As Boolean
Dim sFile_Name As String 'MDF file name
Dim lFile_Number As Long 'file number
Dim lData_Groups_Counter As Long 'data groups counter
Dim lChannel_Groups_Counter As Long 'channel groups counter
Dim lChannels_Counter As Long 'channels counter
Dim lRecords As Long 'number of records in data block
Dim lRecord_Length As Long 'length of record in data block
Dim lData_Group_Address As Long 'data group address
Dim lData_Address As Long 'data address
Dim lChannel_Group_Address As Long 'Channel group address
Dim lChannel_Address As Long 'Channel address
Dim byCol As Byte 'column counter for output
Dim wsData_Sheet As Worksheet 'main worksheet
Dim wsTable_Conversion_Sheet As Worksheet
Dim rFirst_Signal As Range 'first signal in channel group
Dim rLast_Signal As Range 'last signal in channel group
Dim rSignals As Range 'range of signal names for a channel group
Application.EnableEvents = False
lTable_offset = 0
Set wsTable_Conversion_Sheet = Workbooks(ActiveWorkbook.Name).Worksheets(TABLECONVERSIONSHEET)
Set wsData_Sheet = Workbooks(ActiveWorkbook.Name).Worksheets(DATASHEET)
'file selected
If sFile_Name <> "False" Then
'clear old data
wsData_Sheet.Columns.Clear
wsTable_Conversion_Sheet.Columns.ClearContents
'set headers
wsData_Sheet.Cells(1, 1).Value = "Signal name"
wsData_Sheet.Cells(2, 1).Value = "Data type"
wsData_Sheet.Cells(3, 1).Value = "Lsb"
wsData_Sheet.Cells(4, 1).Value = "Offset"
wsData_Sheet.Cells(5, 1).Value = "Bit length"
wsData_Sheet.Cells(6, 1).Value = "Formula ID"
wsData_Sheet.Cells(7, 1).Value = "Formula"
wsData_Sheet.Cells(8, 1).Value = "First Bit position"
wsData_Sheet.Cells(9, 1).Value = "Table length"
wsData_Sheet.Cells(10, 1).Value = "Start Row"
'offset columns because of headers
byCol = 2
'get file number
lFile_Number = FreeFile
'open file
'check file integrity
If IDBLOCK(lFile_Number) Then
'check data exists
If HDBLOCK(lFile_Number, lData_Group_Address) Then
'main iteration for data groups
For lData_Groups_Counter = 1 To lData_Groups
'check channel group exists
If DGBLOCK(lFile_Number, lData_Group_Address, lChannel_Group_Address, lData_Address) Then
'channel group iteration
For lChannel_Groups_Counter = 1 To lChannel_Groups
'get channel group data
Call CGBLOCK(lFile_Number, lChannel_Group_Address, lChannel_Address, lRecord_Length, lRecords)
'set the first signal range in this channel group
Set rFirst_Signal = wsData_Sheet.Cells(1, byCol)
'channels iteration
For lChannels_Counter = 1 To lChannels
'get channel data for each channel
Call CNBLOCK(lFile_Number, lChannel_Address, wsData_Sheet, byCol)
'excel fudge
If byCol <> 255 Then
byCol = byCol + 1
End If
Next 'lChannels_Counter
'set the last signal range in this channel group
Set rLast_Signal = wsData_Sheet.Cells(1, byCol - 1)
'format divider columns
wsData_Sheet.Columns(byCol).ColumnWidth = 5
wsData_Sheet.Columns(byCol).Interior.ColorIndex = 0
wsData_Sheet.Columns(byCol).Interior.Pattern = xlLightUp
wsData_Sheet.Columns(byCol).Interior.PatternColorIndex = xlAutomatic
'excel fudge
If byCol <> 255 Then
'for space between channels
byCol = byCol + 1
End If
Next 'lChannel_Groups_Counter
'get range of signals to get data for
Set rSignals = wsData_Sheet.Range(rFirst_Signal, rLast_Signal)
'get signal data
'no channel data in this data group
Else
PARSE_MDF = False
End If
Next 'lData_Groups_Counter
'no data in MDF file
Else
PARSE_MDF = False
End If
'not a MDF file
Else
PARSE_MDF = False
End If
'close file
Close #lFile_Number
'tidy up sheet
wsData_Sheet.Rows.EntireRow.AutoFit
wsData_Sheet.Columns.EntireColumn.AutoFit
wsData_Sheet.Rows("2:15").EntireRow.Delete
wsData_Sheet.Columns("A:A").EntireColumn.Delete
wsData_Sheet.Cells.HorizontalAlignment = xlCenter
'function ends normally
PARSE_MDF = True
'no file was selected
Else
PARSE_MDF = False
End If
Application.EnableEvents = True
End Function
What you need to do is pass the found file name as an argument to the function. Your function does not currently have a parameter to receive it, so first of all create one, for example:
Function PARSE_MDF(ByVal myFilePath As String) As Boolean
Then you need to change your loop to call the function correctly, for example:
Do While sFile_Name <> ""
....
parseSucceeded = PARSE_MDF(sFile_Name)
.... 'does your function need to return a value and be tested?
sFile_Name = Dir() 'Call Dir again without a parameter to skip to the next found file
Loop
Otherwise I haven't checked your code, but this should get you started.
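Putting those pieces together, a minimal sketch of the reworked OpenFiles loop could look like this (it assumes PARSE_MDF now takes the full path and opens and closes the file itself):
Sub OpenFiles()
Dim MyFolder As String
Dim sFile_Name As String
Dim parseSucceeded As Boolean
MyFolder = "C:\Users\Documents\Test"
sFile_Name = Dir(MyFolder & "\*.dat")
Do While sFile_Name <> ""
parseSucceeded = PARSE_MDF(MyFolder & "\" & sFile_Name) 'pass the full path of the found file
If Not parseSucceeded Then Debug.Print "Failed to parse: " & sFile_Name
sFile_Name = Dir() 'advance to the next .dat file
Loop
End Sub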

Resources