I haven't seen this addressed yet, but I think that might be because I don't know how to phrase my problem concisely. Here's an example of what I'd like to try and do:
Given a column which holds state initials check output sheet if that state has been found before. If it hasn't then populate a new cell with that state's initials and initialize the count (number of times state has been found) to one. If the state's initials are found in a cell within the output sheet then increment the count by one.
With this, if we have a 50,000 (or however many) lined excel sheet that has states in random order (states may or may not be repeated) we will be able to create a clean table which outputs which states are in the raw data sheet and how many times they appeared. Another way to think about this is coding a pivot table, but with less information.
There's a couple of ways that I've thought about how to complete this, I personally think none of these are very good ideas but we'll see.
Algorithm 1, all 50 states:
Create 50 string variables for each state, create 50 long variables for the counts
Loop through raw data sheet, if specific state found then increment appropriate count (this would require 50 if-else statements)
Output results
Overall..... terrible idea
Algorithm 2, flip-flop:
Don't create any variables
If a state is found in raw data sheet , look in output sheet to check if state has been found before
If state has been found before, increment cell adjacent by one
If state has not been found before, change next available blank cell to state initials and initialize cell adjacent to one
Go back to raw data sheet
Overall..... this could work, but I feel as if it would take forever, even with raw data sheets that aren't very big but it has the benefit of not wasting memory like the 50 states algorithm and less lines of code
On a side note, is it possible to access a workbook's (or worksheet's) cells without activating that workbook? I ask because it would make the second algorithm run much quicker.
Thank you,
Jesse Smothermon
A couple of point that will speed up your code:
You don't need to active workbooks, worksheets or ranges to access them
eg
DIM wb as workbook
DIM ws as worksheet
DIM rng as range
Set wb = Workbooks.OpenText(Filename:=filePath, Tab:=True) ' or Workbooks("BookName")
Set ws = wb.Sheets("SheetName")
Set rng = ws.UsedRange ' or ws.[A1:B2], or many other ways of specifying a range
You can now refer to the workbook/sheet/range like
rng.copy
for each cl in rng.cells
etc
Looping through cells is very slow. Much faster to copy the data to a variant array first, then loop through the array. Also, when creating a large amount of data on a sheet, better to create it in a variant array first then copy it to the sheet in one go.
DIM v As Variant
v = rng
eg if rng refers to a range 10 rows by 5 columns, v becomes an array of dim 1 to 10, 1 to 5. The 5 minutes you mention would probably be reduced to seconds at most
Sub CountStates()
Dim shtRaw As Excel.Worksheet
Dim r As Long, nr As Long
Dim dict As Object
Dim vals, t, k
Set dict = CreateObject("scripting.dictionary")
Set shtRaw = ThisWorkbook.Sheets("Raw")
vals = Range(shtRaw.Range("C2"), _
shtRaw.Cells(shtRaw.Rows.Count, "C").End(xlUp)).Value
nr = UBound(vals, 1)
For r = 1 To nr
t = Trim(vals(r, 1))
If Len(t) = 0 Then t = "Empty"
dict(t) = dict(t) + 1
Next r
For Each k In dict.keys
Debug.Print k, dict(k)
Next k
End Sub
I implemented my second algorithm to see how it would work. The code is below, I did leave out little details in the actual problem to try and be more clear and get to the core problem, sorry about that. With the code below I've added the other "parts".
Code:
' this number refers to the raw data sheet that has just been activated
totalRow = ActiveSheet.Range("A1").End(xlDown).Row
For iRow = 2 To totalRow
' These are specific to the company needs, refers to addresses
If (ActiveSheet.Cells(iRow, 2) = "BA") Then
badAddress = badAddress + 1
ElseIf (ActiveSheet.Cells(iRow, 2) = "C") Then
coverageNoListing = coverageNoListing + 1
ElseIf (ActiveSheet.Cells(iRow, 2) = "L") Then
activeListing = activeListing + 1
ElseIf (ActiveSheet.Cells(iRow, 2) = "NC") Then
noCoverageNoListing = noCoverageNoListing + 1
ElseIf (ActiveSheet.Cells(iRow, 2) = "NL") Then
inactiveListing = inactiveListing + 1
ElseIf (ActiveSheet.Cells(iRow, 2) = "") Then
noHit = noHit + 1
End If
' Algorithm beginning
' If the current cell (in state column) has something in it
If (ActiveSheet.Cells(iRow, 10) <> "") Then
' Save value into a string variable
tempState = ActiveSheet.Cells(iRow, 10)
' If this is also in a billable address make variable true
If (ActiveSheet.Cells(iRow, 2) = "C") Or (ActiveSheet.Cells(iRow, 2) = "L") Or (ActiveSheet.Cells(iRow, 2) = "NL") Then
boolStateBillable = True
End If
' Output sheet
BillableWorkbook.Activate
For tRow = 2 To endOfState
' If the current cell is the state
If (ActiveSheet.Cells(tRow, 9) = tempState) Then
' Get the current hit count of that state
tempStateTotal = ActiveSheet.Cells(tRow, 12)
' Increment the hit count by one
ActiveSheet.Cells(tRow, 12) = tempStateTotal + 1
' If the address was billable then increment billable count
If (boolStateBillable = True) Then
tempStateBillable = ActiveSheet.Cells(tRow, 11)
ActiveSheet.Cells(tRow, 11) = tempStateBillable + 1
End If
Exit For
' If the tempState is unique to the column
ElseIf (tRow = endOfState) Then
' Set state, totalCount
ActiveSheet.Cells(tRow - 1, 9) = tempState
ActiveSheet.Cells(tRow - 1, 12) = 1
' Increment the ending point of the column
endOfState = endOfState + 1
' If it's billable, indicate with number
If (boolStateBillable = True) Then
tempStateBillable = ActiveSheet.Cells(tRow - 1, 11)
ActiveSheet.Cells(tRow - 1, 11) = tempStateBillable + 1
End If
End If
Next
' Activate raw data workbook
TextFileWorkbook.Activate
' reset boolean
boolStateBillable = False
Next
I ran it once and it seems to have worked. The problem is that it took roughly five minutes or so, the original code takes 0.2 (rough guess). I think the only way to make the code perform quicker is to somehow be able to not activate the two workbooks over and over. This means that the answer is not complete but I will edit if I figure out the rest.
Note I will revisit pivot tables to see if I can do everything that I need to in them, as of now it looks like there are a couple of things that I won't be able to change but I'll check
Thank you,
Jesse Smothermon
I kept with the second algorithm. There is the dictionary option that I forgot but I'm still not very comfortable with how it works and I generally don't understand it quite yet. I played with the code for a bit and changed some thing up, it now works faster.
Code:
' In output workbook (separate sheet)
Sheets.Add.Name = "Temp_Text_File"
' Opens up raw data workbook (originally text file
Application.DisplayAlerts = False
Workbooks.OpenText Filename:=filePath, Tab:=True
Application.DisplayAlerts = True
Set TextFileWorkbook = ActiveWorkbook
totalRow = ActiveSheet.Range("A1").End(xlDown).Row
' Copy all contents of raw data workbook
Cells.Select
Selection.Copy
BillableWorkbook.Activate
' Paste raw data into "Temp_Text_File" sheet
Range("A1").Select
ActiveSheet.Paste
ActiveWorkbook.Sheets("Billable_PDF").Select
' Populate long variables
For iRow = 2 To totalRow
If (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "BA") Then
badAddress = badAddress + 1
ElseIf (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "C") Then
coverageNoListing = coverageNoListing + 1
ElseIf (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "L") Then
activeListing = activeListing + 1
ElseIf (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "NC") Then
noCoverageNoListing = noCoverageNoListing + 1
ElseIf (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "NL") Then
inactiveListing = inactiveListing + 1
ElseIf (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "") Then
noHit = noHit + 1
End If
If (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 10) <> "") Then
tempState = ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 10)
If (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "C") Or (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "L") Or (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "NL") Then
boolStateBillable = True
End If
'BillableWorkbook.Activate
For tRow = 2 To endOfState
If (ActiveSheet.Cells(tRow, 9) = tempState) Then
tempStateTotal = ActiveSheet.Cells(tRow, 12)
ActiveSheet.Cells(tRow, 12) = tempStateTotal + 1
If (boolStateBillable = True) Then
tempStateBillable = ActiveSheet.Cells(tRow, 11)
ActiveSheet.Cells(tRow, 11) = tempStateBillable + 1
End If
Exit For
ElseIf (tRow = endOfState) Then
ActiveSheet.Cells(tRow, 9) = tempState
ActiveSheet.Cells(tRow, 12) = 1
endOfState = endOfState + 1
If (boolStateBillable = True) Then
tempStateBillable = ActiveSheet.Cells(tRow, 11)
ActiveSheet.Cells(tRow, 11) = tempStateBillable + 1
End If
End If
Next
'stateOneTotal = stateOneTotal + 1
'If (ActiveSheet.Cells(iRow, 2) = "C") Or (ActiveSheet.Cells(iRow, 2) = "L") Or (ActiveSheet.Cells(iRow, 2) = "NL") Then
' stateOneBillable = stateOneBillable + 1
'End If
'ElseIf (ActiveSheet.Cells(iRow, 10) = "FL") Then
'stateTwoTotal = stateTwoTotal + 1
'If (ActiveSheet.Cells(iRow, 2) = "C") Or (ActiveSheet.Cells(iRow, 2) = "L") Or (ActiveSheet.Cells(iRow, 2) = "NL") Then
' stateTwoBillable = stateTwoBillable + 1
'End If
End If
'TextFileWorkbook.Activate
If (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "C") Or (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "L") Or (ActiveWorkbook.Sheets("Temp_Text_File").Cells(iRow, 2) = "NL") Then
billableCount = billableCount + 1
End If
boolStateBillable = False
Next
' Close raw data workbook and raw data worksheet
Application.DisplayAlerts = False
TextFileWorkbook.Close
ActiveWorkbook.Sheets("Temp_Text_File").Delete
Application.DisplayAlerts = True
Thank you for the comments and suggestions. It is very much appreciated as always.
Jesse Smothermon
Related
I am working on an excel problem that I feel will have to be based off duplicates. Basically if duplicate values are found ("A:A") then somehow group them as a variable and only populate the matching rows if atleast 1 negative number exists in ("B:B"). The same will apply to non duplicates as well where they should only populate if a negative number exists in Column B but I feel that can be done easily via formula
I have tried a few things but the main problem is identifying the duplicates as their own variable. When I create a function that behaves purely off highlighted duplicates,this would apply to all duplicates regardless of their individual text. This would be much simpler if summed but that is not the case. Below is an example of what I am going for
Problem:
IDs Trades
US9128 -500
US9128 750
EU9133 900
GD2104 -300
GD2104 150
FG5454 200
Expected:
IDs Trades
US9128 -500
US9128 750
GD2104 -300
GD2104 150
Open to other routes to this problem
Suppose your data is in column(A:B),Starting form row(1)
Try this Macro
Option Explicit
Sub test_me()
Dim obj As Object
Dim x, k%
Dim R%, C%
R = 2: C = 4
Dim lr%: lr = Cells(Rows.Count, 1).End(3).Row
Dim i%, j%
Range("d2").CurrentRegion.ClearContents
Set obj = CreateObject("System.Collections.SortedList")
For i = 2 To lr
obj.Add Cells(i, 2).Value, Cells(i, 1).Value
For j = i + 1 To lr
If Cells(j, 1) = Cells(i, 1) Then
obj.Add Cells(j, 2).Value, Cells(j, 1).Value
End If
Next j
x = obj.Count
If x = 1 Then GoTo NEXT_I
With Cells(R, C)
.Value = obj.GetByIndex(0): .Offset(, 1) = obj.Getkey(0)
.Offset(1) = obj.GetByIndex(x - 1): .Offset(1, 1) = obj.Getkey(x - 1)
End With
R = R + 2
NEXT_I:
obj.Clear
Next i
Set obj = Nothing
End Sub
I have a code that searches an excel fill for the word distance, takes the value of the cell next to it, paste it into a new cell, then sums all the cells up. Which works great, but i now need to find a way to only sum ever even row number. Does that make sense?
Sub Distance_Check()
Dim DistanceCheck As String
Dim DistanceNumber As String
Dim DistanceSum As String
Dim DistanceTotal As String
DistanceCheck = MsgBox("Would you like to check the distance?", vbYesNo)
If DistanceCheck = vbYes Then
If IsArray(fname) Then Workbooks.OpenText fname(1)
i = 1
findStr = "Distance"
Set foundCel = Range("A:A").Find(what:=findStr)
If Not foundCel Is Nothing Then
firstAddress = foundCel.Address
Do
Range("J" & i).Value = foundCel.Offset(0, 1).Value
Set foundCel = Range("A:A").FindNext(foundCel)
i = i + 1
Loop While Not foundCel Is Nothing And foundCel.Address <> firstAddress
End If
Set wkbSourceBook = ActiveWorkbook
DistanceNumber = i - 2
DistanceSum = WorksheetFunction.Sum(Range(Cells(2, 15), (Cells(DistanceNumber + 1, 15))))
DistanceTotal = DistanceSum / DistanceNumber
If DistanceNumber = Cells(2, 12) Then
MsgBox ("No error found wihin distance")
Else
MsgBox ("Error found with distance")
End If
Else
End If
Call Save_Data
End Sub
Would the way youd go about this be using a for loop on the
cells(DistanceNumber(j,+1)
Where j = 0,
j = j +2 ,
Until j > DistanceNumber,
Would that work? If so how would you go about it?
Thanks
A quick way to step through a loop in the desired increments is to use the Mod operator which divides two numbers and returns any remainder (e.g. 7 mod 2 = 1, as two sixes fit into seven, leaving one).
You can use the row property of the range you identify with the Find method, and since you want to jump by two the modulo should be zero:
If foundcel.Row Mod 2 = 0 Then Range("J" & i).value = foundcel.Offset(0, 1).Value
That said, there is a 'built in' way to step through a loop if using a For loop like this
For x = 2 to 10 Step 2
' Do stuff
Next x
You can also use this method to step backwards, e.g.
For x = 100 to 0 Step -10
' Do stuff backwards!
Next x
I have a spreadsheet with a column filled with strings. There are sometimes several instances of one string (I've sorted it though so they're all grouped together) so I've created a macro that looks at the current row in the column, determines if its already been posted in the second worksheet, and if it has, it moves to the next row, if it hasn't it posts it and then moves on.
It works perfectly until the end where it completely ignores the last entry. The last entry is one of the instances where there's several of the same strings so the issue almost definitely isn't that its just ignoring the last row.
It's ignoring the last entry completely no matter how many times it repeats. New to vba so any help appreciated.
Sub RCFS()
Dim ProfCtr As String
Dim S2FreecellH As Long
Dim ProfCenCellH As Long
S2FreecellH = 3
ProfCenCellH = 2
ProfCtr = Cells(ProfCenCellH, 4)
Worksheets("Sheet2").Cells(S2FreecellH, 1).Value = ProfCtr
While IsEmpty(Cells(ProfCenCellH, 4).Value) = False
If Cells(ProfCenCellH, 4).Value <> ProfCtr Then
Worksheets("Sheet2").Cells(S2FreecellH, 1) = ProfCtr
S2FreecellH = S2FreecellH + 1
ProfCtr = Cells(ProfCenCellH, 4)
ProfCenCellH = ProfCenCellH + 1
Else
ProfCenCellH = ProfCenCellH + 1
End If
Wend
End Sub
Indeed, your loop needs to still flush the value of ProfCtr even when you reached the empty cell. You can make it work by looking ahead one row in your If condition (add + 1).
Also included a couple of other improvements:
Do While Not IsEmpty(Cells(ProfCenCellH, 4).Value)
If Cells(ProfCenCellH + 1, 4).Value <> ProfCtr Then
Worksheets("Sheet2").Cells(S2FreecellH, 1) = ProfCtr
S2FreecellH = S2FreecellH + 1
ProfCtr = Cells(ProfCenCellH, 4)
End If
ProfCenCellH = ProfCenCellH + 1
Loop
Let me see if I have this right
' Stop when you reach an empty cell
While IsEmpty(Cells(ProfCenCellH, 4).Value) = False
' If the cell's value matches the variable ProfCtr then
' ignore it and move to the next row.
If Cells(ProfCenCellH, 4).Value <> ProfCtr Then
' But if it -doesn't- match, then copy the variable
' ProfCtr to Sheet 2.
Worksheets("Sheet2").Cells(S2FreecellH, 1) = ProfCtr
' Increment your rows
S2FreecellH = S2FreecellH + 1
' Set the next bit of text to copy
ProfCtr = Cells(ProfCenCellH, 4)
ProfCenCellH = ProfCenCellH + 1
Else
ProfCenCellH = ProfCenCellH + 1
End If
Shouldn't you be setting Prof Ctr = Cells(ProfCenCelH, 4) BEFORE doing the copying over to Sheet2?
trincot has the correct answer. Here is an alternate technique, using built-in method, that will accomplish the same task.
With Worksheets("Sheet1")
.Range("D1", .Range("D" & .Rows.Count)).AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Sheet2.Range("A1"), Unique:=True
End With
I'm a beginner at using visual basic, and I've run into a little problem. What I want to do is take an excel spreadsheet and search through a specific column in order to find a name, then grab everything in that row in order to transfer it over to another sheet. I've got everything running except that I run into an endless loop for a reason I do not understand.
'set up a for loop that increments through all sheets in the workbook
For i = 1 To ThisWorkbook.Sheets.Count
'set up a temp page to work with the current page
Set tem = ThisWorkbook.Sheets(i)
'increment through all the rows that have data in them
For Each rng In tem.Rows
'if the data matches what was searched for, copy it into another worksheet
If tem.Cells(ct, 4) = SForm.Text Then
sr.Cells(spot, 1) = tem.Cells(ct, 1)
sr.Cells(spot, 2) = tem.Cells(ct, 2)
sr.Cells(spot, 3) = tem.Cells(ct, 3)
sr.Cells(spot, 4) = tem.Cells(ct, 4)
sr.Cells(spot, 5) = tem.Cells(ct, 5)
sr.Cells(spot, 6) = tem.Cells(ct, 6)
sr.Cells(spot, 7) = tem.Cells(ct, 7)
sr.Cells(spot, 8) = tem.Cells(ct, 8)
sr.Cells(spot, 9) = tem.Cells(ct, 9)
sr.Cells(spot, 10) = tem.Cells(ct, 10)
sr.Cells(spot, 11) = tem.Cells(ct, 11)
sr.Cells(spot, 12) = tem.Cells(ct, 12)
'increment the placeholder for the new sheet
spot = spot + 1
End If
'increase ct to keep track of where in the worksheet it is
ct = ct + 1
Next rng
'reset ct for the next worksheet
ct = 1
Next i
The Specific issue I run into, is an overflow because ct is an int and maxes out. which tells me I have an endless loop on my hands.
Thanks for the help in advance.
You are iterating over 1 million times per sheet, this is going to take a while. Find the last cell in column D and only iterate through those rows
Plus you do not need ct. instead use rng.row in its place.
Then combine the whole value assignation into one line.
For i = 1 To ThisWorkbook.Sheets.Count
'set up a temp page to work with the current page
Set tem = ThisWorkbook.Sheets(i)
'increment through all the rows that have data in them
For Each rng In tem.Range("D1", tem.Cells(tem.Rows.Count, 4).End(xlUp))
'if the data matches what was searched for, copy it into another worksheet
If rng.Value = SForm.Text Then
sr.Range(sr.Cells(spot, 1), sr.Cells(spot, 12)).Value = tem.Range(tem.Cells(rng.Row, 1), tem.Cells(rng.Row, 12)).Value
'increment the placeholder for the new sheet
spot = spot + 1
End If
Next rng
Next i
To make it even quicker look into dropping the whole range into an array and output into another array to minimize the interaction between the sheets and vba.
First time poster and new to programming in general. I have a project in which i have to build a financial model to mine for data in excel. I have succeeded in building said model on VBA. I have ran tests on 3,000 line dataset and it was successful. I will briefly explain what it does.
I track a given stock on a given day on multiple exchanges. I download the data (roughly 935,000 lines) The first step is copy all the data for a given exchange (roughly 290,000) onto a new sheet (this takes roughly 8min), then I create a new column to log the bid ask spread (12secs), the next step is what Im having trouble with, I basically rank each line of data twice, one column for Bid size and one column for Ask size. I created a function which uses excel Percentile function and ranks based on where the given bid and ask size lands. As of right now, I have been running the Macro for the last 35min and has yet to execute. I cant attempt the other macros since each macro depends on the previous one.
So my basic issue is that since my data set is large, my model keeps crashing.The code seems to be fine when working with the test data, and it doesn't throw any errors when I run the program, but with the larger data set it just crashes. Does anyone have any suggestions? Is this normal with such large amounts of data?
Thanks in advance.
Sham
Here is the sub and function thats giving me the trouble, the sub takes in the required inputs to run the function and then pops into the assigned cell. The code is suppose to repeat the process for three separate sheets. For now, Id like it to work on one sheet, hence used the comments to not include the loop
Sub Bucketting()
Dim firstRow As Long
Dim lastRow As Long
Dim counter As Long
Dim bidRange As Range
Dim offerRange As Range
Dim bidScroll As Range
Dim offerScroll As Range
Dim Ex As String
Dim i As Integer
'For i = 1 To 1 Step 1 'Sheet Selection Process
' If i = 1 Then
' Ex = "Z"
' ElseIf i = 2 Then
' Ex = "P"
' Else
' Ex = "T"
' End If
Sheets("Z").Select 'Sheet selected
With ActiveSheet
firstRow = .UsedRange.Cells(1).Row + 1
lastRow = .UsedRange.Rows.Count
Set bidRange = .Range("F2:F" & lastRow)
Set offerRange = .Range("G2:G" & lastRow)
For counter = lastRow To firstRow Step -1
Set bidScroll = .Range("F" & counter)
Set offerScroll = .Range("G" & counter)
With .Cells(counter, "J")
.Value = DECILE_RANK(bidRange, bidScroll)
End With
With .Cells(counter, "K")
.Value = DECILE_RANK(offerRange, offerScroll)
End With
Next counter
End With
Range("J1").Select
ActiveCell = "Bid Rank"
ActiveCell.Offset(0, 1) = "Offer Rank"
'Next i
End Sub
Function DECILE_RANK(DataRange, RefCell)
'Credit: BJRaid
'DECILE_RANK(The Range of data)
'Declares the function that can be called in the spreadsheet cell - enter '=DECILE_RANK(A5:A50,A5)
'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC1 = Application.WorksheetFunction.Percentile(DataRange, 0.1)
DEC2 = Application.WorksheetFunction.Percentile(DataRange, 0.2)
DEC3 = Application.WorksheetFunction.Percentile(DataRange, 0.3)
DEC4 = Application.WorksheetFunction.Percentile(DataRange, 0.4)
DEC5 = Application.WorksheetFunction.Percentile(DataRange, 0.5)
DEC6 = Application.WorksheetFunction.Percentile(DataRange, 0.6)
DEC7 = Application.WorksheetFunction.Percentile(DataRange, 0.7)
DEC8 = Application.WorksheetFunction.Percentile(DataRange, 0.8)
DEC9 = Application.WorksheetFunction.Percentile(DataRange, 0.9)
' Calculate the Decile rank that the reference cell value sits within
If (RefCell <= DEC1) Then DECILE_RANK = 1
If (RefCell > DEC1) And (RefCell <= DEC2) Then DECILE_RANK = 2
If (RefCell > DEC2) And (RefCell <= DEC3) Then DECILE_RANK = 3
If (RefCell > DEC3) And (RefCell <= DEC4) Then DECILE_RANK = 4
If (RefCell > DEC4) And (RefCell <= DEC5) Then DECILE_RANK = 5
If (RefCell > DEC5) And (RefCell <= DEC6) Then DECILE_RANK = 6
If (RefCell > DEC6) And (RefCell <= DEC7) Then DECILE_RANK = 7
If (RefCell > DEC7) And (RefCell <= DEC8) Then DECILE_RANK = 8
If (RefCell > DEC8) And (RefCell <= DEC9) Then DECILE_RANK = 9
If (RefCell > DEC9) Then DECILE_RANK = 10
End Function
935,000 lines is a lot for excel. Like, really a lot. Barring saying using a real database, If your application is literally putting a =Percentile(...) in each cell, I would recommend Trying to use another tool for that. Perhaps something within VBA itself. More generally, use something outside of a cell - then store the result value in the cell. There is a lot of overhead in maintaining those formulas that are interdependent on 935k rows of data.
The problem is that your looping through each row individually, the Excel way is to try and work with whole ranges at once whenever possible. I would load the ranges into arrays, then modify your DECILE_RANK code to work with the items in the array.
Note that variant arrays that read ranges in are 2-D.
Here is the fully functioning code including my custom VBA array slicer. Note that it was only tested on a small dataset:
Sub Bucketting()
Dim lastRow As Long
Dim bidArray As Variant
Dim offerArray As Variant
Sheets("Sheet1").Select 'Sheet selected
With ActiveSheet
lastRow = .UsedRange.Rows.Count + 1
bidArray = .Range("F2:F" & lastRow)
offerArray = .Range("G2:G" & lastRow)
Range("J2:J" & lastRow).Value = GetArraySlice2D(DECILE_RANK(bidArray), "column", 1, 1, 0)
Range("K2:K" & lastRow).Value = GetArraySlice2D(DECILE_RANK(offerArray), "column", 1, 1, 0)
End With
Range("J1").Select
ActiveCell = "Bid Rank"
ActiveCell.Offset(0, 1) = "Offer Rank"
End Sub
Function DECILE_RANK(DataRange As Variant) As Variant
' Credit: BJRaid
' DECILE_RANK(The Range of data)
' Declares the function that can be called in the spreadsheet cell - enter '=DECILE_RANK(A5:A50,A5)
Dim DEC(0 To 10) As Variant
Dim i As Integer, j As Integer
'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC(0) = 0
For i = 1 To 9
DEC(i) = Application.WorksheetFunction.Percentile(DataRange, 0.1 * i)
Next i
DEC(10) = Application.WorksheetFunction.Max(DataRange)
' Calculate the Decile rank that the reference cell value sits within
For i = 1 To UBound(DataRange, 1)
For j = 1 To 10
If ((DataRange(i, 1) > DEC(j - 1)) And (DataRange(i, 1) <= DEC(j))) Then
DataRange(i, 1) = j
Exit For
End If
Next j
Next i
DECILE_RANK = DataRange
End Function
Public Function GetArraySlice2D(Sarray As Variant, Stype As String, Sindex As Integer, Sstart As Integer, Sfinish As Integer) As Variant
' this function returns a slice of an array, Stype is either row or column
' Sstart is beginning of slice, Sfinish is end of slice (Sfinish = 0 means entire
' row or column is taken), Sindex is the row or column to be sliced (NOTE:
' 1 is always the first row or first column)
' an Sindex value of 0 means that the array is one dimensional 3/20/09 Lance Roberts
Dim vtemp() As Variant
Dim i As Integer
On Err GoTo ErrHandler
Select Case Sindex
Case 0
If Sfinish - Sstart = UBound(Sarray) - LBound(Sarray) Then
vtemp = Sarray
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(i + Sstart - 1)
Next i
End If
Case Else
Select Case Stype
Case "row"
If Sfinish = 0 Or (Sstart = LBound(Sarray, 2) And Sfinish = UBound(Sarray, 2)) Then
vtemp = Application.WorksheetFunction.Index(Sarray, Sindex, 0)
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(Sindex, i + Sstart - 1)
Next i
End If
Case "column"
If Sfinish = 0 Or (Sstart = LBound(Sarray, 1) And Sfinish = UBound(Sarray, 1)) Then
vtemp = Application.WorksheetFunction.Index(Sarray, 0, Sindex)
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(i + Sstart - 1, Sindex)
Next i
End If
End Select
End Select
GetArraySlice2D = vtemp
Exit Function
ErrHandler:
Dim M As Integer
M = MsgBox("Bad Array Input", vbOKOnly, "GetArraySlice2D")
End Function
I'm not sure if this will directly address your problem, but have you considered using Application.ScreenUpdating = False? Don't forget to set it back to true once your data has processed.