First time poster and new to programming in general. I have a project in which i have to build a financial model to mine for data in excel. I have succeeded in building said model on VBA. I have ran tests on 3,000 line dataset and it was successful. I will briefly explain what it does.
I track a given stock on a given day on multiple exchanges. I download the data (roughly 935,000 lines) The first step is copy all the data for a given exchange (roughly 290,000) onto a new sheet (this takes roughly 8min), then I create a new column to log the bid ask spread (12secs), the next step is what Im having trouble with, I basically rank each line of data twice, one column for Bid size and one column for Ask size. I created a function which uses excel Percentile function and ranks based on where the given bid and ask size lands. As of right now, I have been running the Macro for the last 35min and has yet to execute. I cant attempt the other macros since each macro depends on the previous one.
So my basic issue is that since my data set is large, my model keeps crashing.The code seems to be fine when working with the test data, and it doesn't throw any errors when I run the program, but with the larger data set it just crashes. Does anyone have any suggestions? Is this normal with such large amounts of data?
Thanks in advance.
Sham
Here is the sub and function thats giving me the trouble, the sub takes in the required inputs to run the function and then pops into the assigned cell. The code is suppose to repeat the process for three separate sheets. For now, Id like it to work on one sheet, hence used the comments to not include the loop
Sub Bucketting()
Dim firstRow As Long
Dim lastRow As Long
Dim counter As Long
Dim bidRange As Range
Dim offerRange As Range
Dim bidScroll As Range
Dim offerScroll As Range
Dim Ex As String
Dim i As Integer
'For i = 1 To 1 Step 1 'Sheet Selection Process
' If i = 1 Then
' Ex = "Z"
' ElseIf i = 2 Then
' Ex = "P"
' Else
' Ex = "T"
' End If
Sheets("Z").Select 'Sheet selected
With ActiveSheet
firstRow = .UsedRange.Cells(1).Row + 1
lastRow = .UsedRange.Rows.Count
Set bidRange = .Range("F2:F" & lastRow)
Set offerRange = .Range("G2:G" & lastRow)
For counter = lastRow To firstRow Step -1
Set bidScroll = .Range("F" & counter)
Set offerScroll = .Range("G" & counter)
With .Cells(counter, "J")
.Value = DECILE_RANK(bidRange, bidScroll)
End With
With .Cells(counter, "K")
.Value = DECILE_RANK(offerRange, offerScroll)
End With
Next counter
End With
Range("J1").Select
ActiveCell = "Bid Rank"
ActiveCell.Offset(0, 1) = "Offer Rank"
'Next i
End Sub
Function DECILE_RANK(DataRange, RefCell)
'Credit: BJRaid
'DECILE_RANK(The Range of data)
'Declares the function that can be called in the spreadsheet cell - enter '=DECILE_RANK(A5:A50,A5)
'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC1 = Application.WorksheetFunction.Percentile(DataRange, 0.1)
DEC2 = Application.WorksheetFunction.Percentile(DataRange, 0.2)
DEC3 = Application.WorksheetFunction.Percentile(DataRange, 0.3)
DEC4 = Application.WorksheetFunction.Percentile(DataRange, 0.4)
DEC5 = Application.WorksheetFunction.Percentile(DataRange, 0.5)
DEC6 = Application.WorksheetFunction.Percentile(DataRange, 0.6)
DEC7 = Application.WorksheetFunction.Percentile(DataRange, 0.7)
DEC8 = Application.WorksheetFunction.Percentile(DataRange, 0.8)
DEC9 = Application.WorksheetFunction.Percentile(DataRange, 0.9)
' Calculate the Decile rank that the reference cell value sits within
If (RefCell <= DEC1) Then DECILE_RANK = 1
If (RefCell > DEC1) And (RefCell <= DEC2) Then DECILE_RANK = 2
If (RefCell > DEC2) And (RefCell <= DEC3) Then DECILE_RANK = 3
If (RefCell > DEC3) And (RefCell <= DEC4) Then DECILE_RANK = 4
If (RefCell > DEC4) And (RefCell <= DEC5) Then DECILE_RANK = 5
If (RefCell > DEC5) And (RefCell <= DEC6) Then DECILE_RANK = 6
If (RefCell > DEC6) And (RefCell <= DEC7) Then DECILE_RANK = 7
If (RefCell > DEC7) And (RefCell <= DEC8) Then DECILE_RANK = 8
If (RefCell > DEC8) And (RefCell <= DEC9) Then DECILE_RANK = 9
If (RefCell > DEC9) Then DECILE_RANK = 10
End Function
935,000 lines is a lot for excel. Like, really a lot. Barring saying using a real database, If your application is literally putting a =Percentile(...) in each cell, I would recommend Trying to use another tool for that. Perhaps something within VBA itself. More generally, use something outside of a cell - then store the result value in the cell. There is a lot of overhead in maintaining those formulas that are interdependent on 935k rows of data.
The problem is that your looping through each row individually, the Excel way is to try and work with whole ranges at once whenever possible. I would load the ranges into arrays, then modify your DECILE_RANK code to work with the items in the array.
Note that variant arrays that read ranges in are 2-D.
Here is the fully functioning code including my custom VBA array slicer. Note that it was only tested on a small dataset:
Sub Bucketting()
Dim lastRow As Long
Dim bidArray As Variant
Dim offerArray As Variant
Sheets("Sheet1").Select 'Sheet selected
With ActiveSheet
lastRow = .UsedRange.Rows.Count + 1
bidArray = .Range("F2:F" & lastRow)
offerArray = .Range("G2:G" & lastRow)
Range("J2:J" & lastRow).Value = GetArraySlice2D(DECILE_RANK(bidArray), "column", 1, 1, 0)
Range("K2:K" & lastRow).Value = GetArraySlice2D(DECILE_RANK(offerArray), "column", 1, 1, 0)
End With
Range("J1").Select
ActiveCell = "Bid Rank"
ActiveCell.Offset(0, 1) = "Offer Rank"
End Sub
Function DECILE_RANK(DataRange As Variant) As Variant
' Credit: BJRaid
' DECILE_RANK(The Range of data)
' Declares the function that can be called in the spreadsheet cell - enter '=DECILE_RANK(A5:A50,A5)
Dim DEC(0 To 10) As Variant
Dim i As Integer, j As Integer
'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC(0) = 0
For i = 1 To 9
DEC(i) = Application.WorksheetFunction.Percentile(DataRange, 0.1 * i)
Next i
DEC(10) = Application.WorksheetFunction.Max(DataRange)
' Calculate the Decile rank that the reference cell value sits within
For i = 1 To UBound(DataRange, 1)
For j = 1 To 10
If ((DataRange(i, 1) > DEC(j - 1)) And (DataRange(i, 1) <= DEC(j))) Then
DataRange(i, 1) = j
Exit For
End If
Next j
Next i
DECILE_RANK = DataRange
End Function
Public Function GetArraySlice2D(Sarray As Variant, Stype As String, Sindex As Integer, Sstart As Integer, Sfinish As Integer) As Variant
' this function returns a slice of an array, Stype is either row or column
' Sstart is beginning of slice, Sfinish is end of slice (Sfinish = 0 means entire
' row or column is taken), Sindex is the row or column to be sliced (NOTE:
' 1 is always the first row or first column)
' an Sindex value of 0 means that the array is one dimensional 3/20/09 Lance Roberts
Dim vtemp() As Variant
Dim i As Integer
On Err GoTo ErrHandler
Select Case Sindex
Case 0
If Sfinish - Sstart = UBound(Sarray) - LBound(Sarray) Then
vtemp = Sarray
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(i + Sstart - 1)
Next i
End If
Case Else
Select Case Stype
Case "row"
If Sfinish = 0 Or (Sstart = LBound(Sarray, 2) And Sfinish = UBound(Sarray, 2)) Then
vtemp = Application.WorksheetFunction.Index(Sarray, Sindex, 0)
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(Sindex, i + Sstart - 1)
Next i
End If
Case "column"
If Sfinish = 0 Or (Sstart = LBound(Sarray, 1) And Sfinish = UBound(Sarray, 1)) Then
vtemp = Application.WorksheetFunction.Index(Sarray, 0, Sindex)
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(i + Sstart - 1, Sindex)
Next i
End If
End Select
End Select
GetArraySlice2D = vtemp
Exit Function
ErrHandler:
Dim M As Integer
M = MsgBox("Bad Array Input", vbOKOnly, "GetArraySlice2D")
End Function
I'm not sure if this will directly address your problem, but have you considered using Application.ScreenUpdating = False? Don't forget to set it back to true once your data has processed.
Related
I am new to VBA macro and need some experts help on meeting the below requirement.
I got a workbook containing 2 sheets called 'Data' and 'Stats'.
'Data' contains the values as below
'Stats' contains the values as below
On click on the button, I would like to do the below
Get the values in column A in 'Stats' sheet
Find all the matching rows in 'Data' Sheet
Find the smallest start time and put that in 'Stats' sheet against the stage value
Find the biggest end time and that in 'Stats' sheet against the stage value
Final output would be like below
Note: I do not have the MINIFS or MAXIFS in my installation.
Incase you dont have MINIFS and MAXIFS you can use array formulas like so:
={MIN(IF(Stats!A1=Data!$A$1:$A$1000,Data!$C$1:$C$1000))}
and
={MAX(IF(Stats!A1=Data!$A$1:$A$1000,Data!$B$1:$B$1000))}
The {} indicates, that this is a Array-Formula. Enter with Ctrl + Shift + Enter
No VBA needed.
Just use in your Stats worksheet the following formula for Start:
=MINIFS(Data!A:A,Data!C:C,Stats!A:A)
and the following for End:
=MAXIFS(Data!B:B,Data!C:C,Stats!A:A)
Please, the VBA solution, too. It will be very fast, using arrays, processing everything in memory and dropping the result at once:
Sub BringStats()
Dim shD As Worksheet, shS As Worksheet, lastRD As Long, lastRS As Long
Dim arrD, arrS, i As Long, k As Long, dict As Object, El As Variant
Set shD = Worksheets("Data")
Set shS = Worksheets("Stats")
lastRD = shD.Range("A" & rows.count).End(xlUp).row
lastRS = shS.Range("A" & rows.count).End(xlUp).row
arrD = shD.Range("A2:C" & lastRD).Value
arrS = shS.Range("A2:C" & lastRS).Value
Set dict = CreateObject("Scripting.dictionary")
'load the dictionary with unique keys and all corresponding date in a string, as item
For i = 1 To UBound(arrD)
If Not dict.Exists(arrD(i, 3)) Then
dict.Add arrD(i, 3), CDate(arrD(i, 1)) & ";" & CDate(arrD(i, 2))
Else
dict(arrD(i, 3)) = dict(arrD(i, 3)) & "|" & CDate(arrD(i, 1)) & ";" & CDate(arrD(i, 2))
End If
Next
Dim arr As Variant, minTime As Date, minPos As Long
For i = 1 To UBound(arrS)
If dict.Exists(arrS(i, 1)) Then
arr = Split(dict(arrS(i, 1)), "|") 'extract each pair of time stamps
If UBound(arr) > 0 Then
For Each El In arr 'extract the element containing minimum time
If minTime = 0 Then
minTime = TimeValue(Split(El, ";")(0)): minPos = k
Else
If TimeValue(Split(El, ";")(0)) < minTime Then minTime = TimeValue(Split(El, ";")(0)): minPos = k
End If
k = k + 1
Next
arrS(i, 2) = Split(arr(minPos), ";")(0): arrS(i, 3) = Split(arr(minPos), ";")(1) 'load the array with the minimum time correspondent values
Else
arrS(i, 2) = Split(dict(arrS(i, 1)), ";")(0): arrS(i, 3) = Split(dict(arrS(i, 1)), ";")(1)'loading the array in case of only one occurrence
End If
End If
minPos = 0: minTime = 0: k = 0 'reinitialize the used variables
Next i
'drop the processed array at once
shS.Range("A2").Resize(UBound(arrS), UBound(arrS, 2)).Value = arrS
End Sub
There can be a lot of the same 'stage' occurrences...
I have a dynamic population of tasks that I receive monthly, I want to assign these tasks to 6 groups for them to assess in an evenly distributed fashion. Each task has a ranking/priority, so if a group receives the number 1 task, I don't want to also give that same group the top 100 priorities. I want to apply a snaking/zigzag distribution.
This led me down a path to using the formula =MIN(MOD(ROW()-2,12),MOD(-ROW()+1,12)). I get the distribution I'm looking for, although at this stage I don't know how to account for any criteria that I need to add into my logic.
In the above image I am trying to take the groups in column F and apply them to Column D. Column E shows the example of the =MOD() formula and I could just use a lookup to replace the Mod values of 0-5 with my groups 1-6.
Where I hit a roadblock is in row 21, which is where I want to account for some criteria, or exceptions. I added a binary column A for visualization, but essentially, I want to say where column C (task location) = Loc4 to never assign a task to Group 4. In the instance where I don't want a task to be assigned to Group 4 at Loc 4, I then hope to skip Group 4 for a single assignment until it can be applied to the next possible ranked task. The simple solution is to remove all these occurrences at the end, but it really skews the even distribution I'm going for.
I tried to apply solver to this assignment, looking for the lowest std deviation, but I have too many data points.
This has led me to another post using some vba logic, which I really like the concept, but I cannot figure out how to modify it to account for some exceptions. enter link description here
Ideally I would love to use this concept of creating a single array of my groups, applying each group to a task as long as the criteria is met in this short list of task, write the groups into the list, reset and move down to the next subset of tasks. So, each time I select the next 6 tasks, they are assigned to one of my 6 groups, which will keep the distribution that I'm hoping for.
Here is the code from the user K.Davis post which I was trying to apply:
Sub assignEmployeeTasks()
Dim ws As Worksheet, i As Long
Set ws = ThisWorkbook.Worksheets(1)
Dim employeeList() As Variant
With ws
For i = 2 To lastRow(ws, 2)
If (Not employeeList) = -1 Then
'rebuild employeelist / array uninitialized
employeeList = buildOneDimArr(ws, "F", 2, lastRow(ws, "F"))
End If
.Cells(i, 4) = randomEmployee(employeeList)
Next
End With
End Sub
These are the "support" functions that allow your program to do it's job:
Function randomEmployee(ByRef employeeList As Variant) As String
'Random # that will determine the employee chosen
Dim Lotto As Long
Lotto = randomNumber(LBound(employeeList), UBound(employeeList))
randomEmployee = employeeList(Lotto)
'Remove the employee from the original array before returning it to the sub
Dim retArr() As Variant, i&, x&, numRem&
numRem = UBound(employeeList) - 1
If numRem = -1 Then 'array is empty
Erase employeeList
Exit Function
End If
ReDim retArr(numRem)
For i = 0 To UBound(employeeList)
If i <> Lotto Then
retArr(x) = employeeList(i)
x = x + 1
End If
Next i
Erase employeeList
employeeList = retArr
End Function
' This will take your column of employees and place them in a 1-D array
Function buildOneDimArr(ByVal ws As Worksheet, ByVal Col As Variant, _
ByVal rowStart As Long, ByVal rowEnd As Long) As Variant()
Dim numElements As Long, i As Long, x As Long, retArr()
numElements = rowEnd - rowStart
ReDim retArr(numElements)
For i = rowStart To rowEnd
retArr(x) = ws.Cells(i, Col)
x = x + 1
Next i
buildOneDimArr = retArr
End Function
' This outputs a random number so you can randomly assign your employee
Function randomNumber(ByVal lngMin&, ByVal lngMax&) As Long
'Courtesy of https://stackoverflow.com/a/22628599/5781745
Randomize
randomNumber = Int((lngMax - lngMin + 1) * Rnd + lngMin)
End Function
' This gets the last row of any column you specify in the arguments
Function lastRow(ws As Worksheet, Col As Variant) As Long
lastRow = ws.Cells(ws.Rows.Count, Col).End(xlUp).Row
End Function
Any help would be much appreciated! I'm up for any path that gets closer to my desired solution, formulas or vba. Please let me know if you have any questions.
Thanks!
This program works on a block of tasks at a time where the block size is a multiple of the group count. I have SIZE=2 to give a block size of 12 as this gives more opportunity to resolve conflicts than 6 would. It works by initially assigning the tasks to a zigzap pattern and then validates this against the rules you define. These are in the module validLocn(). If the validation is OK the process moves down the sheet to the next block. If the validation fails the plan is shuffled by swapping 2 element randomly chosen and validation retried. This continues up the maximum number set by MAXTRY. If still not resolved the user can chose to retry again, ignore and move on or abort the process. I have tested it with 150,000 records and it took less than a minute, but my test data may not represent your real data. The results in a pivot table on cols B,C,D show even distribution and no tasks for Grp4 at Locn4.
Count of Task
Row Loc1 Loc2 Loc3 Loc4 Loc5 Loc6 Gand Total
Gp1 4013 3975 3926 5082 3986 4018 25000
Gp2 4021 3992 4077 4928 3975 4007 25000
Gp3 3976 3952 4027 5023 4049 3973 25000
Gp4 5050 4915 4936 5035 5064 25000
Gp5 4072 3996 4034 4890 3969 4039 25000
Gp6 3964 4087 3986 5018 3996 3949 25000
Grand 25096 24917 24986 24941 25010 25050 150000
Hope that helps.
Option Explicit
Sub assignEmployeeTasks()
Dim ws As Worksheet, t0 As Single, t1 As Single
Set ws = ThisWorkbook.Sheets("Sheet1")
t0 = Timer
Const COL_GROUP = "F"
Const COL_LOCN = "C"
Const SIZE As Integer = 2 ' plan size = 2 * group count
Const MAXTRY = 50 ' no of tries to validate
Dim bOK As Boolean
Dim grp As Variant, iBlockStart As Long, i As Integer, r As Integer, step As Integer
'initialize grps and location
Dim countGrp As Integer, lastLocn As Long
lastLocn = ws.Range(COL_LOCN & Rows.Count).End(xlUp).Row
countGrp = ws.Range(COL_GROUP & Rows.Count).End(xlUp).Row - 1
grp = ws.Range(COL_GROUP & "2").Resize(countGrp, 1).Value
Dim plan() As String
ReDim plan(countGrp * SIZE, 2)
Dim itry As Integer, res
iBlockStart = 1
Do While iBlockStart < lastLocn
' initialize plan
Call zigzag(plan, grp)
For i = 1 To UBound(plan)
plan(i, 1) = ws.Range("C" & iBlockStart + i).Value
Next
' save 1st attempt
For i = 1 To UBound(plan)
ws.Range("D" & iBlockStart + i).Value = plan(i, 2)
Next
' validate
bOK = validLocn(plan, 0)
retry:
' retry to validate
itry = 0
While bOK = False And itry < MAXTRY
Call shuffle(plan, 1)
bOK = validLocn(plan, itry)
itry = itry + 1
Wend
' write new plan to sheet
For i = 1 To UBound(plan)
ws.Range("D" & iBlockStart + i).Value = plan(i, 2)
Next
' check rule again
If itry = MAXTRY Then
ws.Range(COL_LOCN & iBlockStart).Select
res = MsgBox("Failed to vaidate after " & MAXTRY & " attempts", vbAbortRetryIgnore, iBlockStart)
If res = vbRetry Then GoTo retry
If res = vbAbort Then Exit Sub
End If
iBlockStart = iBlockStart + UBound(plan)
Loop
t1 = Timer
MsgBox "Assigned " & lastLocn - 1 & " tasks in " & Int(t1 - t0) & " secs"
End Sub
' valid plan against rules
Function validLocn(plan As Variant, itry) As Boolean
Dim sLocn As String, sGrp As String, i As Integer
validLocn = True
For i = 1 To UBound(plan)
sLocn = plan(i, 1)
sGrp = plan(i, 2)
' rule 1
If sGrp = "Gp4" And sLocn = "Loc4" Then
validLocn = False
'Debug.Print itry, i, "Fail Rule 1", sGrp, sLocn
Else
'Debug.Print itry, i, "Pass Rule 1", sGrp, sLocn
End If
Next
End Function
' populate plan groups
Sub zigzag(plan As Variant, grp As Variant)
Dim i As Integer, r As Integer, step As Integer
r = 1: step = 1
For i = 1 To UBound(plan)
plan(i, 2) = grp(r, 1)
r = r + step
If r > UBound(grp) Then
r = UBound(grp)
step = -1
ElseIf r < 1 Then
r = 1
step = 1
End If
Next
End Sub
' shuffle plan
Sub shuffle(plan As Variant, i As Integer)
Dim tmp As String, n As Integer, j As Integer, k As Integer
For n = 1 To i
' random choose elements to shuffle
retry:
k = Int(1 + Rnd() * UBound(plan))
j = Int(1 + Rnd() * UBound(plan))
If k = j Then GoTo retry
tmp = plan(k, 2)
plan(k, 2) = plan(j, 2)
plan(j, 2) = tmp
Next
End Sub
' generate test data
Sub testdata()
Dim ws As Worksheet, i As Long
Set ws = ThisWorkbook.Sheets("Sheet1")
For i = 2 To 150001
ws.Cells(i, 2) = i - 1
ws.Cells(i, 3) = "Loc" & 1 + Int(Rnd() * 6)
Next
End Sub
I am writing a VB application for connecting to a sap system (using rfc).
Everything works fine and I do get a connection and the data as well.
Nevertheless the code for saving the accessed data and writing it to a excel file is really slow.
After the connection I call RFC_READ_TABLE, which returns with a result in <5 secs, which is perfect. Writing to excel (cell by cell) is pretty slow.
Is there any way to 'export' the whole tblData to excel and not being dependent on writing cell by cell?
Thanks in advance!
If RFC_READ_TABLE.Call = True Then
MsgBox tblData.RowCount
If tblData.RowCount > 0 Then
' Write table header
For j = 1 To Size
Cells(1, j).Value = ColumnNames(j)
Next j
Size = UBound(ColumnNames, 1) - LBound(ColumnNames, 1) + 1
For i = 1 To tblData.RowCount
DoEvents
Textzeile = tblData(i, "WA")
For j = 1 To Size
Cells(i + 1, j).Value = LTrim(RTrim(getPieceOfTextzeile(Textzeile)))
Next j
Next
Else
MsgBox "No entries found in system " & SYSID, vbInformation
End If
Else
MsgBox "ERROR CALLING SAP REMOTE FUNCTION CALL"
End If
Arrays: Faster Than Ranges
If the data was ready (need not to be processed) something like this could be a solution:
Sub Sap()
Const cStrStart As String = "A1" 'First cell of the resulting data
Dim tbldata
Dim arrSap As Variant 'Will become a one-based two dimensional array
Dim oRng As Range
arrSap = tbldata 'Data is in the array.
'Calculate the range: Must be the same size as arrSap
Set oRng = Range(Cells(Range(cStrStart).Row, UBound(arrSap)), _
Cells(Range(cStrStart)).Column, UBound(arrSap, 2))
oRng = arrSap 'Paste array into range.
End Sub
Since you need to process your data from tbldata do what you do not to the range, but to an array which should be much faster:
Sub Sap()
Const cStrStart As String = "A1" 'First cell of the resulting data
Dim arrSap() As Variant
Dim oRng As Range
Dim Size As Integer
If RFC_READ_TABLE.Call = True Then
'-------------------------------------------------------------------------------
MsgBox tbldata.RowCount
If tbldata.RowCount > 0 Then
Size = UBound(ColumnNames, 1) - LBound(ColumnNames, 1) + 1
ReDim arrSap(1 To tbldata.RowCount + 1, 1 To Size) '+ 1 for header
' Write table header
For j = 1 To Size
arrSap(1, j).Value = ColumnNames(j)
Next j
' Write data
For i = 1 + 1 To tbldata.RowCount + 1 '+ 1 for header
DoEvents
'- 1 due to header, don't know what "WA" is
Textzeile = tbldata(i - 1, "WA")
For j = 1 To Size
arrSap(i, j) = _
LTrim(RTrim(getPieceOfTextzeile(Textzeile)))
Next j
Next
'-------------------------------------------------------------------------------
'Calculate the range: Must be the same size as arrSap
Set oRng = Range(Cells(Range(cStrStart).Row, Range(cStrStart).Column), _
Cells(UBound(arrSap) + Range(cStrStart).Row -1, _
UBound(arrSap, 2) + Range(cStrStart).Column -1))
oRng = arrSap
'-------------------------------------------------------------------------------
Else
MsgBox "No entries found in system " & SYSID, vbInformation
End If
Else
MsgBox "ERROR CALLING SAP REMOTE FUNCTION CALL"
End If
End Sub
Now adjust the cStrStart, check the rest of the code and you're good to go.
I haven't created a working example so I edited this code a few times. Check it carefully not to lose data.
I have a large table, sometimes with hundreds of rows.
This table is generated by another application that exports to excel.
One column has the heading "Adjusted Price".
I want all the rows in this column to contain a formula (presently they're all 0's).
I want to automate this process because the table gets regenerated all the time.
This column may not always be in the same place. So I need a macro that can find this column ("Adjusted Price") and then fill all the cells in that column with a formula (with the exception of the first row of course).
Can this be done?
Thanks in advance :)
Your homework is to figure out how to plug it in!
Option Explicit
Sub setAdjustedPrice()
Dim column As Integer
Dim adjustedPriceColumn As String
Dim found As Boolean
Dim rowCount As Long
column = 1
rowCount = 1
Do While Range(FncAlphaCon(column) & rowCount).Value <> "" And found = False
If (Range(FncAlphaCon(column) & rowCount).Value = "Adjusted Price") Then
found = True
adjustedPriceColumn = FncAlphaCon(column)
Else
column = column + 1
End If
Loop
If found = True Then
Do While rowCount < ActiveSheet.UsedRange.Rows.count
rowCount = rowCount + 1
Range(adjustedPriceColumn & rowCount) = "YOUR FORMULA"
Loop
Else
MsgBox ("'Adjusted Price' column not found, cannot continue.")
End If
End Sub
Private Function FncAlphaCon(aNumber As Integer) As String
' Fixed version 27/10/2011
Dim letterArray As String
Dim iterations As Integer
letterArray = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
If aNumber <= 26 Then
FncAlphaCon = (Mid$(letterArray, aNumber, 1))
Else
If aNumber Mod 26 = 0 Then
iterations = Int(aNumber / 26)
FncAlphaCon = (Mid$(letterArray, iterations - 1, 1)) & (Mid$(letterArray, 26, 1))
Else
'we deliberately round down using 'Int' as anything with decimal places is not a full iteration.
iterations = Int(aNumber / 26)
FncAlphaCon = (Mid$(letterArray, iterations, 1)) & (Mid$(letterArray, (aNumber - (26 * iterations)), 1))
End If
End If
End Function
I am having trouble determining a way to enter a 1 or 0 into an adjacent cell to indicate whether or not a value is unique when working with a large dataset. I have read of multiple methods for accomplishing this, however none of them seem efficient for my purposes: I am using an instance of Excel 2010 (so I do not have the Distinct Count feature in PivotTables, and when I try to use PowerPivot it crashes my file due to processing limitations.
In this StackOverflow question: Simple Pivot Table to Count Unique Values there are suggestions to use SUMPRODUCT or COUNTIF, but when working with 50,000+ rows as I am, this causes terrible performance and a file size of ~35 MB instead of ~3 MB. I wanted to know if there is a better solution for a large, dynamic dataset whether it is a formula or VBA.
An example of what I would like to accomplish is (with the Unique column being the adjacent cell):
Name Week Unique
John 1 1
Sally 1 1
John 1 0
Sally 2 1
I attempted to script the same functionality of COUNTIF but with no success:
For Each Cell In ThisWorkbook.Worksheets("Overtime & Type Data").Range("Z2:Z" & DataLastRow)
If Worksheets("Overtime & Type Data").Cells(Cell.Row, 26) <> Worksheets("Overtime & Type Data").Cells(Cell.Row - 1, 26) Then
FirstCell = Cell.Row
End If
If (Worksheets("Overtime & Type Data").Range(Cells(FirstCell, 26), Cells(Cell.Row, 26)) = Worksheets("Overtime & Type Data").Range(Cells(Cell.Row, 26))) = True Then
Cell.Value = 1
Else
Cell.Value = 0
End If
Next Cell
This code ran on over 130,000 rows successfully in less than 3 seconds. Adjust the column letters to fit your dataset.
Sub tgr()
Const colName As String = "A"
Const colWeek As String = "B"
Const colOutput As String = "C"
Dim ws As Worksheet
Dim rngData As Range
Dim DataCell As Range
Dim rngFound As Range
Dim collUniques As Collection
Dim arrResults() As Long
Dim ResultIndex As Long
Dim UnqCount As Long
Set ws = ThisWorkbook.Sheets("Overtime & Type Data")
Set rngData = ws.Range(colName & 2, ws.Cells(Rows.Count, colName).End(xlUp))
Set collUniques = New Collection
ReDim arrResults(1 To rngData.Cells.Count, 1 To 1)
On Error Resume Next
For Each DataCell In rngData.Cells
ResultIndex = ResultIndex + 1
collUniques.Add ws.Cells(DataCell.Row, colName).Value & ws.Cells(DataCell.Row, colWeek).Value, ws.Cells(DataCell.Row, colName).Value & ws.Cells(DataCell.Row, colWeek).Value
If collUniques.Count > UnqCount Then
UnqCount = collUniques.Count
arrResults(ResultIndex, 1) = 1
Else
arrResults(ResultIndex, 1) = 0
End If
Next DataCell
On Error GoTo 0
ws.Cells(rngData.Row, colOutput).Resize(rngData.Cells.Count).Value = arrResults
End Sub
One approach is to sort by Name and Week. Then you can determine Unique for any row by comparing with the previous row.
If you need to preserve the order, you could first write a column of Index numbers (1, 2, 3, ...) to keep track of order. After calculating Unique, sort by Index to restore the original order.
The whole process could be done manually with relatively few steps, or automated with VBA.
I'm not sure how well this will work with 50000 values, but it goes through ~1500 in about a second.
Sub unique()
Dim myColl As New Collection
Dim isDup As Boolean
Dim myValue As String
Dim r As Long
On Error GoTo DuplicateValue
For r = 1 To Sheet1.UsedRange.Rows.Count
isDup = False
'Combine the value of the 2 cells together
' and add that string to our collection
'If it is already in the collection it errors
myValue = Sheet1.Cells(r, 1).Value & Sheet1.Cells(r, 2).Value
myColl.Add r, myValue
If isDup Then
Sheet1.Cells(r, 3).Value = "0"
Else
Sheet1.Cells(r, 3).Value = "1"
End If
Next
On Error GoTo 0
Exit Sub
DuplicateValue:
'The value is already in the collection so put a 0
isDup = True
Resume Next
End Sub
Just about any bulk operation will beat a loop involving worksheet cells. You might be able to trim the time down a bit by performing all of the calculations in memory and only returning the values back to the worksheet en masse when it is complete.
Sub is_a_dupe()
Dim v As Long, vTMP As Variant, vUNQs As Variant, dUNQs As Object
Debug.Print Timer
On Error GoTo bm_Uh_Oh
Set dUNQs = CreateObject("Scripting.Dictionary")
With Worksheets("Sheet1")
vTMP = .Range(.Cells(2, 1), .Cells(Rows.Count, 2).End(xlUp)).Value2
ReDim vUNQs(1 To UBound(vTMP, 1), 1 To 1)
For v = LBound(vTMP, 1) To UBound(vTMP, 1)
If dUNQs.Exists(Join(Array(vTMP(v, 1), vTMP(v, 2)))) Then
vUNQs(v, 1) = 0
Else
dUNQs.Add Key:=Join(Array(vTMP(v, 1), vTMP(v, 2))), _
Item:=vTMP(v, 2)
vUNQs(v, 1) = 1
End If
Next v
.Cells(2, 3).Resize(UBound(vUNQs, 1), 1) = vUNQs
End With
Debug.Print Timer
bm_Uh_Oh:
dUNQs.RemoveAll
Set dUNQs = Nothing
End Sub
Previous experience tells me that the variety of data (as well as hardware, etc) will impact timing the process but in my random sample data I received these elapsed times.
50K records ..... 0.53 seconds
130K records .... 1.32 seconds
500K records .... 4.92 seconds