Optimize Excel VBA Code - excel

I have the following VBA code within excel. It's goal is to remove a row if the given text is found, as well as remove the row directly below it. It needs to scan roughly 700k rows and is taking roughly an hour to do 100k rows. Does anyone see any optimization?
Sub RemovePageHeaders()
Application.ScreenUpdating = False
Dim objRange As Range
Set objRange = Cells.Find("HeaderText")
While objRange <> ""
objRange.Offset(1, 0).Rows(1).EntireRow.Delete
objRange.Rows(1).EntireRow.Delete
Set objRange = Cells.Find("HeaderText")
Wend
MsgBox ("I'm done removing page headers!")
End Sub
Thanks in advance!

Try the following sub. It loops from the bottomm-most row to the top, checking column 3 for "HeaderText". If that's found, it delete the row and the one below it. On a C2D E8500 with 2 gigs of RAM it takes just over a minute per 100,000 rows on a sheet with 1 million rows.
Sub RemoveHeaders()
Dim i As Long
Application.ScreenUpdating = False
Debug.Print "Started: " & Now
For i = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
If ActiveSheet.Cells(i, 3) = "HeaderText" Then
ActiveSheet.Range(i & ":" & i + 1).EntireRow.Delete
End If
Next i
Application.ScreenUpdating = True
Debug.Print "Finished: " & Now
End Sub
EDIT
For a slightly ghetto but possibly much faster solution try this:
Change the constant in the below code to the number of the first column that's blank in every row. For example if your data takes up columns A-F, you want the constant to be 7 (column G).
Run the code, it will put the row number next to every entry. Should take around 30 seconds.
Sort the ENTIRE data by column C; this should take less than a minute.
Find "HeaderText" visually, select and delete all the rows.
Sort by your row-numbered column ("G" in my example).
Delete the row-numbered column (again, "G" in my example).
Sub NumberColumns()
Const BLANK_COLUMN = 7
Dim i As Long
For i = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
ActiveSheet.Cells(i, BLANK_COLUMN) = i
Next i
Debug.Print "done"
End Sub

Even if it doesn't fully answer the question, it may help any reader so...
There are several tips on the web about optimizing vba. In particular, you can do:
'turn off some Excel functionality so your code runs faster
'these two are especially very efficient
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
'use these if you really need to
Application.DisplayStatusBar = False
Application.EnableEvents = False 'is very efficient if you have ANY event associated with what your macro is going to do
'code goes here
'at the end, don't forget to restore the default behavior
'calculate the formulas
Application.Calculate
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
Application.DisplayStatusBar = True
Application.EnableEvents = True
See here for more information

Putting this entry in a little late. It should be about 2X faster than the accepted solution. I used my XP Excel 2003 computer with 1 gig to figure it out.
Sub DeleteHeaderText()
Dim bUnion As Boolean
Dim d1 As Double
Dim l As Long
Dim rDelete As Range
Dim wks As Worksheet
Dim vData As Variant
d1 = Timer
Application.EnableEvents = False
Application.ScreenUpdating = False
bUnion = False
Set wks = ActiveSheet
lEnd = ActiveSheet.UsedRange.Rows.Count
vData = wks.Range("C1:C" & lEnd).Value2
For l = 1 To lEnd
If vData(l, 1) = "HeaderText" Then
If bUnion Then
Set rDelete = Union(rDelete, wks.Range("A" & l, "A" & l + 1))
Else
Set rDelete = wks.Range("A" & l, "A" & l + 1)
bUnion = True
End If
l = l + 1
End If
Next l
Debug.Print Timer() - d1
rDelete.EntireRow.Delete
Debug.Print Timer() - d1
End Sub

I know this is late, but if I understand your problem, then you are deleting rows based on a "HeaderText" in column C. So, since i didn't look at your data, i created my own. I created 700,000 rows and every 9th row contained the "HeaderText" string. It deleted ~233k rows ("HeaderText" row + row before + row after) and ran in 2.2 seconds on my computer. Give it a try!!
Private Declare Function timeGetTime Lib "winmm.dll" () As Long
Sub DeleteHeaders()
Dim LastRow As Long
Dim I As Long
Dim WkSheet As Excel.Worksheet
Dim VArray As Variant
Dim NewArray() As String
Dim BooleanArray() As Boolean
Dim NewArrayCount As Long
Dim J As Long
Dim T As Double
Dim DeleteRowCount As Long
T = timeGetTime
With Application
.ScreenUpdating = False
.EnableEvents = False
.Calculation = xlCalculationManual
End With
Set WkSheet = ThisWorkbook.Sheets("Sheet1")
With WkSheet.UsedRange
LastRow = .Rows.Count
VArray = .Value
End With
ReDim BooleanArray(0 To UBound(VArray, 1) - 1), NewArray(UBound(VArray, 1) - 1, 0 To UBound(VArray, 2))
For I = 1 To UBound(VArray, 1)
If InStrB(1, VArray(I, 3), "HeaderText", vbBinaryCompare) <> 0 Then
BooleanArray(I - 1) = Not BooleanArray(I - 1)
BooleanArray(I) = Not BooleanArray(I)
BooleanArray(I + 1) = Not BooleanArray(I + 1)
End If
Next I
For I = LBound(BooleanArray, 1) To UBound(BooleanArray, 1)
If BooleanArray(I) = False Then
For J = LBound(VArray, 2) To UBound(VArray, 2)
NewArray(NewArrayCount, J - 1) = VArray(I + 1, J)
Next J
NewArrayCount = NewArrayCount + 1
Else
DeleteRowCount = DeleteRowCount + 1
End If
Next I
With WkSheet
.Cells.Delete
.Range("a1:c" & NewArrayCount).Value = NewArray
End With
With Application
.ScreenUpdating = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
End With
Erase NewArray, BooleanArray, VArray
MsgBox "Deleted " & DeleteRowCount & " rows." & vbNewLine & vbNewLine & _
"Run time: " & Round((timeGetTime - T) / 1000, 3) & " seconds.", vbOKOnly, "RunTime"
End Sub

Here's a solution that will run on 100k rows in about 5-20 seconds depending on how many occurances of 'HeaderText' you have. As you requested, it will delete both the row with HeaderText in the C column as well as the row directly above it.
Update:
As it's been pointed out, this works on smaller data sets up to about 100k, but on larger sets it's really doesn't. Back to the drawing board :)
Sub DeleteHeaders()
Application.ScreenUpdating = False
Dim lastRow As Long
Dim varray As Variant
lastRow = Range("C" & Rows.Count).End(xlUp).Row
On Error Resume Next
varray = Range("C1:C" & lastRow).Value
For i = UBound(varray, 1) To 1 Step -1
If varray(i, 1) = "HeaderText" Then
Range("C" & i - 1, Range("C" & i)).EntireRow.Delete
i = i - 1
End If
Next
Application.ScreenUpdating = True
End Sub
How it works:
By dumping the entire C column into a variant array and working from it within excel, you get major speed increase. The varray is laid out like (1, 1), (2, 1), (3, 1) with the first number being the row number, so all you have to do is loop through it backwards. The key is making sure to delete both rows at the same time and decrementing i by one more.

The following is code lifted from a Bill Jelen book that is fantastic for this purpose.
Use a column (column A for my code) with some logic to determine if a row should be hidden on not.
Use the following formula in all applicable cells in that column
=IF(test TRUE to hide, 1, "keep")
Now use the VBA below
Range("A1:A10000").SpecialCells(xlCellTypeFormulas, xlNumbers).EntireRow.Delete
This selects all rows with a number returned by the formula at once, which is exactly the rows you want to delete. No looping required!

Here on my blog have a scripts for this:
Sample One:
Sub DelBlankRows()
Range("D1:D" & Cells _
(Rows.Count,2).End(xlUp).Row).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
End Sub
Sample two:
Sub DeleteRowsWithSpecifiedData()
'Looks in Column D and requires Column IV to be clean
Columns(4).EntireColumn.Insert
With Range("D1:D" & ActiveSheet.UsedRange.Rows.Count)
.FormulaR1C1 = "=IF(RC[1]="""","""",IF(RC[1]=""Not Needed"",NA()))"
.Value = .Value
On Error Resume Next
.SpecialCells(xlCellTypeConstants, xlErrors).EntireRow.Delete
End With
On Error GoTo 0
Columns(4).EntireColumn.Delete
End Sub

Related

Vba copy paste slows with more records

I'm trying to help someone who has to go through 80k rows on excel between two sheets and identify differences and then load the changed records into a database.
The below code works but slows down significantly with bigger data set, at 10k rows it takes 00:02:22 but with 20k it takes 00:10:13, full 80k rows takes under 2 hours which is still a lot faster than someone doing it manually over a day but I hoping someone can tell me what can potentially be impacting the performance with a higher number of records and how I can solve it?
Sub Button1_Click()
'Option Explicit
Application.ScreenUpdating = False
Application.EnableEvents = False
Set Day1_Sheet = ThisWorkbook.Sheets("Day1")
Set Day2_Sheet = ThisWorkbook.Sheets("Day2")
Set VBA_Export = ThisWorkbook.Sheets("VBA_Export")
Dim Day1Code, Day2Code As String
Dim Day1CodeRow As Long, Day2CodeRow As Long, CurrentRow As Long, CurrentColumn As Long, AccountsN As Long, n As Long
Dim LastEmptyColumnResult As Long, LastEmptyRowResult As Long
Dim BolUpdated As Boolean
Dim cTime, eTime As Variant
Day1_Sheet_Rows = Day1_Sheet.Cells(Rows.Count, "B").End(xlUp).Row
Day2_Sheet_Rows = Day2_Sheet.Cells(Rows.Count, "B").End(xlUp).Row
LastEmptyColumnResult = 4
LastEmptyRowResult = 2
BolUpdated = False
VBA_Export.Range("A2:E10000").Clear
cTime = Now()
For Each c In Day1_Sheet.Range("B2:B" & Day1_Sheet_Rows)
BolUpdated = False
Day1Code = c
For Each e In Day2_Sheet.Range("B2:B" & Day2_Sheet_Rows)
If c = e Then
Day2Code = e
Day2CodeRow = e.Row
CurrentRow = c.Row
Exit For
End If
Next e
CurrentColumn = 3
While CurrentColumn <> 17
If Day1_Sheet.Cells(CurrentRow, CurrentColumn).Value = Day2_Sheet.Cells(Day2CodeRow, CurrentColumn).Value Then
Else
If BolUpdated Then
Else
Day2_Sheet.Rows(Day2CodeRow).EntireRow.Copy VBA_Export.Range("A" & LastEmptyRowResult)
LastEmptyRowResult = LastEmptyRowResult + 1
BolUpdated = True
End If
End If
CurrentColumn = CurrentColumn + 1
Wend
Next c
LastLine:
Set Day1_Sheet = Nothing
Set Day2_Sheet = Nothing
eTime = Now()
MsgBox ("Start Time " & cTime & ".End Time " & eTime)
Debug.Print "Elapsed Time " & eTime - cTime
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub
See below for suggestions for speeding up your process - basically do everything you can using arrays and avoid cell-by-cell access.
Sub Button1_Click()
Const NUM_COLS As Long = 16 'number of columns in your datasets
Dim Day1_Sheet As Worksheet, Day2_Sheet As Worksheet, VBA_Export As Worksheet
Dim data1, data2, destRow As Long, changed As Boolean, rw1 As Long, rw2 As Variant
Dim col As Long, cTime, eTime
Set Day1_Sheet = ThisWorkbook.Sheets("Day1")
Set Day2_Sheet = ThisWorkbook.Sheets("Day2")
Set VBA_Export = ThisWorkbook.Sheets("VBA_Export")
'load both datasets into arrays for faster access
data1 = Day1_Sheet.Range("A1").Resize(Day1_Sheet.Cells(Rows.Count, "B").End(xlUp).Row, NUM_COLS).Value
data2 = Day2_Sheet.Range("A1").Resize(Day2_Sheet.Cells(Rows.Count, "B").End(xlUp).Row, NUM_COLS).Value
VBA_Export.Range("A2:E10000").Clear
destRow = 2
cTime = Now()
GoFast 'turn on speed enhancements
For rw1 = 2 To UBound(data1, 1) 'loop over Day1 data
'try to match on colB - using Match on the worksheet is quite fast
rw2 = Application.Match(data1(rw1, 2), Day2_Sheet.Columns("B"), 0) 'find matching row...
If Not IsError(rw2) Then 'got a match on Day2 ?
changed = False 'reset flag
For col = 3 To NUM_COLS 'loop over columns
If data1(rw1, col) <> data2(rw2, col) Then
changed = True 'flag row as changed
Exit For 'no need to check further
End If
Next col
If changed Then 'Day2 is different?
Day2_Sheet.Rows(rw2).Copy VBA_Export.Cells(destRow, "A")
destRow = destRow + 1 'next paste row
End If
Else
'no Col B match was found. Do something?
End If
Next rw1
GoFast False 'turn off speed enhancements
eTime = Now()
MsgBox ("Start Time " & cTime & ".End Time " & eTime)
Debug.Print "Elapsed Time " & eTime - cTime
End Sub
'maximize code speed by turning off unneeded stuff
'******** must reset !!!!
Sub GoFast(Optional bYesNo As Boolean = True)
With Application
.ScreenUpdating = Not bYesNo
.Calculation = IIf(bYesNo, xlCalculationManual, xlCalculationAutomatic)
End With
End Sub

Run through a range of cells and change the date of a linked cell more efficiently

I have a macro which will populate a range of around 216 cells, and 25/30 sheets with an index + match lookup to a separate spreadsheet for each column. (See below table.)
The more months that proceed, the larger this will get.
Each month the string that will be looked up will vary, and I need to show the evolution of elements pertaining to this string for each month since an inception date.
I tried two ways, both of which are fairly slow (typically ~30 secs to populate a sheet).
The first method populates the required range with the lookup formula, and then loops through using the replace function to update the formula to include the columns date.
E.g. pseudo-code.
For j = 1 To 10
s_Date = Format(ws.cells(1,1).value,"yyyy.mm.dd") 'eg say 5/31/2019
s_DStep = Format(ws.cells(1,j).value,"yyyy.mm.dd")'eg say 6/30/2019
For k = 10 To 17
s_formula = ws.Cells(k, j).Formula
s_formula = Replace(s_formula, s_Date, s_DStep)
ws.Cells(k, j).Formula = s_formula
Next k
Next j
The second method loops through the range, it will assign the cell with the required formula and date.
Simplified example:
For j = 1 To i_NumOfDates + 1
s_DStep = Format(ws.cells(1,j).value,"yyyy.mm.dd")'eg say 6/30/201
ws.Cells(10, j + 1) = "=INDEX('" & s_Dir & s_DStep & "\[ Workbook]Sheet1'!$E$1:$E$9999,MATCH($B$2,'" _
& s_Dir & s_DStep & "\["Workbook]Sheet1'!$a$1:$a$9999,0))"
next j
Neither of these methods are quick enough.
I appreciate that lookup formulas are taxing in terms of computing power.
I have done the below.
'Turn off Screen Update
Application.ScreenUpdating = False
'Turn off Automatic Calculation
Application.Calculation = xlManual
'Turn off display alert
Application.DisplayAlerts = False
5/31/2019
6/30/2019
=Index(dir & 5.31.2019 & range1, match(str,dir & 5.31.2019 & range2),0)
=Index(dir & 6.30.2019 & range1, match(str,dir & 6.30.2019 & range2),0)
You could try copying the formulas to an array, processing it and copying it back to the sheet
Option Explicit
Sub demo()
Dim ar1, ar, j As Long, k As Long
Dim s_Date As String, s_DStep As String
Dim rng As Range, ws As Worksheet
Set ws = Sheet1
Set rng = ws.Range("A10:J17")
ar = rng.Formula
ar1 = ws.Range("A1:J1").Value
s_Date = Format(ar1(1, 1), "yyyy.mm.dd") 'eg say 5/31/2019
For j = 1 To UBound(ar, 2) '
s_DStep = Format(ar1(1, j), "yyyy.mm.dd") 'eg say 6/30/2019
For k = 1 To UBound(ar)
ar(k, j) = Replace(ar(k, j), s_Date, s_DStep)
Next k
Next j
rng.Formula = ar
End Sub

Macro to Count Filter Distinct unique Value

I Have Table like this, where i have to use macro because my table always change Every day (SSAS)
so i have use macro to filter automatically,
I am able to sum Amount based on same Vendorname, PONuber and Date on Column E (Subtotal).
and then filter to show Subtotal AMount >500
I want to show only row >500 (Column E), and pop up message to count PONumber (Column B) how many Unique PO Number (Only Visible Row to count)
i've been stuck how to count only Visible Unique PO Number and show it on Pop Up message
this is my Macro
Sub FilterCOunt_Click()
Dim Condition As Variant
Dim AVal As Variant
Dim LastRow As Long
Dim Hide, popup As Long
Dim message As String
Dim sht As Worksheet
'----------------------------
Dim dictionary As Object
Set dictionary = CreateObject("scripting.dictionary")
'---------------------------
Application.ScreenUpdating = False
Application.EnableEvents = False
Application.AskToUpdateLinks = False
Application.DisplayAlerts = False
Application.Calculation = xlAutomatic
Application.ScreenUpdating = False
Application.StatusBar = False
'------------------
Columns.EntireColumn.Hidden = False
Rows.EntireRow.Hidden = False
Columns("E:Z").EntireColumn.Delete
Range("E:Z").EntireColumn.Insert
Range("E1").Value = "Sub Total >500 "
Set sht = ActiveSheet
LastRow = sht.Range("B" & Rows.Count).End(xlUp).Row
'-------------------
For i = 2 To LastRow ' with last row count =SUMIFS(I:I,A:A,A8,B:B,B8,C:C,C8)
AVal = "A" & i
BVal = "B" & i
CVal = "C" & i
Worksheets("Sheet3").Range("E" & i).Formula = "=SUMIFS(D:D,A:A," & AVal & ",B:B," & BVal & ",C:C," & CVal & ")"
Next i
With sht.Range("E1:E" & LastRow)
.AutoFilter
.AutoFilter field:=1, Criteria1:=">=500"
End With
'----------Count Pop UP
Dim CountPO As Long
Range("G1").FormulaArray = "=SUM(IF(FREQUENCY(IF(SUBTOTAL(3,OFFSET(B2,ROW(B2:B22)-ROW(B2),1)),IF(B2:B22<>"",MATCH(""&B2:B22,B2:B22&"",0))),ROW(B2:B22)- ROW(B2)+1),1))"
MsgBox "We Found " & CountPO & " PO Open(s)", _
vbInformation, "PO Found"
End Sub
and this is the formula to count it
{=SUM(IF(FREQUENCY(IF(SUBTOTAL(3,OFFSET(B2,ROW(B2:B22)-ROW(B2),1)),IF(B2:B22<>"",MATCH("~"&B2:B22,B2:B22&"",0))),ROW(B2:B22)-ROW(B2)+1),1))}
If you are pulling from a Database via SSAS you can use Power Query to link to your SSAS DataModel to Excel and you can insert a Calculated Measure in Dax from there with DistinctCount.
Count:=Calculate(DistinctCount(TableName[PONumber]),TableName[Amount]>500)
Alternatively if you want total insights on your specified issue you can add a measured column and then you can use Power Pivot to filter for your criteria live on refresh to the data model, completely negating the need for VBA entirely.
Incidentally it is pertinent to remember VBA is the sledge hammer of solutions please use the DataModel Tools before you ever think of a macro solution remember, VBA is an Application Programming Language and many IT Security Systems will disable it because it opens the system up for malware, you can literally change any file or program in VBA including calling delete system files
Meanwhile having a set DataModel in a locked file that requires user access behind LAN security is easily more secure than allowing your computer to have open programatic access.
This is an alternative formula (which doesn't require any filtering)
=SUM(--(FREQUENCY(IF(B2:B20>500,MATCH(A2:A20,A2:A20,0)),ROW(A2:A20)-ROW(A2)+1)>0))
It's an array formula so using VBA
Range("E1").FormulaArray = "=SUM(--(FREQUENCY(IF(B2:B20>500,MATCH(A2:A20,A2:A20,0)),ROW(A2:A20)-ROW(A2)+1)>0))"
A formula for your cell E2, which is not an array formula, is
=SUMPRODUCT((B2=B$2:B$23)*(A2=A$2:A$23)*(D$2:D$23))
Copy it down, as usual.
See here for why not using an array formula (if you have an alternative).
I am not certain this solves your question, as I did not fully understand it.
You can use the following code. I have implemented Collection to get the unique count.
This will count the unique rows in B column where value in E column > 500.
Private Sub GetUniqueCount() AS Variant
Dim Test As New Collection
Dim rng As Range
For i = 2 To 6 'Replace 6 with last row(without filtration)
Value = Cells(i, "B").Value
check = Contains(Test, Value)
If Cells(i, "E").Value > 500 And Not check And Len(Value) > 0 Then
Test.Add Value, CStr(Value)
End If
Next i
GetUniqueCount = Test.count
End Sub
'Function to check if the value exists in Collection or not
Public Function Contains(col As Collection, key As Variant) As Boolean
Dim obj As Variant
On Error GoTo err
Contains = True
obj = col(key)
Exit Function
err:
Contains = False
End Function
Step 1: Post my code to a new module.
Step 2: Bind you button to the macro named "filterAndCount"
Step 3: Click the buton and rejoice :-)
Code description:
1) The code loops all the rows in the table.
2) First it checks if the Sub Total is above the limit (500).
3) If equal or below it hides the row and moves on to the next row.
4) If above it checks if the value already exists in the array values above.
5) If it does not exists then the value is added to the array.
6) When all rows have been looped only rows with a Sub Total above the limit is visible.
7) Only the unique and visible PO Numbers have been added to the array.
8) The number of values in the array is dispayed in a message box.
Dim wb As Workbook
Dim ws As Worksheet
Dim i As Double
Dim n As Double
Dim subTotalLimit As Double
Dim arr() As String
Sub filterAndCount()
Set wb = ThisWorkbook
Set ws = wb.ActiveSheet
i = 2
subTotalLimit = 500
n = 0
ReDim arr(0 To 0) As String
arr(0) = 0
ws.Columns("E:Z").EntireColumn.Delete
ws.Range("E:Z").EntireColumn.Insert
ws.Range("E1").Value = "Sub Total >500 "
Do While ws.Range("B" & i) <> ""
ws.Range("E" & i).Formula = "=SUMIFS(D:D,A:A,A" & i & ",B:B,B" & i & ",C:C,C" & i & ")"
If ws.Range("E" & i) < subTotalLimit Then
ws.Range("B" & i).EntireRow.Hidden = True
Else
If Not IsNumeric(Application.Match(Range("B" & i).Text, arr(), 0)) Then
arr(n) = Range("B" & i).Value
n = UBound(arr) + 1
ReDim Preserve arr(0 To n) As String
arr(n) = 0
End If
End If
i = i + 1
Loop
MsgBox UBound(arr)
End Sub
Use 2 Dictionary Objects, one for totals and one for unique PO's
Sub filterCOunt()
Const LIMIT = 500
Dim wb As Workbook, ws As Worksheet
Dim iRow As Long, iLastRow As Long, amount As Single
Dim sVendor As String, sPO As String, msg As String, sKey As String
Dim dictPO As Object, dictTotal As Object
Set dictPO = CreateObject("Scripting.Dictionary")
Set dictTotal = CreateObject("Scripting.Dictionary")
Set wb = ThisWorkbook
Set ws = ActiveSheet
iLastRow = ws.Range("B" & Rows.Count).End(xlUp).Row
' first pass to total by po and vendor
For iRow = 2 To iLastRow
sVendor = Trim(ws.Cells(iRow, 1))
sPO = Trim(ws.Cells(iRow, 2))
amount = CSng(ws.Cells(iRow, 4))
sKey = sVendor & "_" & sPO
' sub total
If dictTotal.exists(sKey) Then
dictTotal(sKey) = dictTotal(sKey) + amount
Else
dictTotal.Add sKey, amount
End If
Next
' second pass for PO numbers
For iRow = 2 To iLastRow
sVendor = Trim(ws.Cells(iRow, 1))
sPO = Trim(ws.Cells(iRow, 2))
sKey = sVendor & "_" & sPO
' sub total
ws.Cells(iRow, 5) = dictTotal(sKey)
If dictTotal(sKey) > LIMIT Then
If Not dictPO.exists(sPO) Then
dictPO.Add sPO, iRow
End If
End If
Next
' filter
With ws.Range("E1:E" & iLastRow)
.AutoFilter
.AutoFilter field:=1, Criteria1:=">=" & LIMIT
End With
msg = "No of open PO's = " & dictPO.Count
MsgBox msg, vbInformation
End Sub
First, for your code Count Pop UP to work, let's change all from "" to """"
Second, to be able to notify a Unique PO Number and show it on Pop Up message, you must call the value received from cell G1, or, safer, use evaluate to get the result of this expression.
Your code will probably work now
'Dim CountPO As Long
Range("G1").FormulaArray = "=SUM(IF(FREQUENCY(IF(SUBTOTAL(3,OFFSET(B2,ROW(B2:B22)-ROW(B2),1)),IF(B2:B22<>"""",MATCH(""""&B2:B22,B2:B22&"""",0))),ROW(B2:B22)- ROW(B2)+1),1))"
MsgBox "We Found " & [g1].Value2 & " PO Open(s)", vbInformation, "PO Found"
however, your formula only counts all unique values including less than 500, in addition it is quite long. You can replace it using the shorter formula like the following code:
Dim formula_string As String
formula_string = "=SUMPRODUCT((B2:B22>3)*(C2:C22<>"""")/COUNTIF(B2:B22,B2:B22&""""))"
MsgBox "We Found " & Application.Evaluate(formula_string) & " PO Open(s)", vbInformation, "PO Found"
Hope it helps!

Removing duplicates based on their occurrence

I would like to check a certain column (W) for duplicates (number of occurrences is stored in another column (AZ)) and than delete all row this way:
Value is found two times in the column - delete only one row containing the value.
Value is found more times in the column - delete all the rows with the values.
My code works quite well but sometimes it doesn't delete all the duplicates as it should do. Any idea for improvement?
EDIT: The updated code works really good except that it always misses one duplicate and leaves it not deleted.
fin = ws.UsedRange.Rows.count
For i = 2 To fin
ws.Range("AZ" & i).value = Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & i))
Next i
For j = fin To 2 Step -1
If ws.Range("AZ" & j).value > 2 Then
ws.Range("AZ" & j).EntireRow.Delete
fin = ws.UsedRange.Rows.count
ElseIf ws.Range("AZ" & j).value = 2 Then
Set rng = Range("W:W").Find(Range("W" & j).value, , xlValues, xlWhole, , xlNext)
rngRow = rng.Row
If rngRow <> j Then
ws.Range("AZ" & rngRow) = "1"
ws.Range("AZ" & j).EntireRow.Delete
fin = ws.UsedRange.Rows.count
Else
MsgBox "Error at row " & rngRow
End If
End If
Next j
If speed is an issue, here is a method that should be faster, as it creates a collection of rows to be deleted, then deletes them. Since everything, except for the actual row deletion, is done in VBA, there are far fewer calls back and forth to the worksheet.
The routine could be sped up as noted in the inline comments.
If it is still too slow, depending on the size of the worksheet, it might be feasible to read the entire worksheet into a VBA Array; test for duplicates; write back the results to a new array and write that out to the worksheet. (If your worksheet is too large, this method might run out of memory, though).
In any event, we need both a Class Module which YOU must rename cPhrases, as well as a Regular Module
Class Module
Option Explicit
Private pPhrase As String
Private pCount As Long
Private pRowNums As Collection
Public Property Get Phrase() As String
Phrase = pPhrase
End Property
Public Property Let Phrase(Value As String)
pPhrase = Value
End Property
Public Property Get Count() As Long
Count = pCount
End Property
Public Property Let Count(Value As Long)
pCount = Value
End Property
Public Property Get RowNums() As Collection
Set RowNums = pRowNums
End Property
Public Function ADDRowNum(Value As Long)
pRowNums.Add Value
End Function
Private Sub Class_Initialize()
Set pRowNums = New Collection
End Sub
Regular Module
Option Explicit
Sub RemoveDuplicateRows()
Dim wsSrc As Worksheet
Dim vSrc As Variant
Dim CP As cPhrases, colP As Collection, colRowNums As Collection
Dim I As Long, K As Long
Dim R As Range
'Data worksheet
Set wsSrc = Worksheets("sheet1")
'Read original data into VBA array
With wsSrc
vSrc = .Range(.Cells(1, "W"), .Cells(.Rows.Count, "W").End(xlUp))
End With
'Collect list of items, counts and row numbers to delete
'Collection object will --> error when trying to add
' duplicate key. Use that error to increment the count
Set colP = New Collection
On Error Resume Next
For I = 2 To UBound(vSrc, 1)
Set CP = New cPhrases
With CP
.Phrase = vSrc(I, 1)
.Count = 1
.ADDRowNum I
colP.Add CP, CStr(.Phrase)
Select Case Err.Number
Case 457 'duplicate
With colP(CStr(.Phrase))
.Count = .Count + 1
.ADDRowNum I
End With
Err.Clear
Case Is <> 0 'some other error. Stop to debug
Debug.Print "Error: " & Err.Number, Err.Description
Stop
End Select
End With
Next I
On Error GoTo 0
'Rows to be deleted
Set colRowNums = New Collection
For I = 1 To colP.Count
With colP(I)
Select Case .Count
Case 2
colRowNums.Add .RowNums(2)
Case Is > 2
For K = 1 To .RowNums.Count
colRowNums.Add .RowNums(K)
Next K
End Select
End With
Next I
'Revers Sort the collection of Row Numbers
'For speed, if necessary, could use
' faster sort routine
RevCollBubbleSort colRowNums
'Delete Rows
'For speed, could create Unions of up to 30 rows at a time
Application.ScreenUpdating = False
With wsSrc
For I = 1 To colRowNums.Count
.Rows(colRowNums(I)).Delete
Next I
End With
Application.ScreenUpdating = True
End Sub
'Could use faster sort routine if necessary
Sub RevCollBubbleSort(TempCol As Collection)
Dim I As Long
Dim NoExchanges As Boolean
' Loop until no more "exchanges" are made.
Do
NoExchanges = True
' Loop through each element in the array.
For I = 1 To TempCol.Count - 1
' If the element is less than the element
' following it, exchange the two elements.
If TempCol(I) < TempCol(I + 1) Then
NoExchanges = False
TempCol.Add TempCol(I), after:=I + 1
TempCol.Remove I
End If
Next I
Loop While Not (NoExchanges)
End Sub
no need to use that inefficient second loop in the second section, just use a live count like so
fin = ws.UsedRange.Rows.count
For i = 2 To fin
ws.Range("AZ" & i).value = Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & i))
Next i
For j = fin To 2 Step -1
If ws.Range("AZ" & j).value > 2 OR Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & j)) = 2 Then
ws.Range("AZ" & j).EntireRow.Delete
End If
Next j
While your logic is basically sound, the method is not the most efficient. The AutoFilter Method can quickly remove all counts greater than 2 and the Range.RemoveDuplicates¹ method cansubsequently make quick work of removing one of the rows that still contain duplicate values in column W.
Dim r As Long, c As Long
With ws
If .AutoFilterMode Then .AutoFilterMode = False
r = .Cells.SpecialCells(xlLastCell).Row
c = Application.Max(52, .Cells.SpecialCells(xlLastCell).Column)
With .Range("A1", .Cells(r, c)) '.UsedRange
With .Columns(52)
If IsEmpty(.Cells(1, 1)) Then .Cells(1, 1) = "count"
With .Resize(.Rows.Count - 1, 1).Offset(1, 0)
.Cells.FormulaR1C1 = "=COUNTIF(C[-29], RC[-29])"
.Cells = .Cells.Value
End With
.AutoFilter field:=1, Criteria1:=">2"
With .Resize(.Rows.Count - 1, 1).Offset(1, 0)
If CBool(Application.Subtotal(103, .Cells)) Then
.SpecialCells(xlCellTypeVisible).EntireRow.Delete
End If
End With
.AutoFilter
End With
.RemoveDuplicates Columns:=23, Header:=xlYes
End With
End With
When you rewrite the count values in column AZ, you are likely going to rewrite 3 counts to 2, etc.
¹ The Range.RemoveDuplicates method removes duplicate rows from the bottom up.

Optimize Excel Formula that uses large arrays

I have used the below mentioned excel formula.
=INDEX(TABL,SMALL(IF(COUNTIF(H2,$A$1:$A$325779)*COUNTIF(I2,"<="&$B$1:$B$325779),ROW(TABL)-MIN(ROW(TABL))+1),1),3)
Where "TABL",a table, is A1:E325779 and is the source of my lookup array.
The formula mentioned is the exact requirement but is taking a lot of time to update the excel for 400,000+ cells containing this formula.
Can this be optimized?
Or can this be equated to a faster macro?
Its taking 1 second to update 1 cell!!! That's a very long time to update all 400K+ cells once!!!
Screenshot of a sample worksheet is as below.
I have based my program on Martin Carlsson's.
it is processing 100 records in 30 seconds. can it be improved?
Sub subFindValue()
Application.ScreenUpdating = False
Application.DisplayStatusBar = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False
Cells(2, 12) = Format(DateTime.Now, "yyyy-MM-dd hh:mm:ss")
Dim varRow As Variant
Dim varRowMain As Variant
Dim lookupTable As Variant
Dim lookupValueTable As Variant
lookupValueTable = Range("G2:J309011").Value
lookupTable = Range("A2:D325779").Value
varRowMain = 1
varRow = 1
Do Until varRowMain = 309011
Do Until varRow = 325779
If lookupTable(varRow, 1) = lookupValueTable(varRowMain, 1) And lookupTable(varRow, 2) >= lookupValueTable(varRowMain, 2) Then
lookupValueTable(varRowMain, 3) = lookupTable(varRow, 3)
lookupValueTable(varRowMain, 4) = lookupTable(varRow, 4)
Exit Do
End If
varRow = varRow + 1
Loop
If IsEmpty(lookupValueTable(varRowMain, 3)) Then
lookupValueTable(varRowMain, 3) = "NA_OX"
lookupValueTable(varRowMain, 4) = "NA_OY"
End If
varRowMain = varRowMain + 1
varRow = 1
Loop
Range("G2:J309011").Value = lookupValueTable
Cells(3, 12) = Format(DateTime.Now, "yyyy-MM-dd hh:mm:ss")
Application.ScreenUpdating = True
Application.DisplayStatusBar = True
Application.Calculation = xlCalculationAutomatic
Application.EnableEvents = True
End Sub
Is this what you need?
Sub subFindValue()
'Speed up
Application.ScreenUpdating = False
Application.DisplayStatusBar = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False
Dim strNamedValue As String: strNamedValue = Range("E3")
Dim curHigherThanValue As Currency: curHigherThanValue = Range("F3")
Dim varRow As Variant
varRow = 1
Do Until IsEmpty(Cells(varRow, 1))
If Cells(varRow, 1) = strNamedValue And Cells(varRow, 2) > curHigherThanValue Then
Range("G3") = Cells(varRow, 3)
Exit Do
End If
varRow = varRow + 1
Loop
'Slow down
Application.ScreenUpdating = True
Application.DisplayStatusBar = True
Application.Calculation = xlCalculationAutomatic
Application.EnableEvents = True
End Sub
This should work and be much faster then any VBA solution that would require looping every row as long as you can sort the date in Column B Descending:
Enter the following Formula As an Array (Instead of Enter use Ctrl+Shift+Enter
=INDEX($C$1:$C$15,MATCH(G2,IF($A$1:$A$15=F2,$B$1:$B$15),-1))
You should end up with something like:
Explanation:
IF($A$1:$A$15=F2,$B$1:$B$15)
Is building an array of values equal to the rows in column B where The Test word is in the same Row column A.
MATCH(G2,IF($A$1:$A$15=F2,$B$1:$B$15),-1)
This is using the Array built from the Id statement to find the smallest value greater than or equal to the Look up value from test data.
=INDEX($C$1:$C$15,MATCH(G2,IF($A$1:$A$15=F2,$B$1:$B$15),-1))
Once it is all together the 'INDEX' will return the value in Column C that is at the same position as the matched value.
UPDATE: If you are looking for what tigeravatar's Answer returns then here is another VBA function that will return all values:
Sub GetValues()
With Application
.ScreenUpdating = False
.EnableEvents = False
.Calculation = xlCalculationManual
End With
Dim strMetalName As String: strMetalName = [E3]
Dim dbMinimumValue As Double: dbMinimumValue = [F3]
Range("G3:G" & Rows.Count).ClearContents
With Range("TABL")
.AutoFilter Field:=1, Criteria1:=strMetalName
.AutoFilter Field:=2, Criteria1:=">=" & dbMinimumValue, Operator:=xlAnd
Range("C2", [C2].End(xlDown)).Copy [G3]
.AutoFilter
End With
With Application
.ScreenUpdating = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
End With
End Sub
For me his took 5-7 minutes to run while this took 1.5 seconds, where my first answer returns the single row containing the closest matching result this sub will return ALL values greater then or equal too.
If your data is sorted on column 2 within column 1 then the SpeedTools Filter.Ifs function would be considerable faster than your formula (at least 50 times faster)
=FILTER.IFS(2,$A$1:$C$325779,3,1,E3,2,">" & F3)
Disclaimer: I am the author of SpeedTools which is a commercial Excel addin product. You can download a full trial version from: http://www.decisionmodels.com/FastExcelV3SpeedTools.htm
You may need to adjust where the output goes (it assumes that the results should be output in cell G3 and down), but this should run pretty quickly:
Sub subFindValue()
Dim rngFound As Range
Dim arrResults() As Variant
Dim varFind As Variant
Dim dCompare As Double
Dim ResultIndex As Long
Dim strFirst As String
varFind = Range("E3").Text
dCompare = Range("F3").Value2
Range("G3:G" & Rows.Count).ClearContents
With Range("TABL").Resize(, 1)
Set rngFound = .Find(varFind, .Cells(.Cells.Count), xlValues, xlWhole)
If Not rngFound Is Nothing Then
ReDim arrResults(1 To WorksheetFunction.CountIf(.Cells, varFind), 1 To 1)
strFirst = rngFound.Address
Do
If rngFound.Offset(, 1).Value > dCompare Then
ResultIndex = ResultIndex + 1
arrResults(ResultIndex, 1) = rngFound.Offset(, 2).Text
End If
Set rngFound = .Find(varFind, rngFound, xlValues, xlWhole)
Loop While rngFound.Address <> strFirst
End If
End With
If ResultIndex > 0 Then Range("G3").Resize(ResultIndex).Value = arrResults
End Sub

Resources