Removing duplicates based on their occurrence - excel

I would like to check a certain column (W) for duplicates (the number of occurrences is stored in another column, AZ) and then delete rows as follows:
If a value is found exactly two times in the column, delete only one of the rows containing it.
If a value is found more than two times in the column, delete all rows containing it.
My code works quite well, but sometimes it doesn't delete all the duplicates it should. Any ideas for improvement?
EDIT: The updated code works really well, except that it always misses one duplicate and leaves it undeleted.
fin = ws.UsedRange.Rows.count
For i = 2 To fin
ws.Range("AZ" & i).value = Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & i))
Next i
For j = fin To 2 Step -1
If ws.Range("AZ" & j).value > 2 Then
ws.Range("AZ" & j).EntireRow.Delete
fin = ws.UsedRange.Rows.count
ElseIf ws.Range("AZ" & j).value = 2 Then
Set rng = Range("W:W").Find(Range("W" & j).value, , xlValues, xlWhole, , xlNext)
rngRow = rng.Row
If rngRow <> j Then
ws.Range("AZ" & rngRow) = "1"
ws.Range("AZ" & j).EntireRow.Delete
fin = ws.UsedRange.Rows.count
Else
MsgBox "Error at row " & rngRow
End If
End If
Next j

If speed is an issue, here is a method that should be faster, as it creates a collection of rows to be deleted, then deletes them. Since everything, except for the actual row deletion, is done in VBA, there are far fewer calls back and forth to the worksheet.
The routine could be sped up as noted in the inline comments.
If it is still too slow, depending on the size of the worksheet, it might be feasible to read the entire worksheet into a VBA Array; test for duplicates; write back the results to a new array and write that out to the worksheet. (If your worksheet is too large, this method might run out of memory, though).
In any event, we need both a Class Module which YOU must rename cPhrases, as well as a Regular Module
Class Module
Option Explicit
Private pPhrase As String
Private pCount As Long
Private pRowNums As Collection
Public Property Get Phrase() As String
Phrase = pPhrase
End Property
Public Property Let Phrase(Value As String)
pPhrase = Value
End Property
Public Property Get Count() As Long
Count = pCount
End Property
Public Property Let Count(Value As Long)
pCount = Value
End Property
Public Property Get RowNums() As Collection
Set RowNums = pRowNums
End Property
Public Function ADDRowNum(Value As Long)
pRowNums.Add Value
End Function
Private Sub Class_Initialize()
Set pRowNums = New Collection
End Sub
Regular Module
Option Explicit
Sub RemoveDuplicateRows()
Dim wsSrc As Worksheet
Dim vSrc As Variant
Dim CP As cPhrases, colP As Collection, colRowNums As Collection
Dim I As Long, K As Long
Dim R As Range
'Data worksheet
Set wsSrc = Worksheets("sheet1")
'Read original data into VBA array
With wsSrc
vSrc = .Range(.Cells(1, "W"), .Cells(.Rows.Count, "W").End(xlUp))
End With
'Collect list of items, counts and row numbers to delete
'Collection object will --> error when trying to add
' duplicate key. Use that error to increment the count
Set colP = New Collection
On Error Resume Next
For I = 2 To UBound(vSrc, 1)
Set CP = New cPhrases
With CP
.Phrase = vSrc(I, 1)
.Count = 1
.ADDRowNum I
colP.Add CP, CStr(.Phrase)
Select Case Err.Number
Case 457 'duplicate
With colP(CStr(.Phrase))
.Count = .Count + 1
.ADDRowNum I
End With
Err.Clear
Case Is <> 0 'some other error. Stop to debug
Debug.Print "Error: " & Err.Number, Err.Description
Stop
End Select
End With
Next I
On Error GoTo 0
'Rows to be deleted
Set colRowNums = New Collection
For I = 1 To colP.Count
With colP(I)
Select Case .Count
Case 2
colRowNums.Add .RowNums(2)
Case Is > 2
For K = 1 To .RowNums.Count
colRowNums.Add .RowNums(K)
Next K
End Select
End With
Next I
'Reverse sort the collection of row numbers
'For speed, if necessary, could use
' faster sort routine
RevCollBubbleSort colRowNums
'Delete Rows
'For speed, could create Unions of up to 30 rows at a time
Application.ScreenUpdating = False
With wsSrc
For I = 1 To colRowNums.Count
.Rows(colRowNums(I)).Delete
Next I
End With
Application.ScreenUpdating = True
End Sub
'Could use faster sort routine if necessary
Sub RevCollBubbleSort(TempCol As Collection)
Dim I As Long
Dim NoExchanges As Boolean
' Loop until no more "exchanges" are made.
Do
NoExchanges = True
' Loop through each element in the array.
For I = 1 To TempCol.Count - 1
' If the element is less than the element
' following it, exchange the two elements.
If TempCol(I) < TempCol(I + 1) Then
NoExchanges = False
TempCol.Add TempCol(I), after:=I + 1
TempCol.Remove I
End If
Next I
Loop While Not (NoExchanges)
End Sub
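The array-based alternative mentioned in this answer (read the sheet into a VBA array, test for duplicates, write the survivors back) could look roughly like the sketch below. It is not the answer's own code. The sheet name "Sheet1", the header in row 1, the key column W and the name RemoveDuplicatesViaArray are assumptions, and it uses a late-bound Scripting.Dictionary, which is available on Windows only. Formulas in the processed block are replaced by their values, so try it on a copy first.

```vba
Option Explicit

' Sketch only. Assumptions: sheet "Sheet1", header in row 1, key column W,
' no error values in column W. Formulas in the block become plain values.
Sub RemoveDuplicatesViaArray()
    Const KEY_COL As Long = 23 'column W
    Dim ws As Worksheet, rng As Range
    Dim vSrc As Variant, vOut() As Variant
    Dim dCount As Object, dSeen As Object
    Dim lastRow As Long, lastCol As Long
    Dim r As Long, c As Long, n As Long
    Dim sKey As String, keepRow As Boolean

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    lastRow = ws.Cells(ws.Rows.Count, KEY_COL).End(xlUp).Row
    If lastRow < 2 Then Exit Sub 'header only, nothing to do
    With ws.UsedRange
        lastCol = .Column + .Columns.Count - 1
    End With
    If lastCol < KEY_COL Then lastCol = KEY_COL

    'One read from the sheet
    Set rng = ws.Range(ws.Cells(1, 1), ws.Cells(lastRow, lastCol))
    vSrc = rng.Value

    Set dCount = CreateObject("Scripting.Dictionary")
    Set dSeen = CreateObject("Scripting.Dictionary")
    dCount.CompareMode = vbTextCompare 'COUNTIF ignores case as well
    dSeen.CompareMode = vbTextCompare

    'Pass 1: count how often each key occurs
    For r = 2 To UBound(vSrc, 1)
        sKey = CStr(vSrc(r, KEY_COL))
        dCount(sKey) = dCount(sKey) + 1
    Next r

    'Pass 2: copy the rows to keep into the output array
    ReDim vOut(1 To UBound(vSrc, 1), 1 To UBound(vSrc, 2))
    For r = 1 To UBound(vSrc, 1)
        keepRow = True 'header and blank keys are always kept
        If r > 1 Then
            sKey = CStr(vSrc(r, KEY_COL))
            If Len(sKey) > 0 Then
                Select Case dCount(sKey)
                    Case 2 'keep the first occurrence, drop the second
                        keepRow = Not dSeen.Exists(sKey)
                        dSeen(sKey) = True
                    Case Is > 2 'drop every occurrence
                        keepRow = False
                End Select
            End If
        End If
        If keepRow Then
            n = n + 1
            For c = 1 To UBound(vSrc, 2)
                vOut(n, c) = vSrc(r, c)
            Next c
        End If
    Next r

    'One write back; unused trailing rows of vOut are Empty and clear the rest
    rng.Value = vOut
End Sub
```

Because the sheet is touched only twice (one read, one write), no sorting of row numbers and no row-by-row deletion is needed. The trade-off is that cell formatting stays where it was while the values move up.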

No need to use that inefficient second loop in the second section; just use a live count, like so:
fin = ws.UsedRange.Rows.count
For i = 2 To fin
ws.Range("AZ" & i).value = Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & i))
Next i
For j = fin To 2 Step -1
If ws.Range("AZ" & j).value > 2 OR Application.WorksheetFunction.CountIf(ws.Range("W2:W" & fin), ws.Range("W" & j)) = 2 Then
ws.Range("AZ" & j).EntireRow.Delete
End If
Next j

While your logic is basically sound, the method is not the most efficient. The AutoFilter Method can quickly remove all counts greater than 2 and the Range.RemoveDuplicates¹ method can subsequently make quick work of removing one of the rows that still contain duplicate values in column W.
Dim r As Long, c As Long
With ws
If .AutoFilterMode Then .AutoFilterMode = False
r = .Cells.SpecialCells(xlLastCell).Row
c = Application.Max(52, .Cells.SpecialCells(xlLastCell).Column)
With .Range("A1", .Cells(r, c)) '.UsedRange
With .Columns(52)
If IsEmpty(.Cells(1, 1)) Then .Cells(1, 1) = "count"
With .Resize(.Rows.Count - 1, 1).Offset(1, 0)
.Cells.FormulaR1C1 = "=COUNTIF(C[-29], RC[-29])"
.Cells = .Cells.Value
End With
.AutoFilter field:=1, Criteria1:=">2"
With .Resize(.Rows.Count - 1, 1).Offset(1, 0)
If CBool(Application.Subtotal(103, .Cells)) Then
.SpecialCells(xlCellTypeVisible).EntireRow.Delete
End If
End With
.AutoFilter
End With
.RemoveDuplicates Columns:=23, Header:=xlYes
End With
End With
When you rewrite the count values in column AZ, you are likely going to rewrite 3 counts to 2, etc.
¹ The Range.RemoveDuplicates method removes duplicate rows from the bottom up.

Related

Updating the column based on Unique value in one col & max repeated values in another col

I am trying to convert the data based on the most frequently repeated values.
I have truck numbers in column A and truck types in column B.
For each unique truck number, the truck type should be the same (this is the expected result).
This can be achieved by counting the occurrences of each truck type for a given truck number and updating every row for that truck number with the most frequent truck type.
If two truck types occur equally often, the first one encountered should be used.
There are thousands of rows to be updated like this. The attached image should make this easier to understand; the expected result is in column C.
I have googled a lot, but I was unable to find a relevant solution.
Please help.
You do not say anything...
Please, test the next code. It works with assumption that the columns are sorted as we can see in the picture. It is very fast, since the result is put in an array and dropped on the sheet at once:
Sub findMaxCountVehType_Array()
Dim sh As Worksheet, lastRow As Long, rngVeh As Range, rngTemp As Range, arrFin As Variant
Dim i As Long, j As Long, w As Long, count As Long, maxCount As Long, ar As Long, maxStr As String
Set sh = ActiveSheet 'use here your sheet
lastRow = sh.Range("A" & Rows.count).End(xlUp).row
Set rngVeh = sh.Range("A2:C" & lastRow)
ReDim arrFin(1 To lastRow, 1 To 1)
arrFin(1, 1) = "Result": ar = 1
For i = 2 To lastRow
If sh.Range("A" & i).Value = sh.Range("A" & i + 1).Value Then
For j = i To i + 1000 'create a range of type cells for the same vehicle no (upper bound based on i, not on the stale j)
If sh.Range("A" & j).Value = sh.Range("A" & i).Value Then
If rngTemp Is Nothing Then
Set rngTemp = sh.Range("B" & j)
Else
Set rngTemp = Union(rngTemp, sh.Range("B" & j))
End If
Else
Exit For
End If
Next j
If rngTemp Is Nothing Then
ar = ar + 1: arrFin(ar, 1) = sh.Range("B" & i)
Else
For w = 1 To rngTemp.Cells.count 'determine the max occurrences string
count = WorksheetFunction.CountIf(rngTemp, rngTemp.Cells(w, 1).Value)
If count > maxCount Then maxCount = count: maxStr = rngTemp.Cells(w, 1).Value
Next
For w = 1 To rngTemp.Cells.count
ar = ar + 1: arrFin(ar, 1) = maxStr 'fill the max count in the array
Next
End If
Set rngTemp = Nothing: maxCount = 0: count = 0 'reinitialize variables
i = i + w - 2 'move the iteration to the following vehicle
Else
ar = ar + 1: arrFin(ar, 1) = sh.Range("B" & i)
End If
Next i
'drop the result array at once
sh.Range("C1").Resize(UBound(arrFin, 1), UBound(arrFin, 2)).Value = arrFin
End Sub
Here is a VBA routine that uses:
A class object which has
key:= Vehicle number
item:= dictionary of associated vehicle types
key:= vehicle type
item:= count of the vehicle types
After collecting the information, we merely need to cycle through the dictionary and extract, for any given vehicle ID, the vehicle type that has the largest count.
This routine, since it works entirely with VBA arrays, should run pretty fast, even with large amounts of data.
Also, with this method, no sorting is required.
ASSUMES the data starts in cell A1 (could be changed if necessary)
ASSUMES results are as you show with Desired Result in column C
Be sure to set a reference (Tools/References) to Microsoft Scripting Runtime
Class Module (rename this module cVehicle)
Option Explicit
Private pVehicleType As String
Private pVehicleTypes As Dictionary
Public Property Get VehicleType() As String
VehicleType = pVehicleType
End Property
Public Property Let VehicleType(Value As String)
pVehicleType = Value
End Property
Public Property Get VehicleTypes() As Dictionary
Set VehicleTypes = pVehicleTypes
End Property
Public Function addVehicleTypesItem(Value)
If pVehicleTypes.Exists(Value) Then
pVehicleTypes(Value) = pVehicleTypes(Value) + 1
Else
pVehicleTypes.Add Key:=Value, Item:=1
End If
End Function
Private Sub Class_Initialize()
Set pVehicleTypes = New Dictionary
pVehicleTypes.CompareMode = TextCompare
End Sub
Regular Module
'Set Reference to Microsoft Scripting Runtime
Option Explicit
Sub vehicle()
Dim dV As Dictionary, cV As cVehicle
Dim wsData As Worksheet, vData As Variant, rRes As Range
Dim V As Variant, I As Long, sKey As String, cKey As String, Cnt As Long
'set data worksheet
'read data into vba array
Set wsData = Worksheets("Sheet3")
With wsData
'add extra column for the "desired results"
vData = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=3)
Set rRes = .Cells(1, 1)
End With
'loop through the data and count the types
'no sorting necessary
Set dV = New Dictionary
For I = 2 To UBound(vData, 1)
Set cV = New cVehicle
With cV
sKey = vData(I, 1)
.VehicleType = vData(I, 2)
If Not dV.Exists(sKey) Then
.addVehicleTypesItem .VehicleType
dV.Add sKey, cV
Else
dV(sKey).addVehicleTypesItem .VehicleType
End If
End With
Next I
'Output the data
I = 1
'Header
vData(I, 3) = "Desired Result"
'Data
For I = 2 To UBound(vData, 1)
sKey = vData(I, 1)
With dV(sKey)
'which type has the highest count
Cnt = 0
For Each V In .VehicleTypes.Keys
If .VehicleTypes(V) > Cnt Then
Cnt = .VehicleTypes(V)
cKey = V
End If
Next V
vData(I, 3) = cKey
End With
Next I
'write the results
Set rRes = rRes.Resize(UBound(vData, 1), UBound(vData, 2))
rRes = vData
End Sub

Macro to Count Filter Distinct unique Value

I have a table like this. I have to use a macro because the table changes every day (it comes from SSAS), so I use a macro to filter it automatically.
I am able to sum Amount for rows with the same VendorName, PONumber and Date into column E (Subtotal), and then filter to show only subtotals > 500.
I want to show only the rows where column E is > 500, and to pop up a message with the number of unique PO numbers in column B (counting visible rows only).
I am stuck on how to count only the visible unique PO numbers and show the count in a pop-up message.
This is my macro:
Sub FilterCOunt_Click()
Dim Condition As Variant
Dim AVal As Variant
Dim LastRow As Long
Dim Hide, popup As Long
Dim message As String
Dim sht As Worksheet
'----------------------------
Dim dictionary As Object
Set dictionary = CreateObject("scripting.dictionary")
'---------------------------
Application.ScreenUpdating = False
Application.EnableEvents = False
Application.AskToUpdateLinks = False
Application.DisplayAlerts = False
Application.Calculation = xlAutomatic
Application.ScreenUpdating = False
Application.StatusBar = False
'------------------
Columns.EntireColumn.Hidden = False
Rows.EntireRow.Hidden = False
Columns("E:Z").EntireColumn.Delete
Range("E:Z").EntireColumn.Insert
Range("E1").Value = "Sub Total >500 "
Set sht = ActiveSheet
LastRow = sht.Range("B" & Rows.Count).End(xlUp).Row
'-------------------
For i = 2 To LastRow ' with last row count =SUMIFS(I:I,A:A,A8,B:B,B8,C:C,C8)
AVal = "A" & i
BVal = "B" & i
CVal = "C" & i
Worksheets("Sheet3").Range("E" & i).Formula = "=SUMIFS(D:D,A:A," & AVal & ",B:B," & BVal & ",C:C," & CVal & ")"
Next i
With sht.Range("E1:E" & LastRow)
.AutoFilter
.AutoFilter field:=1, Criteria1:=">=500"
End With
'----------Count Pop UP
Dim CountPO As Long
Range("G1").FormulaArray = "=SUM(IF(FREQUENCY(IF(SUBTOTAL(3,OFFSET(B2,ROW(B2:B22)-ROW(B2),1)),IF(B2:B22<>"",MATCH(""&B2:B22,B2:B22&"",0))),ROW(B2:B22)- ROW(B2)+1),1))"
MsgBox "We Found " & CountPO & " PO Open(s)", _
vbInformation, "PO Found"
End Sub
and this is the formula to count it
{=SUM(IF(FREQUENCY(IF(SUBTOTAL(3,OFFSET(B2,ROW(B2:B22)-ROW(B2),1)),IF(B2:B22<>"",MATCH("~"&B2:B22,B2:B22&"",0))),ROW(B2:B22)-ROW(B2)+1),1))}
If you are pulling from a database via SSAS, you can use Power Query to link your SSAS data model to Excel, and from there add a calculated measure in DAX using DistinctCount:
Count:=Calculate(DistinctCount(TableName[PONumber]),TableName[Amount]>500)
Alternatively, if you want full insight into this issue, you can add a measured column and then use Power Pivot to filter for your criteria live whenever the data model refreshes, which removes the need for VBA entirely.
Incidentally, it is worth remembering that VBA is the sledgehammer of solutions; please try the Data Model tools before you reach for a macro. VBA is a full application programming language, and many IT security systems disable it because it opens the system up to malware: VBA code can change almost any file or program, including deleting system files.
Meanwhile, a fixed Data Model in a locked file that requires user access behind LAN security is easily more secure than allowing open programmatic access to your computer.
This is an alternative formula (which doesn't require any filtering)
=SUM(--(FREQUENCY(IF(B2:B20>500,MATCH(A2:A20,A2:A20,0)),ROW(A2:A20)-ROW(A2)+1)>0))
It's an array formula, so from VBA it has to be assigned through FormulaArray:
Range("E1").FormulaArray = "=SUM(--(FREQUENCY(IF(B2:B20>500,MATCH(A2:A20,A2:A20,0)),ROW(A2:A20)-ROW(A2)+1)>0))"
A formula for your cell E2, which is not an array formula, is
=SUMPRODUCT((B2=B$2:B$23)*(A2=A$2:A$23)*(D$2:D$23))
Copy it down, as usual.
See here for why not using an array formula (if you have an alternative).
I am not certain this solves your question, as I did not fully understand it.
You can use the following code. I have implemented Collection to get the unique count.
This will count the unique rows in B column where value in E column > 500.
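Private Function GetUniqueCount() As Long 'a Sub cannot return a value, so this must be a Function
Dim Test As New Collection
Dim i As Long, Value As Variant, check As Boolean
For i = 2 To 6 'Replace 6 with last row(without filtration)
Value = Cells(i, "B").Value
'Pass the key as a String; a numeric argument would be treated as an index
check = Contains(Test, CStr(Value))
If Cells(i, "E").Value > 500 And Not check And Len(Value) > 0 Then
Test.Add Value, CStr(Value)
End If
Next i
GetUniqueCount = Test.Count
End Function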
'Function to check if the value exists in Collection or not
Public Function Contains(col As Collection, key As Variant) As Boolean
Dim obj As Variant
On Error GoTo err
Contains = True
obj = col(key)
Exit Function
err:
Contains = False
End Function
Step 1: Paste my code into a new module.
Step 2: Bind your button to the macro named "filterAndCount".
Step 3: Click the button and rejoice :-)
Code description:
1) The code loops through all the rows in the table.
2) First it checks whether the Sub Total is below the limit (500).
3) If it is below the limit, it hides the row and moves on to the next row.
4) Otherwise it checks whether the PO Number already exists in the array of values.
5) If it does not exist, the value is added to the array.
6) When all rows have been processed, only rows with a Sub Total at or above the limit are visible.
7) Only the unique and visible PO Numbers have been added to the array.
8) The number of values in the array is displayed in a message box.
Dim wb As Workbook
Dim ws As Worksheet
Dim i As Double
Dim n As Double
Dim subTotalLimit As Double
Dim arr() As String
Sub filterAndCount()
Set wb = ThisWorkbook
Set ws = wb.ActiveSheet
i = 2
subTotalLimit = 500
n = 0
ReDim arr(0 To 0) As String
arr(0) = 0
ws.Columns("E:Z").EntireColumn.Delete
ws.Range("E:Z").EntireColumn.Insert
ws.Range("E1").Value = "Sub Total >500 "
Do While ws.Range("B" & i) <> ""
ws.Range("E" & i).Formula = "=SUMIFS(D:D,A:A,A" & i & ",B:B,B" & i & ",C:C,C" & i & ")"
If ws.Range("E" & i) < subTotalLimit Then
ws.Range("B" & i).EntireRow.Hidden = True
Else
If Not IsNumeric(Application.Match(Range("B" & i).Text, arr(), 0)) Then
arr(n) = Range("B" & i).Value
n = UBound(arr) + 1
ReDim Preserve arr(0 To n) As String
arr(n) = 0
End If
End If
i = i + 1
Loop
MsgBox UBound(arr)
End Sub
Use 2 Dictionary Objects, one for totals and one for unique PO's
Sub filterCOunt()
Const LIMIT = 500
Dim wb As Workbook, ws As Worksheet
Dim iRow As Long, iLastRow As Long, amount As Single
Dim sVendor As String, sPO As String, msg As String, sKey As String
Dim dictPO As Object, dictTotal As Object
Set dictPO = CreateObject("Scripting.Dictionary")
Set dictTotal = CreateObject("Scripting.Dictionary")
Set wb = ThisWorkbook
Set ws = ActiveSheet
iLastRow = ws.Range("B" & Rows.Count).End(xlUp).Row
' first pass to total by po and vendor
For iRow = 2 To iLastRow
sVendor = Trim(ws.Cells(iRow, 1))
sPO = Trim(ws.Cells(iRow, 2))
amount = CSng(ws.Cells(iRow, 4))
sKey = sVendor & "_" & sPO
' sub total
If dictTotal.exists(sKey) Then
dictTotal(sKey) = dictTotal(sKey) + amount
Else
dictTotal.Add sKey, amount
End If
Next
' second pass for PO numbers
For iRow = 2 To iLastRow
sVendor = Trim(ws.Cells(iRow, 1))
sPO = Trim(ws.Cells(iRow, 2))
sKey = sVendor & "_" & sPO
' sub total
ws.Cells(iRow, 5) = dictTotal(sKey)
If dictTotal(sKey) > LIMIT Then
If Not dictPO.exists(sPO) Then
dictPO.Add sPO, iRow
End If
End If
Next
' filter
With ws.Range("E1:E" & iLastRow)
.AutoFilter
.AutoFilter field:=1, Criteria1:=">=" & LIMIT
End With
msg = "No of open PO's = " & dictPO.Count
MsgBox msg, vbInformation
End Sub
First, for the code in your Count Pop UP section to work, change every "" inside the formula string to """" (within a VBA string literal, each double quote must be doubled).
Second, to show the number of unique PO numbers in the pop-up message, read the value from cell G1 or, more safely, use Evaluate to get the result of the expression.
Your code will probably work now:
'Dim CountPO As Long
Range("G1").FormulaArray = "=SUM(IF(FREQUENCY(IF(SUBTOTAL(3,OFFSET(B2,ROW(B2:B22)-ROW(B2),1)),IF(B2:B22<>"""",MATCH(""""&B2:B22,B2:B22&"""",0))),ROW(B2:B22)- ROW(B2)+1),1))"
MsgBox "We Found " & [g1].Value2 & " PO Open(s)", vbInformation, "PO Found"
However, your formula counts all unique values, including those below 500, and it is quite long. You can replace it with a shorter formula, as in the following code:
Dim formula_string As String
formula_string = "=SUMPRODUCT((B2:B22>3)*(C2:C22<>"""")/COUNTIF(B2:B22,B2:B22&""""))"
MsgBox "We Found " & Application.Evaluate(formula_string) & " PO Open(s)", vbInformation, "PO Found"
Hope it helps!

How to extract values of multiple listboxes on Excel sheet?

I have a userform with multiple checkboxes and listboxes, where each checkbox controls the values of one listbox each.
After clicking on 'Next' the userform inputs the selected values of each listbox on the Excel sheet. I am able to achieve this only for one pair of checkbox and listbox at a time. But I want the results of each shortlisted items one after the other.
Private Sub cmdFDB_Next_Click()
Dim ColCount As Integer, lastrow As Integer
Dim lastrow1 As Integer
Dim Data As Integer
Dim i As Integer
lastrow = Worksheets("Model Portfolio").Cells(Rows.Count, 2).End(xlUp).Row
With Worksheets("Model Portfolio").Cells(lastrow, 2)
.Offset(2, 0).Value = "Fixed Deposits and Bonds"
.Offset(2, 0).Font.Bold = True
.Offset(2, 0).Font.Size = 12
For i = 2 To lastrow
If Me.chkGB.Value = True Then
.Offset(3, 0).Value = "Government Bonds"
.Offset(3, 0).Font.Bold = True
.Offset(3, 2).Value = Format(Me.txtGB.Value, "Currency")
lastrow1 = lastrow + 4
ColCount = 2
With Me.lbxGB
'loop through each listbox item to see if they are selected
For Data = 0 To .ListCount - 1
If .Selected(Data) = True Then
Cells(lastrow1, ColCount).Value = .List(Data)
lastrow1 = lastrow1 + 1
End If
Next Data
End With
End If
If Me.chkCFD.Value = True Then
.Offset(3, 0).Value = "Corporate Fixed Deposits"
.Offset(3, 0).Font.Bold = True
.Offset(3, 2).Value = Format(Me.txtCFD.Value, "Currency")
lastrow1 = lastrow + 4
ColCount = 2
With Me.lbxCFD
'loop through each listbox item to see if they are selected
For Data = 0 To .ListCount - 1
If .Selected(Data) = True Then
Cells(lastrow1, ColCount).Value = .List(Data)
lastrow1 = lastrow1 + 1
End If
Next Data
End With
End If
If Me.chkTSB.Value = True Then
.Offset(3, 0).Value = "Tax Saving Bonds"
.Offset(3, 0).Font.Bold = True
.Offset(3, 2).Value = Format(Me.txtTSB.Value, "Currency")
lastrow1 = lastrow + 4
ColCount = 2
With Me.lbxTSB
'loop through each listbox item to see if they are selected
For Data = 0 To .ListCount - 1
If .Selected(Data) = True Then
Cells(lastrow1, ColCount).Value = .List(Data)
lastrow1 = lastrow1 + 1
End If
Next Data
End With
End If
Next i
End With
With MultiPage1
.Value = (.Value + 1) Mod (.Pages.Count)
End With
End Sub
Extract selected listbox items to sheet
As you aren't consistent in your row numbering (a never-changing lastrow mixed with additional offsets and increments), you are losing track of the actual row numbers.
It's also better practice to use a Sub procedure (here: WriteItems) for repetitive calls and to redefine your lastrow (here: StartRow) each time. Furthermore, I demonstrate how to extract a whole listbox "row" using the Application.Index() function.
Further hint: Instead of direct formatting, consider using conditional formatting (CF), as you then needn't clear old formats in deleted cells (you are sure to find a lot of examples at SO :-)
BTW I'd prefer to avoid control names containing an underscore "_", as the underscore has some relevance in class implementations.
Main event
Private Sub cmdFDB_Next_Click()
'[0] Define data sheet
Const SHEETNAME As String = "Model Portfolio"
Dim ws As Worksheet
Set ws = ThisWorkbook.Worksheets(SHEETNAME)
'[1] Define abbreviations to identify securities controls
Dim Abbreviations, abbr
Abbreviations = Array("", "GB", "CFD", "TSB") ' first item is EMPTY!
'[2] write data for each security type
Dim OKAY As Boolean
For Each abbr In Abbreviations
'[2a] check
If abbr = vbNullString Then ' Main Title
OKAY = True
ElseIf Me.Controls("chk" & abbr) Then ' individual security checked
OKAY = True
Else
OKAY = False
End If
'==================================
'[2b] write selected data in blocks
'----------------------------------
If OKAY Then WriteItems abbr, ws ' call sub procedure
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Next
End Sub
Sub procedure WriteItems
Private Sub WriteItems(ByVal abbrev As String, ws As Worksheet)
'Purpose: write caption and selected listbox items to sheet
'Note: called by cmdFDB_Next_Click()
Const EMPTYROWS As Long = 1 ' << change to needed space
Const LBXPREFIX As String = "lbx" ' << change to individual listbox prefix
Const TITLE As String = "Fixed Deposits and Bonds"
With ws
'[0] Define new startrow
Dim StartRow As Long
StartRow = .Cells(Rows.Count, 2).End(xlUp).Row + EMPTYROWS + 1
'[1] Write caption
ws.Cells(StartRow, 2) = getTitle(abbrev) ' function call, see below
If abbrev = vbNullString Then Exit Sub ' 1st array term writes main caption only
'other stuff (e.g. formatting of title above)
'...
'[2] Write data to worksheet
With Me.Controls(LBXPREFIX & abbrev)
Dim i As Long, ii As Long, temp As Variant
For i = 1 To .ListCount
If .Selected(i - 1) = True Then
ii = ii + 1
ws.Cells(StartRow + ii, .ColumnCount).Resize(1, 2).Value = Application.Index(.List, i, 0)
End If
Next i
End With
End With
End Sub
Further note: The Application.Index function returns a whole listbox "row" when zero is passed as the column argument (.., 0).
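As a small, standalone illustration of that row extraction, the sketch below applies Application.Index to a plain 2-D array instead of a listbox. IndexRowDemo and its sample values are hypothetical and only stand in for ListBox.List.

```vba
Option Explicit

' Sketch only: IndexRowDemo and its sample values are hypothetical.
Sub IndexRowDemo()
    Dim items(0 To 1, 0 To 1) As Variant 'zero-based, like ListBox.List
    Dim rowVals As Variant
    Dim c As Long

    items(0, 0) = "Bond A": items(0, 1) = 100
    items(1, 0) = "Bond B": items(1, 1) = 200

    'Row numbers passed to Index are 1-based, so 2 means the second row;
    '0 as the column argument asks for every column of that row
    rowVals = Application.Index(items, 2, 0)

    'Print each value of the extracted row to the Immediate window
    For c = LBound(rowVals) To UBound(rowVals)
        Debug.Print rowVals(c)
    Next c
End Sub
```

This is why WriteItems passes i (counting from 1) to Index while it tests .Selected(i - 1), which is zero-based.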
Helper function GetTitle()
Function getTitle(ByVal abbrev As String) As String
'Purpose: return full name/caption of security abbreviation
Select Case UCase(abbrev)
Case vbNullString
getTitle = "Fixed Deposits and Bonds"
Case "GB": getTitle = "Government Bonds"
Case "CFD": getTitle = "Corporate Fixed Deposits"
Case "TSB": getTitle = "Tax Saving Bonds"
Case Else: getTitle = "All Other"
End Select
End Function

excel vba step thru rows faster

The code below works 100%. It scans for a match in column B and copies and renames a group of cells when a match is found. However, there is a line For lRow = Sheets("HR-Calc").Cells(Cells.Rows.count, "b").End(xlUp).Row To 7 Step -1
where the Step -1 scans row by row from the bottom of the sheet until a match is found. It would be much easier if the step could be set to .End(xlUp) instead of -1. Searching every row is overkill; because of how the data is set up, .End(xlUp) would massively cut down the run time.
Is something like this possible?
Sub Fill_CB_Calc()
M_Start:
Application.ScreenUpdating = True
Sheets("summary").Activate
d_input = Application.InputBox("select first cell in data column", "Column Data Check", Default:="", Type:=8).Address(ReferenceStyle:=xlA1, RowAbsolute:=True, ColumnAbsolute:=False)
data_col = Left(d_input, InStr(2, d_input, "$") - 1)
data_row = Right(d_input, Len(d_input) - InStr(2, d_input, "$"))
Application.ScreenUpdating = False
Sheets("summary").Activate
Range(d_input).End(xlDown).Select
data_last = ActiveCell.Row
If IsEmpty(Range(data_col & data_row + 1)) = True Then
data_last = data_row
Else
End If
For j = data_row To data_last
CBtype = Sheets("summary").Range(data_col & j)
Sheets("HR-Calc").Activate
For lRow = Sheets("HR-Calc").Cells(Cells.Rows.count, "b").End(xlUp).Row To 7 Step -1
If Sheets("HR-Calc").Cells(lRow, "b") = CBtype Then
CBend = Sheets("HR-Calc").Range("C" & lRow).End(xlDown).Row + 1
Sheets("HR-Calc").Rows(lRow & ":" & CBend).Copy
CBstart = Sheets("HR-Calc").Range("c50000").End(xlUp).Row + 2
ActiveWindow.ScrollRow = CBstart - 8
Sheets("HR-Calc").Range("A" & CBstart).Insert Shift:=xlDown
CBold = Right(Range("c" & CBstart), Len(Range("C" & CBstart)) - 2)
box_name = Sheets("summary").Range(data_col & j).Offset(0, -10)
CBnew = Right(box_name, Len(box_name) - 2) & "-" ' <--this is custom and can be changed based on CB naming structure
If CBnew = "" Or vbCancel Then
End If
CBend2 = Range("c50000").End(xlUp).Row - 2
Range("C" & CBstart + 1 & ":" & "C" & CBend2).Select
Selection.Replace What:=CBold & "-", Replacement:=CBnew, LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
Range("C" & CBstart).FormulaR1C1 = "CB" & Left(CBnew, Len(CBnew) - 1)
GoTo M_Start2
Else
End If
Next lRow
M_Start2:
Next j
YN_result = MsgBox("Fill info for another block/inverter?", vbYesNo + vbExclamation)
If YN_result = vbYes Then GoTo M_Start
If YN_result = vbNo Then GoTo jumpout
jumpout:
' Sheets("summary").Range(d_input).Select
Application.ScreenUpdating = True
End Sub
I'm not sure if this will help but I've had a great performance increase with pulling the entire range you need to loop through into a variant array and then looping through the array. If I need to loop through large data sets, this method has worked out well.
Dim varArray as Variant
varArray = Range(....) 'set varArray to the range you're looping through
For y = 1 to uBound(varArray,1) 'loops through rows of the array
'code for each row here
'to loop through individual columns in that row, throw in another loop
For x = 1 to uBound(varArray, 2) 'loop through columns of array
'code here
Next x
Next y
You can also define the column indexes prior to executing the loop. Then you only need to pull the values you need directly in the loop.
'prior to executing the loop, define the column index of what you need to look at
Dim colRevenue as Integer
colRevenue = 5 'or a find function that searches for a header named "Revenue"
Dim varArray as Variant
varArray = Range(....) 'set varArray to the range you're looping through
For y = 1 to uBound(varArray,1) 'loops through rows of the array
tmpRevenue = CDbl(varArray(y, colRevenue))
Next y
Hope this helps.
Look at doing a .find from the bottom up.
Perform a FIND, within vba, from the bottom of a range up
That will eliminate the need to do the for loop from the last row to the first occurrence of the value you want to locate.
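A bottom-up Find could be sketched as follows. LastMatchRow is a hypothetical helper, not part of the question's code; column B and the first data row 7 are taken from the question's loop.

```vba
Option Explicit

' Sketch only: LastMatchRow is a hypothetical helper. Column B and the
' default first row 7 come from the loop in the question.
Function LastMatchRow(ws As Worksheet, ByVal lookFor As String, _
                      Optional ByVal firstRow As Long = 7) As Long
    Dim lastRow As Long
    Dim rng As Range, hit As Range

    lastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row
    If lastRow < firstRow Then Exit Function 'nothing to search, returns 0

    Set rng = ws.Range(ws.Cells(firstRow, "B"), ws.Cells(lastRow, "B"))

    'After:=first cell combined with xlPrevious makes Find wrap around,
    'so the search starts at the last cell and walks upwards
    Set hit = rng.Find(What:=lookFor, After:=rng.Cells(1, 1), _
                       LookIn:=xlValues, LookAt:=xlWhole, _
                       SearchOrder:=xlByRows, SearchDirection:=xlPrevious, _
                       MatchCase:=False)

    'Returns 0 when there is no match
    If Not hit Is Nothing Then LastMatchRow = hit.Row
End Function
```

In the question's macro, the inner For lRow loop could then become lRow = LastMatchRow(Sheets("HR-Calc"), CBtype), followed by a check for lRow > 0 before copying.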

Optimize Excel VBA Code

I have the following VBA code in Excel. Its goal is to remove a row if the given text is found, as well as the row directly below it. It needs to scan roughly 700k rows and is taking roughly an hour to do 100k rows. Does anyone see any possible optimizations?
Sub RemovePageHeaders()
Application.ScreenUpdating = False
Dim objRange As Range
Set objRange = Cells.Find("HeaderText")
While objRange <> ""
objRange.Offset(1, 0).Rows(1).EntireRow.Delete
objRange.Rows(1).EntireRow.Delete
Set objRange = Cells.Find("HeaderText")
Wend
MsgBox ("I'm done removing page headers!")
End Sub
Thanks in advance!
Try the following sub. It loops from the bottom-most row to the top, checking column 3 for "HeaderText". If that's found, it deletes the row and the one below it. On a C2D E8500 with 2 gigs of RAM it takes just over a minute per 100,000 rows on a sheet with 1 million rows.
Sub RemoveHeaders()
Dim i As Long
Application.ScreenUpdating = False
Debug.Print "Started: " & Now
For i = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
If ActiveSheet.Cells(i, 3) = "HeaderText" Then
ActiveSheet.Range(i & ":" & i + 1).EntireRow.Delete
End If
Next i
Application.ScreenUpdating = True
Debug.Print "Finished: " & Now
End Sub
EDIT
For a slightly ghetto but possibly much faster solution try this:
Change the constant in the below code to the number of the first column that's blank in every row. For example if your data takes up columns A-F, you want the constant to be 7 (column G).
Run the code, it will put the row number next to every entry. Should take around 30 seconds.
Sort the ENTIRE data by column C; this should take less than a minute.
Find "HeaderText" visually, select and delete all the rows.
Sort by your row-numbered column ("G" in my example).
Delete the row-numbered column (again, "G" in my example).
Sub NumberColumns()
Const BLANK_COLUMN = 7
Dim i As Long
For i = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
ActiveSheet.Cells(i, BLANK_COLUMN) = i
Next i
Debug.Print "done"
End Sub
Even if it doesn't fully answer the question, it may help any reader so...
There are several tips on the web about optimizing vba. In particular, you can do:
'turn off some Excel functionality so your code runs faster
'these two are especially very efficient
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
'use these if you really need to
Application.DisplayStatusBar = False
Application.EnableEvents = False 'is very efficient if you have ANY event associated with what your macro is going to do
'code goes here
'at the end, don't forget to restore the default behavior
'calculate the formulas
Application.Calculate
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
Application.DisplayStatusBar = True
Application.EnableEvents = True
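One thing the snippet above does not cover is what happens if the macro fails partway through: the settings stay switched off. A minimal sketch of one way to handle that, assuming a made-up routine name (RunWithFastSettings) and a placeholder where your own code goes. It also remembers the previous calculation mode instead of assuming it was automatic:

```vba
Sub RunWithFastSettings()   ' hypothetical name, for illustration only
    Dim prevCalc As XlCalculation
    Dim errNum As Long
    Dim errMsg As String

    ' Remember the current calculation mode so it can be put back as found
    prevCalc = Application.Calculation

    On Error GoTo CleanUp
    Application.ScreenUpdating = False
    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual

    ' ... your code goes here ...

CleanUp:
    ' Reached on both the normal path and the error path
    errNum = Err.Number
    errMsg = Err.Description
    Application.Calculation = prevCalc
    Application.EnableEvents = True
    Application.ScreenUpdating = True
    If errNum <> 0 Then MsgBox "Error " & errNum & ": " & errMsg
End Sub
```

Falling through into the CleanUp label is deliberate here, so the restore code exists in only one place.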
Putting this entry in a little late. It should be about 2X faster than the accepted solution. I timed it on my Windows XP / Excel 2003 machine with 1 GB of RAM.
Sub DeleteHeaderText()
Dim bUnion As Boolean
Dim d1 As Double
Dim l As Long
Dim rDelete As Range
Dim wks As Worksheet
Dim vData As Variant
Dim lEnd As Long
d1 = Timer
Application.EnableEvents = False
Application.ScreenUpdating = False
bUnion = False
Set wks = ActiveSheet
lEnd = ActiveSheet.UsedRange.Rows.Count
vData = wks.Range("C1:C" & lEnd).Value2
For l = 1 To lEnd
If vData(l, 1) = "HeaderText" Then
If bUnion Then
Set rDelete = Union(rDelete, wks.Range("A" & l, "A" & l + 1))
Else
Set rDelete = wks.Range("A" & l, "A" & l + 1)
bUnion = True
End If
l = l + 1
End If
Next l
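Debug.Print Timer() - d1
'delete all collected rows in one operation; skip if nothing matched
If Not rDelete Is Nothing Then rDelete.EntireRow.Delete
Debug.Print Timer() - d1
'restore the settings that were switched off at the start
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub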
I know this is late, but if I understand your problem, you are deleting rows based on "HeaderText" in column C. Since I couldn't look at your data, I created my own: 700,000 rows, with every 9th row containing the "HeaderText" string. The code deleted ~233k rows ("HeaderText" row + row before + row after) and ran in 2.2 seconds on my computer. Give it a try!
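'PtrSafe is required on 64-bit Office; the #Else branch keeps older versions working
#If VBA7 Then
Private Declare PtrSafe Function timeGetTime Lib "winmm.dll" () As Long
#Else
Private Declare Function timeGetTime Lib "winmm.dll" () As Long
#End If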
Sub DeleteHeaders()
Dim LastRow As Long
Dim I As Long
Dim WkSheet As Excel.Worksheet
Dim VArray As Variant
Dim NewArray() As String
Dim BooleanArray() As Boolean
Dim NewArrayCount As Long
Dim J As Long
Dim T As Double
Dim DeleteRowCount As Long
T = timeGetTime
With Application
.ScreenUpdating = False
.EnableEvents = False
.Calculation = xlCalculationManual
End With
Set WkSheet = ThisWorkbook.Sheets("Sheet1")
With WkSheet.UsedRange
LastRow = .Rows.Count
VArray = .Value
End With
ReDim BooleanArray(0 To UBound(VArray, 1) - 1), NewArray(UBound(VArray, 1) - 1, 0 To UBound(VArray, 2))
For I = 1 To UBound(VArray, 1)
If InStrB(1, VArray(I, 3), "HeaderText", vbBinaryCompare) <> 0 Then
BooleanArray(I - 1) = Not BooleanArray(I - 1)
BooleanArray(I) = Not BooleanArray(I)
BooleanArray(I + 1) = Not BooleanArray(I + 1)
End If
Next I
For I = LBound(BooleanArray, 1) To UBound(BooleanArray, 1)
If BooleanArray(I) = False Then
For J = LBound(VArray, 2) To UBound(VArray, 2)
NewArray(NewArrayCount, J - 1) = VArray(I + 1, J)
Next J
NewArrayCount = NewArrayCount + 1
Else
DeleteRowCount = DeleteRowCount + 1
End If
Next I
With WkSheet
.Cells.Delete
.Range("a1:c" & NewArrayCount).Value = NewArray
End With
With Application
.ScreenUpdating = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
End With
Erase NewArray, BooleanArray, VArray
MsgBox "Deleted " & DeleteRowCount & " rows." & vbNewLine & vbNewLine & _
"Run time: " & Round((timeGetTime - T) / 1000, 3) & " seconds.", vbOKOnly, "RunTime"
End Sub
Here's a solution that will run on 100k rows in about 5-20 seconds, depending on how many occurrences of 'HeaderText' you have. As you requested, it will delete both the row with HeaderText in the C column and the row directly above it.
Update:
As has been pointed out, this works on smaller data sets of up to about 100k rows, but on larger sets it really doesn't. Back to the drawing board :)
Sub DeleteHeaders()
Application.ScreenUpdating = False
Dim lastRow As Long
Dim varray As Variant
Dim i As Long
lastRow = Range("C" & Rows.Count).End(xlUp).Row
On Error Resume Next
varray = Range("C1:C" & lastRow).Value
For i = UBound(varray, 1) To 1 Step -1
If varray(i, 1) = "HeaderText" Then
Range("C" & i - 1, Range("C" & i)).EntireRow.Delete
i = i - 1
End If
Next
Application.ScreenUpdating = True
End Sub
How it works:
By dumping the entire C column into a variant array and testing the values in memory instead of reading the worksheet cell by cell, you get a major speed increase. The varray is laid out like (1, 1), (2, 1), (3, 1), with the first number being the row number, so all you have to do is loop through it backwards. The key is to delete both rows at the same time and then decrement i by one more, so the row that was just removed is not tested again.
The following is code lifted from a Bill Jelen book that is fantastic for this purpose.
Use a column (column A for my code) with some logic to determine whether a row should be deleted or not.
Use the following formula in all applicable cells in that column
=IF(test TRUE to hide, 1, "keep")
Now use the VBA below
Range("A1:A10000").SpecialCells(xlCellTypeFormulas, xlNumbers).EntireRow.Delete
This selects all rows with a number returned by the formula at once, which is exactly the rows you want to delete. No looping required!
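One caveat with the one-liner: SpecialCells raises run-time error 1004 ("No cells were found") when nothing matches, so inside a macro it is worth guarding the call. A small sketch, assuming the helper formulas sit in A1:A10000 as above; the routine name DeleteFlaggedRows is made up for illustration:

```vba
Sub DeleteFlaggedRows()   ' hypothetical name, for illustration only
    Dim rFlagged As Range

    ' SpecialCells errors out if no formula returns a number,
    ' so trap only that call and test the result afterwards
    On Error Resume Next
    Set rFlagged = Range("A1:A10000").SpecialCells(xlCellTypeFormulas, xlNumbers)
    On Error GoTo 0

    If Not rFlagged Is Nothing Then
        rFlagged.EntireRow.Delete
    End If
End Sub
```

Error trapping is switched back on right after the SpecialCells call so that a failure in the delete itself is not silently swallowed.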
I have a couple of scripts for this on my blog:
Sample One:
Sub DelBlankRows()
Range("D1:D" & Cells(Rows.Count, 2).End(xlUp).Row).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
End Sub
Sample Two:
Sub DeleteRowsWithSpecifiedData()
'Looks in Column D and requires Column IV to be clean
Columns(4).EntireColumn.Insert
With Range("D1:D" & ActiveSheet.UsedRange.Rows.Count)
.FormulaR1C1 = "=IF(RC[1]="""","""",IF(RC[1]=""Not Needed"",NA()))"
.Value = .Value
On Error Resume Next
.SpecialCells(xlCellTypeConstants, xlErrors).EntireRow.Delete
End With
On Error GoTo 0
Columns(4).EntireColumn.Delete
End Sub

Resources