Get top five students with highest marks - excel

I have searched and found a code that extracts the top five names with the highest marks. The code is OK and I can get the names and marks
Sub Test_GetTopFive()
GetTopFive Range("A2:B" & Cells(Rows.Count, 1).End(xlUp).Row)
End Sub
Sub GetTopFive(r As Range)
Dim v, t, i As Long
t = Application.WorksheetFunction.Aggregate(14, 6, r.Columns(2), 5)
v = r
For i = 1 To UBound(v, 1)
If Not IsError(v(i, 1)) Then
If v(i, 2) >= t Then
Debug.Print v(i, 1), v(i, 2)
End If
End If
Next i
End Sub
But the results in the immediate window are not sorted. I need to get the names with the highest marks first.

Have a go with the code below. The tricky part is you can't just sort arrays, So I instead have it loop the number of results you want, then for each of those it loops through the array to find the max value. Once found it prints it, then sets it's value to 0 to remove it from being looked at in the next result.
Sub Test_GetTopFive()
GetTopFive Range("A2:B" & Cells(Rows.Count, 1).End(xlUp).Row)
End Sub
Sub GetTopFive(r As Range)
Dim v, t, m, i As Long, j As Long, rw As Long
t = Application.WorksheetFunction.Aggregate(14, 6, r.Columns(2), 5)
m = t - 1
v = r
For i = 1 To 5
For j = 1 To UBound(v, 1)
If Not IsError(v(j, 2)) Then
If v(j, 2) >= t Then
If v(j, 2) > m Then
m = v(j, 2)
rw = j
End If
End If
End If
Next j
If rw > 0 Then
Debug.Print v(rw, 1), v(rw, 2)
v(rw, 2) = 0
m = t - 1
rw = 0
End If
Next i
End Sub

I don't understand why you are using VBA for that: in order to get the five larges values (e.g. from range A2:A10), I just type those five formulas (e.g. in range "C1:C5"):
=LARGE(A$2:A$10,1) 'in cell C1, there you get the largest value.
=LARGE(A$2:A$10,2) 'in cell C2, there you get the second largest value.
=LARGE(A$2:A$10,3) 'in cell C3, there you get the third largest value.
=LARGE(A$2:A$10,4) 'in cell C4, there you get the fourth largest value.
=LARGE(A$2:A$10,5) 'in cell C5, there you get the fifth largest value.

VBA Top Values
Some Issues
In this case, the WorksheetFunction.Aggregate function will raise an error e.g. if there are less than 5 numeric values. What to do in such a case?
How to resolve the ties? Pick the first appearing in the range?
Application.Max will return an error if there are error values.
Application.Max will not consider blanks as zeros.
Application.Match will return an error if there was no match.
What if there are negative numbers (ridiculous in this case)?
Option Explicit
Sub Test_DebugPrintTop()
Dim rg As Range: Set rg = Range("A2:B" & Cells(Rows.Count, 1).End(xlUp).Row)
DebugPrintTop rg, 5, False
End Sub
Sub DebugPrintTop( _
ByVal rg As Range, _
ByVal TopCount As Long, _
Optional ByVal IncludeBlanks As Boolean = False)
If rg Is Nothing Then
Debug.Print "No range."
Exit Sub
End If
If TopCount < 1 Then
Debug.Print "'TopCount' has to be a positive integer."
Exit Sub
End If
Dim sData As Variant: sData = rg.Resize(, 2).Value ' only 2 columns
Dim sData2 As Variant: sData2 = rg.Columns(2).Value ' 2nd column
Dim srCount As Long: srCount = UBound(sData, 1)
Dim r As Long
Dim srValue As Variant: srValue = Application.Max(sData2)
If IsError(srValue) Then
For r = 1 To srCount
' Check for error values and replace them with 'Empty' values.
If IsError(sData2(r, 1)) Then
sData(r, 2) = Empty
sData2(r, 1) = Empty
End If
Next r
End If
If IncludeBlanks Then
For r = 1 To srCount
' Check for blanks and replace them with zeros.
If Len(sData2(r, 1)) = 0 Then
sData(r, 2) = 0
sData2(r, 1) = 0
End If
Next r
End If
Dim srIndexes() As Long
Dim srIndex As Variant
Dim drCount As Long
For r = 1 To TopCount
srValue = Application.Max(sData2)
srIndex = Application.Match(srValue, sData2, 0)
If IsNumeric(srIndex) Then
drCount = drCount + 1
ReDim Preserve srIndexes(1 To drCount)
srIndexes(drCount) = srIndex
sData2(srIndex, 1) = Empty ' not 0
Else
Exit For
End If
Next r
If drCount = 0 Then
Debug.Print "No numbers."
Exit Sub
End If
For r = 1 To drCount
Debug.Print sData(srIndexes(r), 1), sData(srIndexes(r), 2)
Next r
' An idea to make e.g. the 'GetTop' function from it.
' Dim dData As Variant: ReDim dData(1 To drCount, 1 To 2)
' For r = 1 To drCount
' dData(r, 1) = sData(srIndexes(r), 1)
' dData(r, 2) = sData(srIndexes(r), 2)
' Next r
' GetTop = dData
End Sub

Related

How to skip "nan" or blanks in my vba code

I have an excel sheet with bunch of "nan" and blank cells and i don't want my for loop to evaluate these as shown below in the image
![image][1]
This is my code below and it's taking all the blank cell values as 0 and messing with my code.
For j = LBound(ARINC_CHA_DATA_array, 2) To UBound(ARINC_CHA_DATA_array, 2)
For i = LBound(ARINC_CHA_DATA_array, 1) To UBound(ARINC_CHA_DATA_array, 1)
If ARINC_CHA_DATA_array(i, j) <> "" Or ARINC_CHA_DATA_array(i, j) = "nan" Then GoTo Ravi
...........code
...........code
End If
Ravi:
Next i
Next j
```[![enter image description here][2]][2]
[1]: https://i.stack.imgur.com/ip2B9.png
[2]: https://i.stack.imgur.com/bTDou.png
Select Case vs If...Then...Else When Looping Through an Array
Use shorter variable names to make the code more readable.
GoTo is kind of reserved for error-handling and its use can most often be avoided.
The opposite of A = B OR C = D is A <> B AND C <> D.
That is why Test2 is preferred compared to Test3 (no need for the Else statement).
All three codes do the same: they read the values from a range into an array, then loop through the elements of the array doing ... code (at the moment nothing) and write the 'modified' values from the array back to the range.
Caution: If there are formulas in the range, after running this code, they will be replaced with values.
Option Explicit
Sub Test1()
Dim rg As Range: Set rg = ActiveSheet.Range("A1:J10")
Dim Acd As Variant: Acd = rg.Value
Dim Item As Variant
Dim j As Long, i As Long
For j = LBound(Acd, 2) To UBound(Acd, 2)
For i = LBound(Acd, 1) To UBound(Acd, 1)
Item = Acd(i, j)
Select Case CStr(Item)
Case "", "nan"
' do nothing
Case Else
' ... code
End Select
Next i
Next j
rg.Value = Acd
End Sub
Sub Test2()
Dim rg As Range: Set rg = ActiveSheet.Range("A1:J10")
Dim Acd As Variant: Acd = rg.Value
Dim Item As Variant
Dim j As Long, i As Long
For j = LBound(Acd, 2) To UBound(Acd, 2)
For i = LBound(Acd, 1) To UBound(Acd, 1)
Item = Acd(i, j)
If CStr(Item) <> "" And CStr(Item) <> "nan" Then
' ... code
'Else ' you don't need this
' do nothing
End If
Next i
Next j
rg.Value = Acd
End Sub
Sub Test3()
Dim rg As Range: Set rg = ActiveSheet.Range("A1:J10")
Dim Acd As Variant: Acd = rg.Value
Dim Item As Variant
Dim j As Long, i As Long
For j = LBound(Acd, 2) To UBound(Acd, 2)
For i = LBound(Acd, 1) To UBound(Acd, 1)
Item = Acd(i, j)
If CStr(Item) = "" Or CStr(Item) = "nan" Then
' do nothing
Else
' ... code
End If
Next i
Next j
rg.Value = Acd
End Sub

Getting the maximum value of a specific column in a 2d array [duplicate]

I use the code hereunder to calculate max values as described in this post (vba max value of group of values). The code works great but once I have more than 65k lines I get a data type mismatch when trying to pase the array:
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
Could somebody help me to slice the array in chunks. I have tried to get it working myself but without any luck.
Sub FillGroupsMax()
Dim lColumn As Long
Dim sht As Worksheet
Dim groupsArray As Variant 'array with all group infomation
Dim groupsSeen As Variant 'array with group infomation already seen
Application.ScreenUpdating = False 'stop screen updating makes vba perform better
Set sht = ThisWorkbook.Worksheets("import")
Set last = sht.Range("A:A").Find("*", Cells(1, 1), searchdirection:=xlPrevious) 'last cell with value in column A
lColumn = sht.Cells(1, Columns.Count).End(xlToLeft).Column
groupsArray = sht.Range(Cells(1, 1), Cells(last.Row, lColumn))
'collect all the information on the Sheet into an array
'Improves performance by not visiting the sheet
For dRow = 2 To last.Row 'for each of the rows skipping header
'check if group as already been seen
If inArrayValue(Cells(dRow, 1).Value, groupsSeen) > 0 Then
'if it has been seen/calculated attribute value
'Cells(dRow, 4).Value = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
groupsArray(dRow, lColumn) = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
Else
'if it hasn't been seen then find max
'Cells(dRow, 4).Value = getMax(Cells(dRow, 1).Value, groupsArray)
groupsArray(dRow, lColumn) = getMax(Cells(dRow, 1).Value, groupsArray, lColumn)
'array construction from empty
If IsEmpty(groupsSeen) Then
ReDim groupsSeen(0)
'groupsSeen(0) = Array(Cells(dRow, 1).Value, Cells(dRow, 4).Value)
groupsSeen(0) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
'attribute value to array
Else
ReDim Preserve groupsSeen(0 To UBound(groupsSeen) + 1)
groupsSeen(UBound(groupsSeen)) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
End If
End If
Next
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
'reactivate Screen updating
Application.ScreenUpdating = True
End Sub
Function getMax(group As String, groupsArray As Variant, lColumn As Long) As Double
'for each in array
For n = 1 To UBound(groupsArray)
'if its the same group the Max we seen so far the record
If groupsArray(n, 1) = group And groupsArray(n, lColumn - 1) > maxSoFar Then
maxSoFar = groupsArray(n, lColumn - 1)
End If
Next
'set function value
getMax = maxSoFar
End Function
Function inArrayValue(group As String, groupsSeen As Variant) As Double
'set function value
inArrayValue = 0
'if array is empty then exit
If IsEmpty(groupsSeen) Then Exit Function
'for each in array
For n = 0 To UBound(groupsSeen)
'if we find the group
If groupsSeen(n)(0) = group Then
'set function value to the Max value already seen
inArrayValue = groupsSeen(n)(1)
'exit function earlier
Exit Function
End If
Next
End Function
You can write a helper function to use instead of Application.Index
Bonus - it will be much faster than using Index (>5x)
Sub Tester()
Dim arr, arrCol
arr = Range("A2:J80000").Value
arrCol = GetColumn(arr, 5) '<< get the fifth column
Range("L2").Resize(UBound(arrCol, 1), 1).Value = arrCol
End Sub
'extract a single column from a 1-based 2-D array
Function GetColumn(arr, colNumber)
Dim arrRet, i As Long
ReDim arrRet(1 To UBound(arr, 1), 1 To 1)
For i = 1 To UBound(arr, 1)
arrRet(i, 1) = arr(i, colNumber)
Next i
GetColumn = arrRet
End Function
EDIT - since QHarr asked about timing here's a basic example
Sub Tester()
Dim arr, arrCol, t, i as long
arr = Range("A2:J80000").Value
t = Timer
For i = 1 to 100
arrCol = GetColumn(arr, 5) '<< get the fifth column
Next i
Debug.print Timer - t '<<# of seconds for execution
End Sub
Below, whilst not as tidy as could be, is a way to process an array in chunks and Index to access a column and write out to the sheet.
I populated two columns (A:B) with data. Both had 132,000 rows, populated incrementally, with values from 1 to 132,000 in each column for my test run.
You can fiddle with cutOff to get the chunk size just below the point where the fail happens.
The code below is simply to demonstrate the principle of looping in batches, upto the set cutoff in each batch, until all rows have been processed.
Option Explicit
Public Sub WriteArrayToSheet()
Dim wb As Workbook
Dim ws As Worksheet
Set wb = ThisWorkbook
Set sht = wb.Worksheets("Sheet1") 'change as appropriate
Dim myArr() 'dynamic array
myArr = sht.Range("A1").CurrentRegion.Value 'you may want a more robust method
Dim cutOff As Long 'the max value - what ever it is before error occurs
cutOff = 1000
Dim totalRows As Long 'total rows in array read in from sheet
totalRows = UBound(myArr, 1)
Dim totalArraysNeeded As Long
'Determine how many lots of cutOff chunks there are in the total number of array rows
totalArraysNeeded = Application.WorksheetFunction.Ceiling(totalRows / cutOff, 1)
Dim rotations As Long 'number of times to loop original array to handle all rows
Dim rowCountTotal As Long
Dim rowCount As Long
Dim tempArr() 'this will hold the chunk of the original array
Dim rowCounter As Long
Dim lastRow As Long
Dim nextRow As Long
Dim i As Long
Dim j As Long
Dim numRows As Long
rotations = 1
Do While rotations < totalArraysNeeded
If rotations < totalArraysNeeded - 1 Then
ReDim tempArr(1 To cutOff, 1 To UBound(myArr, 2)) 'size chunk array
numRows = cutOff
Else
numRows = totalRows - rowCountTotal
ReDim tempArr(1 To numRows, 1 To UBound(myArr, 2)) 'size chunk array
End If
For i = 1 To numRows
rowCount = 1 'rows in this chunk looped
rowCountTotal = rowCountTotal + 1 'rows in original array looped
For j = LBound(myArr, 2) To UBound(myArr, 2)
tempArr(i, j) = myArr(rowCountTotal, j)
Next j
rowCount = rowCount + 1
Next i
With sht
lastRow = .Cells(.Rows.Count, "E").End(xlUp).Row 'Column where I am writing the sliced column out to
End With
If lastRow = 1 Then
nextRow = 1
Else
nextRow = lastRow + 1
End If
sht.Range("E" & nextRow).Resize(UBound(tempArr, 1), 1) = Application.Index(tempArr, , 1) 'write out to sheet
rotations = rotations + 1
Loop
End Sub
As #Tim suggested, the best way to slice a large array is use a loop to copy the column.
Though in your case, most of the processing time is spent on computing the maximum since your code is using a nested loop.
If you want to reduce significantly the processing time, then use a dictionary:
Sub Usage
GetMaxByGroupTo _
sourceGroups := ThisWorkbook.Range("Sheet1!A2:A100"), _
sourceValues := ThisWorkbook.Range("Sheet1!B2:B100"), _
target := ThisWorkbook.Range("Sheet1!C2")
End Sub
Sub GetMaxByGroupTo(sourceGroups As Range, sourceValues As Range, target As Range)
Dim dict As Object, groups(), values(), r As Long, max
Set dict = CreateObject("Scripting.Dictionary")
groups = sourceGroups.Value2
values = sourceValues.Value2
' store the maximum value of each group in a dictionary for an efficient lookup '
For r = Lbound(groups) to Ubound(groups)
max = dict(groups(r, 1))
If VarType(max) And values(r, 1) <= max Then Else dict(groups(r, 1)) = values(r, 1)
Next
' build and copy the result array to the sheet '
For r = Lbound(groups) to Ubound(groups)
values(r, 1) = dict(groups(r, 1))
Next
target.Resize(Ubound(groups), 1).Value2 = values
End Sub

Setting cell equal to random value if cell isn't blank in range

At a high level I am trying to set a cell equal to a random cell within a range. The issue I am having is that in this range I want to pull a random Value from, the Value I am taking is the result of an 'if' expression that either sets the cell to a Value or "". So when I chose the random value I only want to choose cells that have an actual value, not the "".
Does anyone know how to get this expected behavior?
The code below shows what I have tried currently, each large block is commented to help with understanding. The block I need help with replaces the values in each column until the next cell is blank then moves to the next column.
upperBound = 1798
lowerBound = 2
Randomize
'This loop section populates the data area with a static value in cell 9,3 then 9,4 etc..
For j = 3 To 15
val = Cells(9, j).Value
For i = 1 To val
Cells(12 + i, j).Value = Cells(9, j)
Next i
Next j
'This loop section uses the cells already populated down each column and replaces that value with the random value from the other range
Dim x As Integer
' Set numrows = number of rows of data.
For j = 3 To 15
NumRows = Range(Cells(13, j), Cells(13, j).End(xlDown)).Rows.Count
' Select cell 13,j.
Cells(13, j).Select
' Establish "For" loop to loop "numrows" number of times.
For x = 1 To NumRows
ActiveCell.Value = Worksheets("2017 Role IDs").Cells(Int((upperBound - lowerBound + 1) * Rnd + lowerBound), 2).Value
' Selects cell down 1 row from active cell.
ActiveCell.Offset(1, 0).Select
Next
Next j
This is the data before the second block runs. I want to replace the values that just match the number in the second row with the random number in the range:
This is what I would like to look like:
But currently it looks like this because the random selector is taking blank values:
Something like this should work for you:
Sub tgr()
Dim wb As Workbook
Dim wsNums As Worksheet
Dim wsDest As Worksheet
Dim aData As Variant
Dim vData As Variant
Dim aNums() As Double
Dim aResults() As Variant
Dim lNumCount As Long
Dim lMaxRows As Long
Dim lRowCount As Long
Dim ixNum As Long
Dim ixResult As Long
Dim ixCol As Long
Set wb = ActiveWorkbook
Set wsNums = wb.Worksheets("2017 Role IDs")
Set wsDest = wb.ActiveSheet
With wsNums.Range("B2", wsNums.Cells(wsNums.Rows.Count, "B").End(xlUp))
If .Row < 2 Then Exit Sub 'No data
lNumCount = WorksheetFunction.Count(.Cells)
If lNumCount = 0 Then Exit Sub 'No numbers
ReDim aNums(1 To lNumCount)
If .Cells.Count = 1 Then
ReDim aData(1 To 1, 1 To 1)
aData(1, 1) = .Value
Else
aData = .Value
End If
'Load populated numeric cells into the aNums array
For Each vData In aData
If Len(vData) > 0 And IsNumeric(vData) Then
ixNum = ixNum + 1
aNums(ixNum) = vData
End If
Next vData
End With
lMaxRows = Application.Max(wsDest.Range("C9:O9"))
If lMaxRows = 0 Then Exit Sub 'Row count not populated in row 9 for each column
ReDim aResults(1 To WorksheetFunction.Max(wsDest.Range("C9:O9")), 1 To 13)
'Populate each column accordingly and pull a random number from aNums
For ixCol = 1 To UBound(aResults, 2)
If IsNumeric(wsDest.Cells(9, ixCol + 2).Value) Then
For ixResult = 1 To CLng(wsDest.Cells(9, ixCol + 2).Value)
Randomize
aResults(ixResult, ixCol) = aNums(Int(Rnd() * lNumCount) + 1)
Next ixResult
End If
Next ixCol
wsDest.Range("C13").Resize(UBound(aResults, 1), UBound(aResults, 2)).Value = aResults
End Sub

how to adjust code for better performance

I am trying to make edge relation from excel file which are organized in rows,
A,B,C,
D,E
the aim is to create relationships from each row:
A,B
A,C
B,C
I have the following codes , the problem is the codes is efficient when rows are equal in length but for example for above rows it create also following edges (relationship):
D," "
E, " "
Which create big problem for large data set. I was wondering if some body can help me to adjust the code the way to create the edge list only till filled cells in each row. If there is any other way to do this more efficient will appreciate it.
Thank you so much,Will be great help.
My code:
Sub Transform()
Dim targetRowNumber As Long
targetRowNumber = Selection.Rows(Selection.Rows.Count).Row + 2
Dim col1 As Variant
Dim cell As Range
Dim colCounter As Long
Dim colCounter2 As Long
Dim sourceRow As Range: For Each sourceRow In Selection.Rows
For colCounter = 1 To Selection.Columns.Count - 1
col1 = sourceRow.Cells(colCounter).Value
For colCounter2 = colCounter + 1 To Selection.Columns.Count
Set cell = sourceRow.Cells(, colCounter2)
If Not cell.Column = Selection.Column Then
Selection.Worksheet.Cells(targetRowNumber, 1) = col1
Selection.Worksheet.Cells(targetRowNumber, 2) = cell.Value
targetRowNumber = targetRowNumber + 1
End If
Next colCounter2
Next colCounter
Next sourceRow
End Sub
I've played around with it - this should do the trick. We can probably speed it up by outputting to another variant array if needed, but this ran pretty quickly for me:
Sub Transform_New()
Dim rngSource As Range, rngDest As Range
Dim varArray As Variant
Dim i As Integer, j As Integer, k As Integer
Set rngSource = Sheet1.Range("A1", Sheet1.Cells(WorksheetFunction.CountA(Sheet1.Columns(1)), 1)) 'Put all used rows into range
Set rngDest = Sheet1.Cells(WorksheetFunction.CountA(Sheet1.Columns(1)), 1).Offset(2, 0) 'Set target range to start 2 below source range
varArray = Range(rngSource, rngSource.Offset(0, Range("A1").SpecialCells(xlCellTypeLastCell).Column)).Value
For i = LBound(varArray, 1) To UBound(varArray, 1) 'Loop vertically through array
For j = LBound(varArray, 2) To UBound(varArray, 2) 'Loop horizontally through each line apart from last cell
k = j
Do Until varArray(i, k) = ""
k = k + 1
If varArray(i, k) <> "" Then
rngDest.Value = varArray(i, j)
rngDest.Offset(0, 1).Value = varArray(i, k)
Set rngDest = rngDest.Offset(1, 0)
End If
Loop
Next
Next
End Sub

Remove duplicates from array using VBA

Assume I have a block of data in Excel 2010, 100 rows by 3 columns.
Column C contains some duplicates, say it starts off as
1, 1, 1, 2, 3, 4, 5, ..... , 97, 98
Using VBA, I would like to remove the duplicate rows so I am left with 98 rows and 3 columns.
1, 2, 3, ..... , 97, 98
I know there is a button in Excel 2010 to do that but it inteferes with the rest of my code subsequently and gives incorrect results.
Furthermore, I would like to do it in arrays, then paste the results on the worksheet, rather than methods such as Application.Worksheetfunction.countif(.....
So something like:
Dim myarray() as Variant
myarray=cells(1,1).Currentregion.value
Dim a as Long
For a=1 to Ubound(myarray,1)
'something here to
Next a
I answered a similar question. Here is the code I used:
Dim dict As Object
Dim rowCount As Long
Dim strVal As String
Set dict = CreateObject("Scripting.Dictionary")
rowCount = Sheet1.Range("A1").CurrentRegion.Rows.Count
'you can change the loop condition to iterate through the array rows instead
Do While rowCount > 1
strVal = Sheet1.Cells(rowCount, 1).Value2
If dict.exists(strVal) Then
Sheet1.Rows(rowCount).EntireRow.Delete
Else
'if doing this with an array, then add code in the Else block
' to assign values from this row to the array of unique values
dict.Add strVal, 0
End If
rowCount = rowCount - 1
Loop
Set dict = Nothing
If you want to use an array, then loop through the elements with the same conditional (if/else) statements. If the item doesn't exist in the dictionary, then you can add it to the dictionary and add the row values to another array.
Honestly, I think the most efficient way is to adapt code you'd get from the macro recorder. You can perform the above function in one line:
Sheet1.UsedRange.RemoveDuplicates Columns:=3, Header:=xlYes
Function eliminateDuplicate(poArr As Variant) As Variant
Dim poArrNoDup()
dupArrIndex = -1
For i = LBound(poArr) To UBound(poArr)
dupBool = False
For j = LBound(poArr) To i
If poArr(i) = poArr(j) And Not i = j Then
dupBool = True
End If
Next j
If dupBool = False Then
dupArrIndex = dupArrIndex + 1
ReDim Preserve poArrNoDup(dupArrIndex)
poArrNoDup(dupArrIndex) = poArr(i)
End If
Next i
eliminateDuplicate = poArrNoDup
End Function
Simple function to remove duplicates from a 1D array
Private Function DeDupeArray(vArray As Variant) As Variant
Dim oDict As Object, i As Long
Set oDict = CreateObject("Scripting.Dictionary")
For i = LBound(vArray) To UBound(vArray)
oDict(vArray(i)) = True
Next
DeDupeArray = oDict.keys()
End Function
Edit:
With stdVBA (a library largely maintained by myself) you can use:
uniqueValues = stdEnumerator.CreateFromArray(myArray).Unique().AsArray()
An improvement on #RBILLC and #radoslav006 answers, this version searches the array with the duplicates removed for existing values so it searchs less values to find a duplicate.
Public Function RemoveDuplicatesFromArray(sourceArray As Variant)
Dim duplicateFound As Boolean
Dim arrayIndex As Integer, i As Integer, j As Integer
Dim deduplicatedArray() As Variant
arrayIndex = -1
deduplicatedArray = Array(1)
For i = LBound(sourceArray) To UBound(sourceArray)
duplicateFound = False
For j = LBound(deduplicatedArray) To UBound(deduplicatedArray)
If sourceArray(i) = deduplicatedArray(j) Then
duplicateFound = True
Exit For
End If
Next j
If duplicateFound = False Then
arrayIndex = arrayIndex + 1
ReDim Preserve deduplicatedArray(arrayIndex)
deduplicatedArray(arrayIndex) = sourceArray(i)
End If
Next i
RemoveDuplicatesFromArray = deduplicatedArray
End Function
Here's another approach for working with an array:
Sub tester()
Dim arr, arrout
arr = Range("A1").CurrentRegion.Value 'collect the input array
arrout = UniqueRows(arr) 'get only unique rows
Range("H1").Resize(UBound(arrout, 1), UBound(arrout, 2)).Value = arrout
End Sub
Function UniqueRows(arrIn As Variant) As Variant
Dim keys, rw As Long, col As Long, k, sep, arrout
Dim dict As Object, lbr As Long, lbc As Long, ubr As Long, ubc As Long, rwOut As Long
Set dict = CreateObject("scripting.dictionary")
'input array bounds
lbr = LBound(arrIn, 1)
ubr = UBound(arrIn, 1)
lbc = LBound(arrIn, 2)
ubc = UBound(arrIn, 2)
ReDim keys(lbr To ubr)
'First pass:collect all the row "keys" in an array
' and unique keys in a dictionary
For rw = lbr To ubr
k = "": sep = ""
For col = lbc To ubc
k = k & sep & arrIn(rw, col)
sep = Chr(0)
Next col
keys(rw) = k 'collect key for this row
dict(k) = True 'just collecting unique keys
Next rw
'Resize output array to # of unique rows
ReDim arrout(lbr To dict.Count + (lbr - 1), lbc To ubc)
rwOut = lbr
'Second pass: copy each unique row to the output array
For rw = lbr To ubr
If dict(keys(rw)) Then 'not yet output?
For col = lbc To ubc 'copying this row over to output...
arrout(rwOut, col) = arrIn(rw, col)
Next col
rwOut = rwOut + 1 'increment output "row"
dict(keys(rw)) = False 'flag this key as copied
End If
Next rw
UniqueRows = arrout
End Function
Answer from #RBILLC could be easily improved by adding an Exit For inside internal loop:
Function eliminateDuplicate(poArr As Variant) As Variant
Dim poArrNoDup()
dupArrIndex = -1
For i = LBound(poArr) To UBound(poArr)
dupBool = False
For j = LBound(poArr) To i
If poArr(i) = poArr(j) And Not i = j Then
dupBool = True
Exit For
End If
Next j
If dupBool = False Then
dupArrIndex = dupArrIndex + 1
ReDim Preserve poArrNoDup(dupArrIndex)
poArrNoDup(dupArrIndex) = poArr(i)
End If
Next i
eliminateDuplicate = poArrNoDup
End Function
I think this is really a case for using excel's native functions, at least for the initial array acquisition, and I don't think there's any simpler way to do it. This sub will output the unique values starting in column 5. I assumed that the target range was empty, so if it's not, change r and c.
Sub testUniques()
Dim arr, r As Long, c As Long, h As Long, w As Long
Dim this As Worksheet: Set this = ActiveSheet
arr = Application.Unique(this.Cells(1, 1).CurrentRegion)
r = 1
c = 5
h = UBound(arr, 1) - 1
w = UBound(arr, 2) - 1
this.Range(this.Cells(r, c), this.Cells(r + h, c + w)) = arr
End Sub
I know this is old, but here's something I used to copy duplicate values to another range so that I could see them quickly to establish data integrity for a database I was standing up from various spreadsheets. To make the procedure delete the duplicates it would be as simple as replacing the dupRng lines with Cell.Delete Shift:=xlToLeft or something to that effect.
I haven't tested that personally, but it should work.
Sub PartCompare()
Dim partRng As Range, partArr() As Variant, i As Integer
Dim Cell As Range, lrow As Integer
lrow = ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
i = 0
Set partRng = ThisWorkbook.Worksheets("Sheet1").Range(Cells(1, 1), Cells(lrow, 1))
For Each Cell In partRng.Cells
ReDim Preserve partArr(i)
partArr(i) = Cell.Value
i = i + 1
Next
Dim dupRng As Range, j As Integer, x As Integer, c As Integer
Set dupRng = ThisWorkbook.Worksheets("Sheet1").Range("D1")
x = 0
c = 1
For Each Cell In partRng.Cells
For j = c To UBound(partArr)
If partArr(j) = Cell.Value Then
dupRng.Offset(x, 0).Value = Cell.Value
dupRng.Offset(x, 1).Value = Cell.Address()
x = x + 1
Exit For
End If
Next j
c = c + 1
Next Cell
End Sub
Remove duplicates (plus related row items) from array
As OP wanted a VBA solution close to RemoveDuplicates, I demonstrate an array approach using a â–ºdictionary to get not the unique items per se (dict.keys), but the related row indices of first occurrencies (dict.items).
These are used to retain the whole row data via procedure LeaveUniques profiting from the advanced possibilities of the â–ºApplication.Index() function - c.f. Some peculiarities of the the Application.Index function
Example Call
Sub ExampleCall()
'[0]define range and assign data to 1-based 2-dim datafield
With Sheet1 ' << reference to your project's sheet Code(Name)
Dim lastRow: lastRow = .Cells(.Rows.Count, "C").End(xlUp).Row
Dim rng: Set rng = .Range("C2:E" & lastRow)
End With
Dim data: data = rng ' assign data to 2-dim datafield
'[1]get uniques (column 1) and remove duplicate rows
LeaveUniques data ' << call procedure LeaveUniques (c.f. RemoveDuplicates)
'[2]overwrite original range
rng.Clear
rng.Resize(UBound(data), UBound(data, 2)) = data
End Sub
Procedure LeaveUniques
Sub LeaveUniques(ByRef data As Variant, Optional ByVal colNum As Long = 1)
'Purpose: procedure removes duplicates of given column number in entire array
data = Application.Index(data, uniqueRowIndices(data, colNum), nColIndices(UBound(data, 2)))
End Sub
Help functions to LeaveUniques
Function uniqueRowIndices(data, Optional ByVal colNum As Long = 1)
'Purpose: return data index numbers referring to uniques
'a) set late bound dictionary to memory
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
'b) slice e.g. first data column (colNum = 1)
Dim colData
colData = Application.Index(data, 0, colNum)
'c) fill dictionary with uniques referring to first occurencies
Dim i As Long
For i = 1 To UBound(colData)
If Not dict.exists(dict(colData(i, 1))) Then dict(colData(i, 1)) = i
Next
'd) return 2-dim array of valid unique 1-based index numbers
uniqueRowIndices = Application.Transpose(dict.items)
End Function
Function nColIndices(ByVal n As Long)
'Purpose: return "flat" array of n column indices, e.g. for n = 3 ~> Array(1, 2, 3)
nColIndices = Application.Transpose(Evaluate("row(1:" & n & ")"))
End Function

Resources