How to pivot duplicate rows to columns? - excel

I'm having a hard time figuring out how to pivot a multi-column data set with duplicate rows into unique columns.
I have done some research and found a few VBA scripts for this, but when I run counts to confirm the pivot, data is missing, and the scripts keep adding duplicate columns (name / rating year) over and over.
Does anyone have any ideas? I'd use a pivot table, but a pivot table can't display the actual rating values, only an aggregate (sum, count, average, etc.).

You can do this easily in Power Query.
Highlight all your data, then Insert > Table
Data tab > From Table/Range
highlight the right two columns > Pivot Column
rating level as the values column
Advanced options > Don't Aggregate
find and replace null with nothing
Close & Load

Pivot Data
The Code
Option Explicit
Sub pivotData()
' Define Source Range.
Dim rng As Range
Set rng = Range("A1").CurrentRegion
' Get unique values.
Dim prs As Variant
prs = getUniqueColumn1D(rng.Columns(1).Resize(rng.Rows.Count - 1).Offset(1))
Dim yrs As Variant
yrs = getUniqueColumn1D(rng.Columns(2).Resize(rng.Rows.Count - 1).Offset(1))
sort1D yrs
' Source Range to Source Array.
Dim Source As Variant
Source = rng.Value
' Define Target Array.
Dim Target As Variant
ReDim Target(1 To UBound(prs) - LBound(prs) + 2, _
1 To UBound(yrs) - LBound(yrs) + 2)
' Write from arrays to Target Array.
Target(1, 1) = Source(1, 1)
Dim n As Long
Dim i As Long
i = 1
For n = LBound(prs) To UBound(prs)
i = i + 1
Target(i, 1) = prs(n)
Next n
Dim j As Long
j = 1
For n = LBound(yrs) To UBound(yrs)
j = j + 1
Target(1, j) = yrs(n)
Next n
For n = 2 To UBound(Source, 1)
i = Application.Match(Source(n, 1), prs, 0) + 1
j = Application.Match(Source(n, 2), yrs, 0) + 1
Target(i, j) = Source(n, 3)
Next n
' Define Target Range.
Set rng = Range("E1").Resize(UBound(Target, 1), UBound(Target, 2))
' Write from Target Array to Target Range.
rng.Value = Target
' Inform user.
MsgBox "Data transferred.", vbInformation, "Success"
End Sub
' Returns the unique values from a column range.
Function getUniqueColumn1D(ColumnRange As Range, _
Optional ByVal Sorted As Boolean = False) _
As Variant
Dim Data As Variant
Data = ColumnRange.Columns(1).Value
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
Dim Key As Variant
Dim i As Long
For i = 1 To UBound(Data, 1)
Key = Data(i, 1)
If Not IsError(Key) And Not IsEmpty(Key) Then
.Item(Key) = Empty
End If
Next i
If .Count > 0 Then
getUniqueColumn1D = .Keys
End If
End With
End Function
' Sorts a 1D array only if it contains values of the same data type.
Sub sort1D(ByRef OneD As Variant, _
Optional ByVal Descending As Boolean = False)
With CreateObject("System.Collections.ArrayList")
Dim i As Long
For i = LBound(OneD) To UBound(OneD)
.Add OneD(i)
Next i
.Sort
If Descending Then
.Reverse
End If
OneD = .ToArray
End With
End Sub

Related

Column Table convert Matrix Table in VBA code

Currently I am using a formula (INDEX and MATCH) to create the matrix. I would like to do this in VBA instead, since that should be faster than the formula. Thanks in advance.
Sub columntomatrix()
Dim mS As Worksheet
Dim eS As Worksheet
Set mS = ThisWorkbook.Sheets("Matrix")
Set eS = ThisWorkbook.Sheets("Price Entry Book")
Dim Matrix() As String
Dim entryPrice() As String
Dim Product As Range
Dim PriceBook As Range
Set Product = Range("Product")
Set PriceBook = Range("PriceBookName")
With mS.Range("B2")
.Formula = "=IFERROR(INDEX(ListPrice," & _
"MATCH(" & .Offset(0, -1).Address(False, True) & "&" & _
.Offset(-1, 0).Address(True, False) & ",ProductKey,0)),"" N/A "")"
Product.Copy
'offset(0,-1) = selected cells move to left 1 column'
.Offset(0, -1).PasteSpecial
PriceBook.Copy
'offset(-1,0) = selected cells move up 1 row'
.Offset(-1, 0).PasteSpecial Paste:=xlPasteAll, Transpose:=True
With Range(.Offset(0, 0), .Offset(Product.Rows.Count - 2, PriceBook.Rows.Count - 2))
.FillDown
.FillRight
End With
End With
End Sub
Pivot RCV
Copy all four procedures to a standard module, e.g. Module1.
Carefully adjust the values in the Define constants. section of pivotRCV.
Only run the first procedure pivotRCV, the others are being called by it.
The Code
Option Explicit
Sub pivotRCV() ' RCV: Row Labels, Column Labels, and Values
' Define constants.
' Define Source constants.
Const srcName As String = "Price Entry Book"
Const srcFirst As String = "A2"
Const rlCol As Long = 1
Const clCol As Long = 2
Const vCol As Long = 4
Const rlSort As Boolean = False
Const clSort As Boolean = False
' Define Target constants.
Const tgtName As String = "Matrix"
Const tgtFirst As String = "A2"
' Define workbooks.
Dim src As Workbook
Set src = ThisWorkbook
Dim tgt As Workbook
Set tgt = ThisWorkbook
' Define Source Range.
' Define Source Worksheet.
Dim ws As Worksheet
Set ws = src.Worksheets(srcName)
' Define Source Range.
Dim rng As Range
Set rng = defineEndRange(ws.Range(srcFirst))
' Write values from Source Range to arrays.
' Write values from Source Range to 1D Unique Row Labels Array.
Dim rLabels As Variant
rLabels = getUniqueColumn1D(rng.Columns(rlCol).Resize(rng.Rows.Count - 1) _
.Offset(1))
If rlSort Then
sort1D rLabels
End If
' Write values from Source Range to 1D Unique Column Labels Array.
Dim cLabels As Variant
cLabels = getUniqueColumn1D(rng.Columns(clCol).Resize(rng.Rows.Count - 1) _
.Offset(1))
If clSort Then
sort1D cLabels
End If
' Write values from Source Range to 2D Source Array.
Dim Source As Variant
Source = rng.Value
' Prepare to write values from Source Array to Target Array.
' Define Target Array.
Dim Target As Variant
ReDim Target(1 To UBound(rLabels) - LBound(rLabels) + 2, _
1 To UBound(cLabels) - LBound(cLabels) + 2)
' Define counters.
Dim n As Long
Dim i As Long
i = 1
' Write values from Source Arrays to Target Array.
' Write first row/column label.
Target(1, 1) = Source(1, 1)
' Write row labels.
For n = LBound(rLabels) To UBound(rLabels)
i = i + 1
Target(i, 1) = rLabels(n)
Next n
' Write column labels.
Dim j As Long
j = 1
For n = LBound(cLabels) To UBound(cLabels)
j = j + 1
Target(1, j) = cLabels(n)
Next n
' Write values.
For n = 2 To UBound(Source, 1)
i = Application.Match(Source(n, rlCol), rLabels, 0) + 1
j = Application.Match(Source(n, clCol), cLabels, 0) + 1
Target(i, j) = Source(n, vCol)
Next n
' Write values from Target Array to Target Range.
' Define Target Worksheet.
Set ws = tgt.Worksheets(tgtName)
' Define Target First Row Range.
With ws.Range(tgtFirst).Resize(, UBound(Target, 2))
' Clear contents from Target First Row Range to the bottom-most row.
.Resize(ws.Rows.Count - .Row + 1).ClearContents
' Define Target Range.
Set rng = .Resize(UBound(Target, 1))
End With
' Write values from Target Array to Target Range.
rng.Value = Target
' Inform user.
MsgBox "Data transferred.", vbInformation, "Success"
End Sub
' Defines the range from a specified first cell to the last cell
' of its Current Region.
Function defineEndRange(FirstCellRange As Range) _
As Range
' Define Current Region ('rng').
Dim rng As Range
Set rng = FirstCellRange.CurrentRegion
' Define End Range.
Set defineEndRange = FirstCellRange _
.Resize(rng.Rows.Count + rng.Row - FirstCellRange.Row, _
rng.Columns.Count + rng.Column - FirstCellRange.Column)
End Function
' Returns the unique values from a column range.
Function getUniqueColumn1D(ColumnRange As Range, _
Optional ByVal Sorted As Boolean = False) _
As Variant
Dim Data As Variant
Data = ColumnRange.Columns(1).Value
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
Dim Key As Variant
Dim i As Long
For i = 1 To UBound(Data, 1)
Key = Data(i, 1)
If Not IsError(Key) And Not IsEmpty(Key) Then
.Item(Key) = Empty
End If
Next i
getUniqueColumn1D = .Keys
End With
End Function
' Sorts a 1D array only if it contains the same data type.
Sub sort1D(ByRef OneD As Variant, _
Optional ByVal Descending As Boolean = False)
With CreateObject("System.Collections.ArrayList")
Dim i As Long
For i = LBound(OneD) To UBound(OneD)
.Add OneD(i)
Next i
.Sort
If Descending Then
.Reverse
End If
OneD = .ToArray
End With
End Sub

Getting the maximum value of a specific column in a 2d array [duplicate]

I use the code below to calculate max values as described in this post (vba max value of group of values). The code works great, but once I have more than 65k rows I get a data type mismatch when trying to paste the array:
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
Could somebody help me slice the array into chunks? I have tried to get it working myself, but without any luck.
Sub FillGroupsMax()
Dim lColumn As Long
Dim sht As Worksheet
Dim groupsArray As Variant 'array with all group information
Dim groupsSeen As Variant 'array with group information already seen
Application.ScreenUpdating = False 'stop screen updating makes vba perform better
Set sht = ThisWorkbook.Worksheets("import")
Set last = sht.Range("A:A").Find("*", Cells(1, 1), searchdirection:=xlPrevious) 'last cell with value in column A
lColumn = sht.Cells(1, Columns.Count).End(xlToLeft).Column
groupsArray = sht.Range(Cells(1, 1), Cells(last.Row, lColumn))
'collect all the information on the Sheet into an array
'Improves performance by not visiting the sheet
For dRow = 2 To last.Row 'for each of the rows skipping header
'check if group has already been seen
If inArrayValue(Cells(dRow, 1).Value, groupsSeen) > 0 Then
'if it has been seen/calculated attribute value
'Cells(dRow, 4).Value = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
groupsArray(dRow, lColumn) = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
Else
'if it hasn't been seen then find max
'Cells(dRow, 4).Value = getMax(Cells(dRow, 1).Value, groupsArray)
groupsArray(dRow, lColumn) = getMax(Cells(dRow, 1).Value, groupsArray, lColumn)
'array construction from empty
If IsEmpty(groupsSeen) Then
ReDim groupsSeen(0)
'groupsSeen(0) = Array(Cells(dRow, 1).Value, Cells(dRow, 4).Value)
groupsSeen(0) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
'attribute value to array
Else
ReDim Preserve groupsSeen(0 To UBound(groupsSeen) + 1)
groupsSeen(UBound(groupsSeen)) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
End If
End If
Next
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
'reactivate Screen updating
Application.ScreenUpdating = True
End Sub
Function getMax(group As String, groupsArray As Variant, lColumn As Long) As Double
'for each in array
For n = 1 To UBound(groupsArray)
'if it's the same group and the value beats the max seen so far, record it
If groupsArray(n, 1) = group And groupsArray(n, lColumn - 1) > maxSoFar Then
maxSoFar = groupsArray(n, lColumn - 1)
End If
Next
'set function value
getMax = maxSoFar
End Function
Function inArrayValue(group As String, groupsSeen As Variant) As Double
'set function value
inArrayValue = 0
'if array is empty then exit
If IsEmpty(groupsSeen) Then Exit Function
'for each in array
For n = 0 To UBound(groupsSeen)
'if we find the group
If groupsSeen(n)(0) = group Then
'set function value to the Max value already seen
inArrayValue = groupsSeen(n)(1)
'exit function earlier
Exit Function
End If
Next
End Function
You can write a helper function to use instead of Application.Index
Bonus - it will be much faster than using Index (>5x)
Sub Tester()
Dim arr, arrCol
arr = Range("A2:J80000").Value
arrCol = GetColumn(arr, 5) '<< get the fifth column
Range("L2").Resize(UBound(arrCol, 1), 1).Value = arrCol
End Sub
'extract a single column from a 1-based 2-D array
Function GetColumn(arr, colNumber)
Dim arrRet, i As Long
ReDim arrRet(1 To UBound(arr, 1), 1 To 1)
For i = 1 To UBound(arr, 1)
arrRet(i, 1) = arr(i, colNumber)
Next i
GetColumn = arrRet
End Function
EDIT - since QHarr asked about timing here's a basic example
Sub Tester()
Dim arr, arrCol, t, i As Long
arr = Range("A2:J80000").Value
t = Timer
For i = 1 To 100
arrCol = GetColumn(arr, 5) '<< get the fifth column
Next i
Debug.Print Timer - t '<<# of seconds for execution
End Sub
Below, whilst not as tidy as could be, is a way to process an array in chunks and Index to access a column and write out to the sheet.
I populated two columns (A:B) with data. Both had 132,000 rows, populated incrementally, with values from 1 to 132,000 in each column for my test run.
You can fiddle with cutOff to get the chunk size just below the point where the fail happens.
The code below simply demonstrates the principle of looping in batches, up to the set cutOff in each batch, until all rows have been processed.
Option Explicit
Public Sub WriteArrayToSheet()
Dim wb As Workbook
Dim sht As Worksheet
Set wb = ThisWorkbook
Set sht = wb.Worksheets("Sheet1") 'change as appropriate
Dim myArr() 'dynamic array
myArr = sht.Range("A1").CurrentRegion.Value 'you may want a more robust method
Dim cutOff As Long 'the max value - what ever it is before error occurs
cutOff = 1000
Dim totalRows As Long 'total rows in array read in from sheet
totalRows = UBound(myArr, 1)
Dim totalArraysNeeded As Long
'Determine how many lots of cutOff chunks there are in the total number of array rows
totalArraysNeeded = Application.WorksheetFunction.Ceiling(totalRows / cutOff, 1)
Dim rotations As Long 'number of times to loop original array to handle all rows
Dim rowCountTotal As Long
Dim rowCount As Long
Dim tempArr() 'this will hold the chunk of the original array
Dim rowCounter As Long
Dim lastRow As Long
Dim nextRow As Long
Dim i As Long
Dim j As Long
Dim numRows As Long
rotations = 1
Do While rotations <= totalArraysNeeded 'include the final (possibly partial) chunk
If rotations < totalArraysNeeded Then
ReDim tempArr(1 To cutOff, 1 To UBound(myArr, 2)) 'size chunk array
numRows = cutOff
Else
numRows = totalRows - rowCountTotal
ReDim tempArr(1 To numRows, 1 To UBound(myArr, 2)) 'size chunk array
End If
For i = 1 To numRows
rowCount = 1 'rows in this chunk looped
rowCountTotal = rowCountTotal + 1 'rows in original array looped
For j = LBound(myArr, 2) To UBound(myArr, 2)
tempArr(i, j) = myArr(rowCountTotal, j)
Next j
rowCount = rowCount + 1
Next i
With sht
lastRow = .Cells(.Rows.Count, "E").End(xlUp).Row 'Column where I am writing the sliced column out to
End With
If lastRow = 1 Then
nextRow = 1
Else
nextRow = lastRow + 1
End If
sht.Range("E" & nextRow).Resize(UBound(tempArr, 1), 1) = Application.Index(tempArr, , 1) 'write out to sheet
rotations = rotations + 1
Loop
End Sub
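The batching principle can also be written with a For ... Step loop, which makes the chunk boundaries explicit. The following is only a sketch under assumptions: WriteColumnInChunks and DemoWriteColumnInChunks are hypothetical names, the input is assumed to be a 1-based 2-D array such as one read from a multi-cell Range.Value, and the sheet name "Sheet1" and target cell E1 are placeholders. It copies the column with a plain loop rather than Application.Index.

```vba
Option Explicit

' Hypothetical helper: writes one column of a 1-based 2-D array to the sheet
' in batches of at most chunkSize rows, starting at firstCell.
Public Sub WriteColumnInChunks(arr As Variant, ByVal colNumber As Long, _
                               ByVal firstCell As Range, _
                               Optional ByVal chunkSize As Long = 50000)
    Dim totalRows As Long, startRow As Long, endRow As Long
    Dim i As Long
    Dim chunk() As Variant

    ' A step of 0 or less would never finish (or never start) the loop.
    If chunkSize < 1 Then Err.Raise 5, , "chunkSize must be at least 1"

    totalRows = UBound(arr, 1)

    For startRow = 1 To totalRows Step chunkSize
        ' The last batch may be shorter than chunkSize.
        endRow = startRow + chunkSize - 1
        If endRow > totalRows Then endRow = totalRows

        ' Copy rows startRow..endRow of the chosen column into a one-column array.
        ReDim chunk(1 To endRow - startRow + 1, 1 To 1)
        For i = startRow To endRow
            chunk(i - startRow + 1, 1) = arr(i, colNumber)
        Next i

        ' Each batch lands directly below the previous one.
        firstCell.Offset(startRow - 1, 0).Resize(UBound(chunk, 1), 1).Value = chunk
    Next startRow
End Sub

' Assumed usage: "Sheet1" and the E1 target are placeholders.
Public Sub DemoWriteColumnInChunks()
    Dim data As Variant
    With ThisWorkbook.Worksheets("Sheet1")
        data = .Range("A1").CurrentRegion.Value
        WriteColumnInChunks data, 1, .Range("E1"), 1000
    End With
End Sub
```

Because the target row is derived from startRow, this version does not need to look up the last used row before each write.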
As @Tim suggested, the best way to slice a large array is to use a loop to copy the column.
In your case, though, most of the processing time is spent computing the maximum, since your code uses a nested loop.
If you want to significantly reduce the processing time, use a dictionary:
Sub Usage()
GetMaxByGroupTo _
sourceGroups:=ThisWorkbook.Worksheets("Sheet1").Range("A2:A100"), _
sourceValues:=ThisWorkbook.Worksheets("Sheet1").Range("B2:B100"), _
target:=ThisWorkbook.Worksheets("Sheet1").Range("C2")
End Sub
Sub GetMaxByGroupTo(sourceGroups As Range, sourceValues As Range, target As Range)
Dim dict As Object, groups(), values(), r As Long, max
Set dict = CreateObject("Scripting.Dictionary")
groups = sourceGroups.Value2
values = sourceValues.Value2
' store the maximum value of each group in a dictionary for an efficient lookup '
For r = LBound(groups) To UBound(groups)
max = dict(groups(r, 1)) ' Empty the first time a group is seen
If IsEmpty(max) Or values(r, 1) > max Then dict(groups(r, 1)) = values(r, 1)
Next
' build and copy the result array to the sheet '
For r = LBound(groups) To UBound(groups)
values(r, 1) = dict(groups(r, 1))
Next
target.Resize(UBound(groups), 1).Value2 = values
End Sub

Unique values two columns combobox vba

I need to display two columns, A and B, in a combobox with unique values. If two rows have the same A but not the same B, they are not duplicates; both columns need to match for a row to count as a duplicate. I found code that lists one column (A) with unique values, but I don't know how to add column B.
There's a picture of my data and how I want to display it in my ComboBox.
Here's the code:
Private Sub UserForm_Initialize()
Dim Cell As Range
Dim col As Variant
Dim Descending As Boolean
Dim Entries As Collection
Dim Items As Variant
Dim index As Long
Dim j As Long
Dim RngBeg As Range
Dim RngEnd As Range
Dim row As Long
Dim Sorted As Boolean
Dim temp As Variant
Dim test As Variant
Dim Wks As Worksheet
Set Wks = ThisWorkbook.Worksheets("Sheet1")
Set RngBeg = Wks.Range("A3")
col = RngBeg.Column
Set RngEnd = Wks.Cells(Rows.Count, col).End(xlUp)
Set Entries = New Collection
ReDim Items(0)
For row = RngBeg.row To RngEnd.row
Set Cell = Wks.Cells(row, col)
On Error Resume Next
test = Entries(Cell.Text)
If Err = 5 Then
Entries.Add index, Cell.Text
Items(index) = Cell.Text
index = index + 1
ReDim Preserve Items(index)
End If
On Error GoTo 0
Next row
index = index - 1
Descending = False
ReDim Preserve Items(index)
Do
Sorted = True
For j = 0 To index - 1
If Descending Xor StrComp(Items(j), Items(j + 1), vbTextCompare) = 1 Then
temp = Items(j + 1)
Items(j + 1) = Items(j)
Items(j) = temp
Sorted = False
End If
Next j
index = index - 1
Loop Until Sorted Or index < 1
ComboBox1.List = Items
End Sub
Any clue? Thanks!
Try this code, please. It assumes that unique definition means pairs of values from the two columns, on the same row, to be unique:
Sub UnicTwoValInTwoColumns()
Dim sh As Worksheet, arr As Variant, arrFin As Variant, countD As Long
Dim lastRow As Long, i As Long, j As Long, k As Long, boolDupl As Boolean
Set sh = ActiveSheet 'use here your sheet
'supposing that last row in column A:A is the same in column B:B
'If not, the last row for B:B will be calculated and then the higher will be chosen:
lastRow = sh.Range("A" & Rows.Count).End(xlUp).Row
ReDim arrFin(1 To 2, 1 To lastRow) 'redim the final array for maximum possible number of elements
arr = sh.Range("A3:B" & lastRow).Value 'put the range to be analyzed in an array
k = 1 'initialize the first array element number
For i = 1 To UBound(arr, 1) 'iterate between the array elements
boolDupl = False 'initialize the variable proving that the pair of data already in arrFin
For j = 1 To k 'iterate between the arrFin elements in order to check for duplicates
If arr(i, 1) & vbNullChar & arr(i, 2) = arrFin(1, j) & vbNullChar & arrFin(2, j) Then 'separator keeps "ab"+"c" from matching "a"+"bc"
boolDupl = True: Exit For 'if a duplicate is found the loop is exited
End If
Next j
If Not boolDupl Then 'load the arrFin only if a duplicate has not been found
arrFin(1, k) = arr(i, 1): arrFin(2, k) = arr(i, 2)
k = k + 1 'increment the (real) array number of elements
End If
Next
ReDim Preserve arrFin(1 To 2, 1 To k - 1) 'redim array at the real dimension (preserving values)
With Me.ComboBox1
.ColumnCount = 2 'be sure that combo has 2 columns to receive values
.List = WorksheetFunction.Transpose(arrFin) 'fill the combo with the array elements
End With
End Sub
You can paste the code into the form's Initialize event, or leave the Sub as it is, copy it into the form module, and just call it from that event. I would suggest the latter. If you have (or will have) something else in the event, it will be simpler to identify a problem if one occurs.
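For larger lists, the nested loop can be swapped for a dictionary lookup. This is a sketch of that alternative, not the answer's own method: UniquePairs and FillComboWithUniquePairs are hypothetical names, the data is assumed to start at A3 on "Sheet1" with at least one data row, and the comparison is case-sensitive. The second procedure has to live in the form module because it uses Me.ComboBox1.

```vba
' Hypothetical helper: returns the unique (column 1, column 2) pairs of a
' 1-based 2-D array as a zero-based n x 2 array, in order of first appearance.
Function UniquePairs(arr As Variant) As Variant
    Dim dict As Object
    Dim i As Long, n As Long
    Dim key As String
    Dim rowIndex As Variant
    Dim result() As Variant

    Set dict = CreateObject("Scripting.Dictionary")

    ' First pass: remember the row index of each pair's first occurrence.
    ' vbNullChar separates the two parts so "ab"+"c" differs from "a"+"bc".
    For i = 1 To UBound(arr, 1)
        key = CStr(arr(i, 1)) & vbNullChar & CStr(arr(i, 2))
        If Not dict.Exists(key) Then dict.Add key, i
    Next i

    ' Second pass: copy those rows into the output array.
    ReDim result(0 To dict.Count - 1, 0 To 1)
    For Each rowIndex In dict.Items
        result(n, 0) = arr(rowIndex, 1)
        result(n, 1) = arr(rowIndex, 2)
        n = n + 1
    Next rowIndex

    UniquePairs = result
End Function

' Assumed usage inside the UserForm module.
Private Sub FillComboWithUniquePairs()
    Dim sh As Worksheet, lastRow As Long
    Set sh = ThisWorkbook.Worksheets("Sheet1")
    lastRow = sh.Cells(sh.Rows.Count, "A").End(xlUp).Row

    With Me.ComboBox1
        .ColumnCount = 2
        .List = UniquePairs(sh.Range("A3:B" & lastRow).Value)
    End With
End Sub
```

The result is already laid out as rows by columns, so no Transpose call is needed before assigning it to .List.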

Remove duplicates from array using VBA

Assume I have a block of data in Excel 2010, 100 rows by 3 columns.
Column C contains some duplicates, say it starts off as
1, 1, 1, 2, 3, 4, 5, ..... , 97, 98
Using VBA, I would like to remove the duplicate rows so I am left with 98 rows and 3 columns.
1, 2, 3, ..... , 97, 98
I know there is a button in Excel 2010 to do that, but it interferes with the rest of my code and gives incorrect results.
Furthermore, I would like to do it in arrays, then paste the results on the worksheet, rather than methods such as Application.Worksheetfunction.countif(.....
So something like:
Dim myarray() As Variant
myarray = Cells(1, 1).CurrentRegion.Value
Dim a As Long
For a = 1 To UBound(myarray, 1)
'something here to
Next a
I answered a similar question. Here is the code I used:
Dim dict As Object
Dim rowCount As Long
Dim strVal As String
Set dict = CreateObject("Scripting.Dictionary")
rowCount = Sheet1.Range("A1").CurrentRegion.Rows.Count
'you can change the loop condition to iterate through the array rows instead
Do While rowCount > 1
strVal = Sheet1.Cells(rowCount, 1).Value2
If dict.exists(strVal) Then
Sheet1.Rows(rowCount).EntireRow.Delete
Else
'if doing this with an array, then add code in the Else block
' to assign values from this row to the array of unique values
dict.Add strVal, 0
End If
rowCount = rowCount - 1
Loop
Set dict = Nothing
If you want to use an array, then loop through the elements with the same conditional (if/else) statements. If the item doesn't exist in the dictionary, then you can add it to the dictionary and add the row values to another array.
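The array variant described above might look like the sketch below. The names KeepFirstRows and DemoKeepFirstRows are hypothetical, the input is assumed to be a 1-based 2-D array read from a multi-cell range, and keys are compared as strings (so the number 1 and the text "1" count as the same key). Unlike the bottom-up sheet loop, which keeps the last row seen from the bottom, this version keeps the first occurrence from the top.

```vba
' Hypothetical helper: keeps the first row for each distinct value found in
' column keyCol of a 1-based 2-D array and returns the kept rows.
Function KeepFirstRows(data As Variant, ByVal keyCol As Long) As Variant
    Dim dict As Object
    Dim r As Long, c As Long, outRow As Long
    Dim rowIndex As Variant
    Dim result() As Variant

    Set dict = CreateObject("Scripting.Dictionary")

    ' Same exists/add test as the sheet loop, applied to the array instead.
    For r = 1 To UBound(data, 1)
        If Not dict.Exists(CStr(data(r, keyCol))) Then
            dict.Add CStr(data(r, keyCol)), r   ' remember the first occurrence
        End If
    Next r

    ' Copy each remembered row into a second array of unique rows.
    ReDim result(1 To dict.Count, 1 To UBound(data, 2))
    For Each rowIndex In dict.Items
        outRow = outRow + 1
        For c = 1 To UBound(data, 2)
            result(outRow, c) = data(rowIndex, c)
        Next c
    Next rowIndex

    KeepFirstRows = result
End Function

' Assumed usage: Sheet1 is the sheet's code name, column 3 holds the duplicates.
Sub DemoKeepFirstRows()
    Dim data As Variant, uniques As Variant
    With Sheet1.Range("A1").CurrentRegion
        data = .Value
        uniques = KeepFirstRows(data, 3)
        .ClearContents
        .Resize(UBound(uniques, 1), UBound(uniques, 2)).Value = uniques
    End With
End Sub
```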
Honestly, I think the most efficient way is to adapt code you'd get from the macro recorder. You can perform the above function in one line:
Sheet1.UsedRange.RemoveDuplicates Columns:=3, Header:=xlYes
Function eliminateDuplicate(poArr As Variant) As Variant
Dim poArrNoDup()
dupArrIndex = -1
For i = LBound(poArr) To UBound(poArr)
dupBool = False
For j = LBound(poArr) To i
If poArr(i) = poArr(j) And Not i = j Then
dupBool = True
End If
Next j
If dupBool = False Then
dupArrIndex = dupArrIndex + 1
ReDim Preserve poArrNoDup(dupArrIndex)
poArrNoDup(dupArrIndex) = poArr(i)
End If
Next i
eliminateDuplicate = poArrNoDup
End Function
Simple function to remove duplicates from a 1D array
Private Function DeDupeArray(vArray As Variant) As Variant
Dim oDict As Object, i As Long
Set oDict = CreateObject("Scripting.Dictionary")
For i = LBound(vArray) To UBound(vArray)
oDict(vArray(i)) = True
Next
DeDupeArray = oDict.keys()
End Function
Edit:
With stdVBA (a library largely maintained by myself) you can use:
uniqueValues = stdEnumerator.CreateFromArray(myArray).Unique().AsArray()
An improvement on @RBILLC's and @radoslav006's answers: this version searches the array that already has the duplicates removed, so it checks fewer values to find a duplicate.
Public Function RemoveDuplicatesFromArray(sourceArray As Variant)
Dim duplicateFound As Boolean
Dim arrayIndex As Long, i As Long, j As Long
Dim deduplicatedArray() As Variant
arrayIndex = -1
For i = LBound(sourceArray) To UBound(sourceArray)
duplicateFound = False
' Search only the values kept so far (skipped while arrayIndex = -1; assumes the default Option Base 0).
For j = 0 To arrayIndex
If sourceArray(i) = deduplicatedArray(j) Then
duplicateFound = True
Exit For
End If
Next j
If duplicateFound = False Then
arrayIndex = arrayIndex + 1
ReDim Preserve deduplicatedArray(arrayIndex)
deduplicatedArray(arrayIndex) = sourceArray(i)
End If
Next i
RemoveDuplicatesFromArray = deduplicatedArray
End Function
Here's another approach for working with an array:
Sub tester()
Dim arr, arrout
arr = Range("A1").CurrentRegion.Value 'collect the input array
arrout = UniqueRows(arr) 'get only unique rows
Range("H1").Resize(UBound(arrout, 1), UBound(arrout, 2)).Value = arrout
End Sub
Function UniqueRows(arrIn As Variant) As Variant
Dim keys, rw As Long, col As Long, k, sep, arrout
Dim dict As Object, lbr As Long, lbc As Long, ubr As Long, ubc As Long, rwOut As Long
Set dict = CreateObject("scripting.dictionary")
'input array bounds
lbr = LBound(arrIn, 1)
ubr = UBound(arrIn, 1)
lbc = LBound(arrIn, 2)
ubc = UBound(arrIn, 2)
ReDim keys(lbr To ubr)
'First pass:collect all the row "keys" in an array
' and unique keys in a dictionary
For rw = lbr To ubr
k = "": sep = ""
For col = lbc To ubc
k = k & sep & arrIn(rw, col)
sep = Chr(0)
Next col
keys(rw) = k 'collect key for this row
dict(k) = True 'just collecting unique keys
Next rw
'Resize output array to # of unique rows
ReDim arrout(lbr To dict.Count + (lbr - 1), lbc To ubc)
rwOut = lbr
'Second pass: copy each unique row to the output array
For rw = lbr To ubr
If dict(keys(rw)) Then 'not yet output?
For col = lbc To ubc 'copying this row over to output...
arrout(rwOut, col) = arrIn(rw, col)
Next col
rwOut = rwOut + 1 'increment output "row"
dict(keys(rw)) = False 'flag this key as copied
End If
Next rw
UniqueRows = arrout
End Function
The answer from @RBILLC can easily be improved by adding an Exit For inside the inner loop:
Function eliminateDuplicate(poArr As Variant) As Variant
Dim poArrNoDup()
dupArrIndex = -1
For i = LBound(poArr) To UBound(poArr)
dupBool = False
For j = LBound(poArr) To i
If poArr(i) = poArr(j) And Not i = j Then
dupBool = True
Exit For
End If
Next j
If dupBool = False Then
dupArrIndex = dupArrIndex + 1
ReDim Preserve poArrNoDup(dupArrIndex)
poArrNoDup(dupArrIndex) = poArr(i)
End If
Next i
eliminateDuplicate = poArrNoDup
End Function
I think this is really a case for using excel's native functions, at least for the initial array acquisition, and I don't think there's any simpler way to do it. This sub will output the unique values starting in column 5. I assumed that the target range was empty, so if it's not, change r and c.
Sub testUniques()
Dim arr, r As Long, c As Long, h As Long, w As Long
Dim this As Worksheet: Set this = ActiveSheet
arr = Application.Unique(this.Cells(1, 1).CurrentRegion)
r = 1
c = 5
h = UBound(arr, 1) - 1
w = UBound(arr, 2) - 1
this.Range(this.Cells(r, c), this.Cells(r + h, c + w)) = arr
End Sub
I know this is old, but here's something I used to copy duplicate values to another range so that I could see them quickly to establish data integrity for a database I was standing up from various spreadsheets. To make the procedure delete the duplicates it would be as simple as replacing the dupRng lines with Cell.Delete Shift:=xlToLeft or something to that effect.
I haven't tested that personally, but it should work.
Sub PartCompare()
Dim partRng As Range, partArr() As Variant, i As Integer
Dim Cell As Range, lrow As Integer
lrow = ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
i = 0
Set partRng = ThisWorkbook.Worksheets("Sheet1").Range(Cells(1, 1), Cells(lrow, 1))
For Each Cell In partRng.Cells
ReDim Preserve partArr(i)
partArr(i) = Cell.Value
i = i + 1
Next
Dim dupRng As Range, j As Integer, x As Integer, c As Integer
Set dupRng = ThisWorkbook.Worksheets("Sheet1").Range("D1")
x = 0
c = 1
For Each Cell In partRng.Cells
For j = c To UBound(partArr)
If partArr(j) = Cell.Value Then
dupRng.Offset(x, 0).Value = Cell.Value
dupRng.Offset(x, 1).Value = Cell.Address()
x = x + 1
Exit For
End If
Next j
c = c + 1
Next Cell
End Sub
Remove duplicates (plus related row items) from array
As the OP wanted a VBA solution close to RemoveDuplicates, I demonstrate an array approach using a dictionary to get not the unique items per se (dict.keys), but the related row indices of their first occurrences (dict.items).
These are used to retain the whole row data via the procedure LeaveUniques, profiting from the advanced possibilities of the Application.Index() function (cf. Some peculiarities of the Application.Index function).
Example Call
Sub ExampleCall()
'[0]define range and assign data to 1-based 2-dim datafield
With Sheet1 ' << reference to your project's sheet Code(Name)
Dim lastRow: lastRow = .Cells(.Rows.Count, "C").End(xlUp).Row
Dim rng: Set rng = .Range("C2:E" & lastRow)
End With
Dim data: data = rng ' assign data to 2-dim datafield
'[1]get uniques (column 1) and remove duplicate rows
LeaveUniques data ' << call procedure LeaveUniques (c.f. RemoveDuplicates)
'[2]overwrite original range
rng.Clear
rng.Resize(UBound(data), UBound(data, 2)) = data
End Sub
Procedure LeaveUniques
Sub LeaveUniques(ByRef data As Variant, Optional ByVal colNum As Long = 1)
'Purpose: procedure removes duplicates of given column number in entire array
data = Application.Index(data, uniqueRowIndices(data, colNum), nColIndices(UBound(data, 2)))
End Sub
Help functions to LeaveUniques
Function uniqueRowIndices(data, Optional ByVal colNum As Long = 1)
'Purpose: return data index numbers referring to uniques
'a) set late bound dictionary to memory
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
'b) slice e.g. first data column (colNum = 1)
Dim colData
colData = Application.Index(data, 0, colNum)
'c) fill dictionary with uniques referring to first occurrences
Dim i As Long
For i = 1 To UBound(colData)
If Not dict.exists(colData(i, 1)) Then dict(colData(i, 1)) = i
Next
'd) return 2-dim array of valid unique 1-based index numbers
uniqueRowIndices = Application.Transpose(dict.items)
End Function
Function nColIndices(ByVal n As Long)
'Purpose: return "flat" array of n column indices, e.g. for n = 3 ~> Array(1, 2, 3)
nColIndices = Application.Transpose(Evaluate("row(1:" & n & ")"))
End Function

Populate unique values into a VBA array from Excel

Can anyone give me VBA code that will take a range (row or column) from an Excel sheet and populate a list/array with the unique values,
i.e.:
table
table
chair
table
stool
stool
stool
chair
When the macro runs, it would create an array something like:
fur[0]=table
fur[1]=chair
fur[2]=stool
Sub GetUniqueAndCount()
Dim d As Object, c As Range, k, tmp As String
Set d = CreateObject("scripting.dictionary")
For Each c In Selection
tmp = Trim(c.Value)
If Len(tmp) > 0 Then d(tmp) = d(tmp) + 1
Next c
For Each k In d.keys
Debug.Print k, d(k)
Next k
End Sub
In this situation I always use code like this (just make sure the delimiter you've chosen does not occur in the search range):
Dim tmp As String
Dim arr() As String
If Not Selection Is Nothing Then
For Each cell In Selection
If (cell <> "") And (InStr("|" & tmp, "|" & cell & "|") = 0) Then 'match whole entries only, not substrings
tmp = tmp & cell & "|"
End If
Next cell
End If
If Len(tmp) > 0 Then tmp = Left(tmp, Len(tmp) - 1)
arr = Split(tmp, "|")
Combining the Dictionary approach from Tim with the variant array from Jean_Francois below.
The array you want is in objDict.keys
Sub A_Unique_B()
Dim X
Dim objDict As Object
Dim lngRow As Long
Set objDict = CreateObject("Scripting.Dictionary")
X = Application.Transpose(Range([a1], Cells(Rows.Count, "A").End(xlUp)))
For lngRow = 1 To UBound(X, 1)
objDict(X(lngRow)) = 1
Next
Range("B1:B" & objDict.Count) = Application.Transpose(objDict.keys)
End Sub
This is the old-school way of doing it.
It will execute faster than looping through cells (e.g. For Each cell In Selection) and will be reliable no matter what, as long as you have a rectangular selection (i.e. not Ctrl-selecting a bunch of random cells).
Sub FindUnique()
Dim varIn As Variant
Dim varUnique As Variant
Dim iInCol As Long
Dim iInRow As Long
Dim iUnique As Long
Dim nUnique As Long
Dim isUnique As Boolean
varIn = Selection
ReDim varUnique(1 To UBound(varIn, 1) * UBound(varIn, 2))
nUnique = 0
For iInRow = LBound(varIn, 1) To UBound(varIn, 1)
For iInCol = LBound(varIn, 2) To UBound(varIn, 2)
isUnique = True
For iUnique = 1 To nUnique
If varIn(iInRow, iInCol) = varUnique(iUnique) Then
isUnique = False
Exit For
End If
Next iUnique
If isUnique = True Then
nUnique = nUnique + 1
varUnique(nUnique) = varIn(iInRow, iInCol)
End If
Next iInCol
Next iInRow
'// varUnique now contains only the unique values.
'// Trim off the empty elements:
ReDim Preserve varUnique(1 To nUnique)
End Sub
Profiting from the MS Excel 365 function UNIQUE()
In order to enrich the valid solutions above:
Sub ExampleCall()
Dim rng As Range: Set rng = Sheet1.Range("A2:A11") ' << change to your sheet's Code(Name)
Dim a: a = rng
a = getUniques(a)
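Debug.Print Join(a, ", ") ' print the unique values (stands in for arrInfo, which this answer does not define)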
End Sub
Function getUniques(a, Optional ZeroBased As Boolean = True)
Dim tmp: tmp = Application.Transpose(WorksheetFunction.Unique(a))
If ZeroBased Then ReDim Preserve tmp(0 To UBound(tmp) - 1)
getUniques = tmp
End Function
OK I did it finally:
Sub CountUniqueRecords()
Dim SrcArray() As Variant, UniqueArray() As Variant, UniqueNo As Long
Dim i As Long, j As Long, k As Long
' SrcArray (1-based, 1-D) must be filled with the values to check before this point.
ReDim UniqueArray(1 To 1)
k = UBound(SrcArray)
For i = 1 To k
For j = 1 To UniqueNo
If SrcArray(i) = UniqueArray(j) Then GoTo Nx
Next j
UniqueNo = UniqueNo + 1
ReDim Preserve UniqueArray(1 To UniqueNo)
UniqueArray(UniqueNo) = SrcArray(i)
Nx:
Next i
MsgBox UniqueNo
End Sub
one more way ...
Sub get_unique()
Dim unique_string As String
lr = Sheets("data").Cells(Sheets("data").Rows.Count, 1).End(xlUp).Row
Set range1 = Sheets("data").Range("A2:A" & lr)
For Each cel In range1
If InStr(unique_string, cel.Value) = 0 Then
unique_string = unique_string & cel.Value & ","
End If
Next
End Sub
This VBA function returns an array of distinct values when passed either a range or a 2D array source.
It defaults to processing the first column of the source, but you can optionally choose another column.
I wrote a LinkedIn article about it.
Function DistinctVals(a, Optional col = 1)
Dim i&, v: v = a
With CreateObject("Scripting.Dictionary")
For i = 1 To UBound(v): .Item(v(i, col)) = 1: Next
DistinctVals = Application.Transpose(.Keys)
End With
End Function
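A short usage sketch may make the shape of the result clearer. The function is copied here only so the example runs on its own; the sample names and years are hypothetical, and DemoDistinctVals is a made-up name.

```vba
Option Explicit

' Copy of the function above, repeated so this sketch is self-contained.
Function DistinctVals(a, Optional col = 1)
    Dim i&, v: v = a
    With CreateObject("Scripting.Dictionary")
        For i = 1 To UBound(v): .Item(v(i, col)) = 1: Next
        DistinctVals = Application.Transpose(.Keys)
    End With
End Function

Sub DemoDistinctVals()
    ' Hypothetical sample data: names in column 1, years in column 2.
    Dim src(1 To 4, 1 To 2) As Variant
    src(1, 1) = "Ann": src(1, 2) = 2019
    src(2, 1) = "Bob": src(2, 2) = 2019
    src(3, 1) = "Ann": src(3, 2) = 2020
    src(4, 1) = "Bob": src(4, 2) = 2020

    Dim names As Variant, years As Variant
    names = DistinctVals(src)       ' first column by default
    years = DistinctVals(src, 2)    ' second column

    ' Each result is a 1-based, single-column 2D array.
    Debug.Print UBound(names, 1); names(1, 1); names(2, 1)
    Debug.Print UBound(years, 1); years(1, 1); years(2, 1)
End Sub
```

Because the result is a single-column 2D array, it can be written straight to a column, for example Range("E1").Resize(UBound(names, 1)).Value = names (the target cell is an assumption).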
The old-school method was my favourite option, thank you, and it was indeed fast, although I didn't use ReDim. Here is my real-world example, where I accumulate values for each unique "key" found in a column into an array (say the key is an employee and the values are hours worked per day). Then I put each key with its final totals into a totals area on the active sheet. I've commented extensively for anyone who wants painful detail on what is happening here. The code does only limited error checking.
Sub GetActualTotals()
'
' GetActualTotals Macro
'
' This macro accumulates values for each unique employee from the active
' spreadsheet.
'
' History
' October 2016 - Version 1
'
' Invocation
' I created a button labeled "Get Totals" on the Active Sheet that invokes
' this macro.
'
Dim ResourceName As String
Dim TotalHours As Double
Dim TotalPercent As Double
Dim IsUnique As Boolean
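' Each variable needs its own type; "Dim a, b As Long" would leave a as a Variant.
Dim FirstRow As Long, LastRow As Long, LastColumn As Long, LastResource As Long, nUnique As Long
Dim CurResource As Long, CurrentRow As Long, i As Long, j As Long
Dim Resource(1000, 2) As Variant
Dim Rng As String, r As Range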
'
' INITIALIZATIONS
'
' These are index numbers for the Resource array
'
Const RName = 0
Const TotHours = 1
Const TotPercent = 2
'
' Set the maximum number of resources we'll
' process.
'
Const ResourceLimit = 1000
'
' We are counting on there being no unintended data
' in the spreadsheet.
'
' It won't matter if the cells are empty though. It just
' may take longer to run the macro.
' But if there is data where this macro does not expect it,
' assume unpredictable results.
'
' There are some hardcoded values used.
' This macro just happens to expect the names to be in Column C (or 3).
'
' Get the last row in the spreadsheet:
'
LastRow = Cells.Find(What:="*", _
After:=Range("C1"), _
LookAt:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row
'
' Furthermore, this macro banks on the first actual name to be in C6.
' so if the last row is row 65, the range we'll work with
' will evaluate to "C6:C65"
'
FirstRow = 6
Rng = "C" & FirstRow & ":C" & LastRow
Set r = Range(Rng)
'
' Initialize the resource array to be empty (even though we don't really
' need to but I'm old school).
'
For CurResource = 0 To ResourceLimit
Resource(CurResource, RName) = ""
Resource(CurResource, TotHours) = 0
Resource(CurResource, TotPercent) = 0
Next CurResource
'
' Start the resource counter at 0. The counter will represent the number of
' unique entries.
'
nUnique = 0
'
' LET'S GO
'
' Loop from the first relative row to the last relative row
' to process all the cells in the spreadsheet we are interested in.
' The range C6:C65 has LastRow - FirstRow + 1 = 60 rows.
'
For i = 1 To LastRow - FirstRow + 1
'
' Loop here for all unique entries. For any
' new unique entry, that array element will be
' initialized in the second if statement.
'
IsUnique = True
For j = 1 To nUnique
'
' If the current row element has a resource name and is already
' in the resource array, then accumulate the totals for that
' Resource Name. We then have to set IsUnique to false and
' exit the for loop to make sure we don't populate
' a new array element in the next if statement.
'
If r.Cells(i, 1).Value = Resource(j, RName) Then
IsUnique = False
Resource(j, TotHours) = Resource(j, TotHours) + _
r.Cells(i, 4).Value
Resource(j, TotPercent) = Resource(j, TotPercent) + _
r.Cells(i,5).Value
Exit For
End If
Next j
'
' If the resource name is unique then copy the initial
' values we find into the next resource array element.
' I ignore any null cells. (If the cell has a blank you might
' want to add a Trim to the cell). Not much error checking for
' the numerical values either.
'
If ((IsUnique) And (r.Cells(i, 1).Value <> "")) Then
nUnique = nUnique + 1
Resource(nUnique, RName) = r.Cells(i, 1).Value
Resource(nUnique, TotHours) = Resource(nUnique, TotHours) + _
r.Cells(i, 4).Value
Resource(nUnique, TotPercent) = Resource(nUnique, TotPercent) + _
r.Cells(i, 5).Value
End If
Next i
'
' Done processing all rows
'
' (For readability) Set the last resource counter to the last value of
' nUnique.
' Set the current row to the first relative row in the range (r=the range).
'
LastResource = nUnique
CurrentRow = 1
'
' Populate the destination cells with the accumulated values for
' each unique resource name.
'
For CurResource = 1 To LastResource
r.Cells(CurrentRow, 7).Value = Resource(CurResource, RName)
r.Cells(CurrentRow, 8).Value = Resource(CurResource, TotHours)
r.Cells(CurrentRow, 9).Value = Resource(CurResource, TotPercent)
CurrentRow = CurrentRow + 1
Next CurResource
End Sub
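As an alternative to the fixed-size array, the same accumulate-per-key idea can be sketched with a Scripting.Dictionary, which removes the inner search loop and the 1000-entry limit. This is not the macro above, just a minimal illustration: SumByKey and DemoSumByKey are made-up names, and the sample names and hours are hypothetical.

```vba
Option Explicit

' Hypothetical helper: sums the hours for each distinct key.
' keys and hours are 1D arrays with the same bounds.
Function SumByKey(keys As Variant, hours As Variant) As Object
    Dim totals As Object
    Dim i As Long
    Set totals = CreateObject("Scripting.Dictionary")
    For i = LBound(keys) To UBound(keys)
        If Len(keys(i)) > 0 Then
            ' Reading a key that does not exist yet returns Empty, which adds as 0.
            totals(keys(i)) = totals(keys(i)) + hours(i)
        End If
    Next i
    Set SumByKey = totals
End Function

Sub DemoSumByKey()
    Dim totals As Object
    Dim k As Variant
    ' Hypothetical sample data: employee names and hours worked.
    Set totals = SumByKey(Array("Ann", "Bob", "Ann"), Array(8, 6, 4))
    For Each k In totals.Keys
        Debug.Print k & ": " & totals(k)
    Next k
End Sub
```

With the sample data, Ann accumulates 12 hours and Bob 6. Blank keys are skipped, as in the macro above.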
The VBA script below collects all unique values from cell B5 down to the last used cell in column B. Each non-blank value it finds is stored as a key of a dictionary (objDict), and the keys are then written to column B of the index sheet.
Private Const SHT_MASTER = "MASTER"
Private Const SHT_INST_INDEX = "InstrumentIndex"
Sub UniqueList()
Dim Xyber
Dim objDict As Object
Dim lngRow As Long
Sheets(SHT_MASTER).Activate
Xyber = Application.Transpose(Sheets(SHT_MASTER).Range([b5], Cells(Rows.count, "B").End(xlUp)))
Sheets(SHT_INST_INDEX).Activate
Set objDict = CreateObject("Scripting.Dictionary")
For lngRow = 1 To UBound(Xyber, 1)
If Len(Xyber(lngRow)) > 0 Then objDict(Xyber(lngRow)) = 1
Next
Sheets(SHT_INST_INDEX).Range("B1:B" & objDict.count) = Application.Transpose(objDict.keys)
End Sub
I have tested this solution and documented it with some screenshots. Here is the link:
http://xybernetics.com/techtalk/excelvba-getarrayofuniquevaluesfromspecificcolumn/
If you don't mind using the Variant data type, you can use the built-in worksheet function Unique (available in Excel 365) as shown.
Sub unique_results_to_array()
Dim rng_data As Range
Set rng_data = ActiveSheet.Range("A1:A10") 'enter the range of data here
Dim my_arr() As Variant
Dim first_val As Variant, second_val As Variant, third_val As Variant
'the result is indexed (row, column) starting at 1, as the lines below assume
my_arr = WorksheetFunction.Unique(rng_data)
first_val = my_arr(1, 1)
second_val = my_arr(2, 1)
third_val = my_arr(3, 1) 'etc...
End Sub
If you are not interested in the count function, you can simplify the dictionary approach by using an empty string for the dictionary value instead of a counter. The following code assumes the first cell containing data is "A1". Alternatively, you could use the Selection (though I understand that is generally frowned upon) or the sheet's UsedRange property as your range.
Both of the following examples assume that you want to omit blank values from your array of unique values.
Note that to use Dictionary objects as follows, you must have the Microsoft Scripting Runtime library enabled in your references. Also note that by declaring dict As New Dictionary instead of As Dictionary at the beginning, you can skip the step of setting it to a new Scripting Dictionary later. Also, dictionary keys must be unique, and assigning a value to a key that already exists simply overwrites the value without raising an error, so there is no risk of ending up with duplicate keys.
Sub GetUniqueValuesInRange()
Dim cll As Range
Dim rng As Range
Dim dict As New Dictionary
Dim vArray As Variant
Set rng = Range("A1").CurrentRegion.Columns(1)
For Each cll In rng.Cells
If Len(cll.Value) > 0 Then
dict(cll.Value) = ""
End If
Next cll
vArray = dict.Keys
End Sub
The prior example is the slower method: it is generally preferable to move the values into an array first, so that all processing happens in memory. The following should run faster for larger data sets:
Sub GetUniqueValuesInRange2()
Dim vFullArray As Variant
Dim var As Variant
Dim dict As New Dictionary
Dim vUniqueArray As Variant
vFullArray = Range("A1").CurrentRegion.Columns(1).Value
For Each var In vFullArray
If Len(var) > 0 Then
dict(var) = ""
End If
Next var
vUniqueArray = dict.Keys
End Sub
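If you would rather not add the Microsoft Scripting Runtime reference mentioned above, the same approach can be sketched with late binding through CreateObject. UniqueNonBlank and DemoUniqueNonBlank are made-up names, and the sample values are hypothetical.

```vba
Option Explicit

' Hypothetical helper: distinct non-blank values of a 2D Variant array.
' Late binding (CreateObject) means no library reference is required.
Function UniqueNonBlank(vFullArray As Variant) As Variant
    Dim dict As Object
    Dim var As Variant
    Set dict = CreateObject("Scripting.Dictionary")
    For Each var In vFullArray
        If Len(var) > 0 Then
            dict(var) = ""
        End If
    Next var
    UniqueNonBlank = dict.Keys   ' zero-based 1D array
End Function

Sub DemoUniqueNonBlank()
    ' Hypothetical sample data standing in for a column read from the sheet.
    Dim src(1 To 4, 1 To 1) As Variant
    src(1, 1) = "x"
    src(2, 1) = ""
    src(3, 1) = "y"
    src(4, 1) = "x"
    Debug.Print Join(UniqueNonBlank(src), ", ")   ' prints: x, y
End Sub
```

On a worksheet the call would be UniqueNonBlank(Range("A1").CurrentRegion.Columns(1).Value), assuming the column has more than one cell so that Value returns an array.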

Resources