Quicker way to get all unique values of a column in VBA? - excel

Is there a faster way to do this?
Set data = ws.UsedRange
Set unique = CreateObject("Scripting.Dictionary")
On Error Resume Next
For x = 1 To data.Rows.Count
unique.Add data(x, some_column_number).Value, 1
Next x
On Error GoTo 0
At this point unique.keys gets what I need, but the loop itself seems to be very slow for files that have tens of thousands of records (whereas this wouldn't be a problem at all in a language like Python or C++ especially).

Use Excel's AdvancedFilter function to do this.
Using Excels inbuilt C++ is the fastest way with smaller datasets, using the dictionary is faster for larger datasets. For example:
Copy values in Column A and insert the unique values in column B:
Range("A1:A6").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("B1"), Unique:=True
It works with multiple columns too:
Range("A1:B4").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("D1:E1"), Unique:=True
Be careful with multiple columns as it doesn't always work as expected. In those cases I resort to removing duplicates which works by choosing a selection of columns to base uniqueness. Ref: MSDN - Find and remove duplicates
Here I remove duplicate columns based on the third column:
Range("A1:C4").RemoveDuplicates Columns:=3, Header:=xlNo
Here I remove duplicate columns based on the second and third column:
Range("A1:C4").RemoveDuplicates Columns:=Array(2, 3), Header:=xlNo

Loading the values in an array would be much faster:
Dim data(), dict As Object, r As Long
Set dict = CreateObject("Scripting.Dictionary")
data = ActiveSheet.UsedRange.Columns(1).Value
For r = 1 To UBound(data)
dict(data(r, some_column_number)) = Empty
Next
data = WorksheetFunction.Transpose(dict.keys())
You should also consider early binding for the Scripting.Dictionary:
Dim dict As New Scripting.Dictionary ' requires `Microsoft Scripting Runtime` '
Note that using a dictionary is way faster than Range.AdvancedFilter on large data sets.
As a bonus, here's a procedure similare to Range.RemoveDuplicates to remove duplicates from a 2D array:
Public Sub RemoveDuplicates(data, ParamArray columns())
Dim ret(), indexes(), ids(), r As Long, c As Long
Dim dict As New Scripting.Dictionary ' requires `Microsoft Scripting Runtime` '
If VarType(data) And vbArray Then Else Err.Raise 5, , "Argument data is not an array"
ReDim ids(LBound(columns) To UBound(columns))
For r = LBound(data) To UBound(data) ' each row '
For c = LBound(columns) To UBound(columns) ' each column '
ids(c) = data(r, columns(c)) ' build id for the row
Next
dict(Join$(ids, ChrW(-1))) = r ' associate the row index to the id '
Next
indexes = dict.Items()
ReDim ret(LBound(data) To LBound(data) + dict.Count - 1, LBound(data, 2) To UBound(data, 2))
For c = LBound(ret, 2) To UBound(ret, 2) ' each column '
For r = LBound(ret) To UBound(ret) ' each row / unique id '
ret(r, c) = data(indexes(r - 1), c) ' copy the value at index '
Next
Next
data = ret
End Sub

PowerShell is a very powerful and efficient tool. This is cheating a little, but shelling PowerShell via VBA opens up lots of options
The bulk of the code below is simply to save the current sheet as a csv file. The output is another csv file with just the unique values
Sub AnotherWay()
Dim strPath As String
Dim strPath2 As String
Application.DisplayAlerts = False
strPath = "C:\Temp\test.csv"
strPath2 = "C:\Temp\testout.csv"
ActiveWorkbook.SaveAs strPath, xlCSV
x = Shell("powershell.exe $csv = import-csv -Path """ & strPath & """ -Header A | Select-Object -Unique A | Export-Csv """ & strPath2 & """ -NoTypeInformation", 0)
Application.DisplayAlerts = True
End Sub

it's funny because i've had to read these instructions over and over again, but it think i worked out a much faster way to do this:
Set data = ws.UsedRange
dim unique as variant
unique = WorksheetFunction.Unique(data)
And then you can do whatever you want with the unique array such as iterating it:
For i = LBound(unique) To UBound(unique)
Range("Q" & i) = indexes(i, 1)
Next

Try this
Option Explicit
Sub UniqueValues()
Dim ws As Worksheet
Dim uniqueRng As Range
Dim myCol As Long
myCol = 5 '<== set it as per your needs
Set ws = ThisWorkbook.Worksheets("unique") '<== set it as per your needs
Set uniqueRng = GetUniqueValues(ws, myCol)
End Sub
Function GetUniqueValues(ws As Worksheet, col As Long) As Range
Dim firstRow As Long
With ws
.Columns(col).RemoveDuplicates Columns:=Array(1), header:=xlNo
firstRow = 1
If IsEmpty(.Cells(1, col)) Then firstRow = .Cells(1, col).End(xlDown).row
Set GetUniqueValues = Range(.Cells(firstRow, col), .Cells(.Rows.Count, col).End(xlUp))
End With
End Function
it should be quite fast and without the drawback NeepNeepNeep told about

Related

Excel VBA ListBox in User Form Populate data from Sheet Range, add row by row after evaluating for a condition

I am trying to write a VBA code where I want to populate DATA from a worksheet Range A to AQ spanning over multiple Rows. AQ contains Value "Open" or "Closed". I want to get the rows where AQ value is closed. I tried using the AutoFilter. This is working fine to an extent. But I have to use 2 For loops. One for Each Row and another for Each Column to populate Row wise, column by column into the list box
My Code as follows:
Note : Actual contents start from 6th Row where 6 contains the headers and data starts from 7th Row
Dim i As Long
Dim rowRange As Range
Dim AllData(1 To 1000, 1 To 43) As String
lstRecords.ColumnCount = 43
Set shDSR = mydata1.Sheets("DSR")
last_Row = shDSR.Cells(Rows.Count, 1).End(xlUp).Row
shDSR.AutoFilterMode = False
shDSR.Range("A6:AQ" & last_Row).AutoFilter Field:=43, Criteria1:="CLOSED"
Set rng = shDSR.Range("A6:AQ" & last_Row).SpecialCells(xlCellTypeVisible)
Dim filtrRow() As String
Dim rowCnt As Long
'Me.lstRecords.Clear
rowCnt = 0
If rng.Count > 0 Then
Me.lstRecords.Clear
Me.lstRecords.ColumnCount = rng.Columns.Count
For Each Row In rng.Rows
Me.lstRecords.AddItem
rowCnt = rowCnt +1
filterRow = Range(Row.Address)
'Me.lstRecords.List() = filterRow ''This throws error Type Mismatch so not using
For i = 1 To Row.Columns.Count
AllData(rowCnt, i) = Row.Cells(1, i).Value ''Move to Array
Me.lstRecords.List(rowCnt - 1, i - 1) = filterRow(1, i)'Buggy error when i = 11
Next
Next
'' Following segment works. Add data to Array and then populate ListBox from Array
Me.lstRecords.List() = AllData
Else
MsgBox "No data matches the filter criteria."
End If
Above Code has both approaches
a) Trying to load directly from excel Range (actually using filterRow, but can also directly use range with same issue). But, this approach stops always when i=11 with Invalid property error. I tried changing the data contents etc still same issue
Another Issue when Not taking the array based approach, only one line is added, so in affect only last line is available in the list box
b) Using the AllData array. I load all the row data (matching criteria) into the array and finally populate the listbox from array. THIS WORKS. But I do not like this approach
Can some one please point out where it is going wrong.
Thanks in advance
Problem is that filters create a non contiguous range consisting of areas which you have to iterate separately.
Option Explicit
Sub demo()
Dim mydata1 As Workbook, shDSR As Worksheet
Dim rng As Range, a As Range, r As Range
Dim last_row As Long, n As Long
Dim i As Long, rowCnt As Long
Dim ListData() As String
' change this
Set mydata1 = ThisWorkbook
Set shDSR = mydata1.Sheets("DSR")
With shDSR
.AutoFilterMode = False
last_row = .Cells(.Rows.Count, "AQ").End(xlUp).Row
.Range("A6:AQ" & last_row).AutoFilter Field:=43, Criteria1:="CLOSED"
Set rng = .Range("A6:AQ" & last_row).SpecialCells(xlCellTypeVisible)
.AutoFilterMode = False
End With
' clear listbox
With Me.lstRecords
.Clear
.ColumnCount = rng.Columns.Count
End With
'iterate areas and rows to count visible rows
For Each a In rng.Areas
n = n + a.Rows.Count
Next
rowCnt = 0
If n > 1 Then
' size array
ReDim ListData(1 To n, 1 To rng.Columns.Count)
' fill array
For Each a In rng.Areas
For Each r In a.Rows
rowCnt = rowCnt + 1
For i = 1 To UBound(ListData, 2)
ListData(rowCnt, i) = r.Cells(1, i).Value ''Move to Array
Next
Next
Next
' populate ListBox from Array
Me.lstRecords.List() = ListData
Else
MsgBox "No data matches the filter criteria."
End If
End Sub

VBA Filter Column Store Unique Values of Another Column

I'm trying to accomplish the following.
Filter Column A
Grab Unique values from Column B
So given the following table...
I'd like to filter for "One" in column A and get an array back that I can paste unto column C like this...
I've tried to use dictionaries but I have little understanding of how that works. There can be thousands of rows so speed can be an issue and I'd rather not loop through each if it's not necessary.
I've seen solutions that bring back unique values of a column using advanced filter but never a combination of filtering one column and then using the filtered results to get a unique list of values.
Example of code (partial) I've tried:
On Error Resume Next
enterpriseReportSht.ShowAllData
On Error GoTo 0
With enterpriseReportSht
.AutoFilterMode = False
With .Range(Cells(1, 1).Address, Cells(entRptLR, entRptLC).Address)
.AutoFilter Field:=manLevel2CN, Criteria1:=userInputsArr(i, manLevel2InputCN)
'.SpecialCells(xlCellTypeVisible).Copy Destination:=resultsSht.Range("A1")
End With
End With
filteredColArr = enterpriseReportSht.UsedRange.columns(manLevel4CN).Value
RemoveDuplicatesFromArray (filteredColArr)
with this function:
Public Function RemoveDuplicatesFromArray(sourceArray As Variant)
Dim duplicateFound As Boolean
Dim arrayIndex As Integer, i As Integer, j As Integer
Dim deduplicatedArray() As Variant
arrayIndex = -1
deduplicatedArray = Array(1)
For i = LBound(sourceArray) To UBound(sourceArray)
duplicateFound = False
For j = LBound(deduplicatedArray) To UBound(deduplicatedArray)
If sourceArray(i) = deduplicatedArray(j) Then
duplicateFound = True
Exit For
End If
Next j
If duplicateFound = False Then
arrayIndex = arrayIndex + 1
ReDim Preserve deduplicatedArray(arrayIndex)
deduplicatedArray(arrayIndex) = sourceArray(i)
End If
Next i
RemoveDuplicatesFromArray = deduplicatedArray
End Function
My concerns with it is that it's not grabbing the filtered data. It's grabbing all of it I believe. I'm also getting an error with the remove duplicates function.
This should do what you are looking for using a dictionary.
You could speed it up by loading the range into an array and iterating through that, but it's a bit of a pain to do that with a filtered range as well as getting the upperbound of a two dimensional array, you'll need to transpose it into a one dimensional array first. Probably not worth it unless you notice the speed is really slow. I tested with 15k rows it was < 1 second.
Dim i As Long
Dim lr As Long
Dim filterrange As Range
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
With Sheet1 'Change as needed
lr = .Cells(.Rows.Count, 1).End(xlUp).Row
Set filterrange = .Range(.Cells(1, 1), .Cells(lr, 2))
filterrange.AutoFilter 1, "One"
lr = .Cells(.Rows.Count, 1).End(xlUp).Row 'Only really necessary if you have a lot of rows
For i = 1 To lr
If .Rows(i).EntireRow.Hidden = False Then
If Not dict.exists(.Cells(i, 2).Value) Then
dict.Add .Cells(i, 2).Value, ""
End If
End If
Next i
filterrange.AutoFilter
Dim key As Variant
i = 1
For Each key In dict
.Cells(i, 3).Value = key
i = i + 1
Next key
End With
Assuming you have office 365 and no mention of VB only solution, this can achieved using worksheet functions itself
=UNIQUE(FILTER(B:B,A:A=A1))

Excel IF Statement Limited

I am using an IF statement in Excel to search for portions of text in the previous column in order to assign a supplier and category to the expense.
Supplier Column
=IF(ISNUMBER(SEARCH("tit",[#Description])),"TITAN",IF(ISNUMBER(SEARCH("Sol",[#Description])),"Soltrack",IF(ISNUMBER(SEARCH("coin",[#Description])),"Coin",IF(ISNUMBER(SEARCH("gree",[#Description])),"Green Dream Projects",IF(ISNUMBER(SEARCH("sars V",[#Description])),"SARS VAT",IF(ISNUMBER(SEARCH("sars p",[#Description])),"SARS PAYE",IF(ISNUMBER(SEARCH("acb",[#Description])),"Debit Order","")))))))
Category Column
the next column then has the following to get the category of the supplier
=IF(ISNUMBER(SEARCH("TITAN",[#Payee])),"Direct Operating Cost",IF(ISNUMBER(SEARCH("Soltrack",[#Payee])),"Direct Operating Cost",IF(ISNUMBER(SEARCH("Coin",[#Payee])),"Direct Operating Cost",IF(ISNUMBER(SEARCH("Green Dream Projects",[#Payee])),"Direct Operating Cost",IF(ISNUMBER(SEARCH("SARS VAT",[#Payee])),"VAT",IF(ISNUMBER(SEARCH("SARS PAYE",[#Payee])),"PAYE",IF(ISNUMBER(SEARCH("Debit Order",[#Payee])),"Debit Order","")))))))
this is working great, but seems i have reached the limit (7) of IF statements I can use in one formula?
I have created the below function to search for text "tit" and if it matches it updates the Payee column.
'excel if range of cells contains specific text vba
Sub MacroToCheckIfRangeOfCellsContainsSpecificText_vba()
Set Rng = Range("B2:B572") ' You can change this
specificText = "*tit*" ' You can change this
For Each Cell In Rng.Cells
If UCase(Cell.Value) Like "*" & UCase(specificText) & "*" Then
Cell.Offset(0, 1) = "Titan"
Else
Cell.Offset(0, 1) = ""
End If
Next
End Sub
Would I need to create a new specificText = "*tit*" for each of the keywords and also a whole section for each of the "For Each" functions?
Dictionary Solution
The first idea is to use a dictionary Replacements and add all the serach/replace pairs there. This has one huge disadvantage. It is against the good practice to not mix logic (code) and data. Good practice would be to put the data not into the code but into a worksheet (see next solution).
Option Explicit
Public Sub MacroToCheckIfRangeOfCellsContainsSpecificText_vba()
Dim RngToCheck As Range
Set RngToCheck = ThisWorkbook.Worksheets("Sheet1").Range("B2:B572") ' specify in which workbook and worksheet
Dim Replacements As Object
Set Replacements = CreateObject("Scripting.Dictionary")
With Replacements
.Add "tit", "Titan"
.Add "sol", "Soltrack"
'add more here
End With
Dim InputValues() As Variant
InputValues = RngToCheck.Value 'read input values into array
Dim OutputValues() As Variant 'create an output array (same size as RngToCheck)
ReDim OutputValues(1 To RngToCheck.Rows.Count, 1 To 1)
Dim iRow As Long
For iRow = 1 To UBound(OutputValues, 1)
Dim Key As Variant
For Each Key In Replacements.Keys
If UCase(InputValues(iRow, 1)) Like "*" & UCase(Key) & "*" Then
OutputValues(iRow, 1) = Replacements(Key)
Exit For 'we don't need to test for the others if we found a key
End If
Next Key
Next iRow
'write output values from array next to input values in the cells
RngToCheck.Offset(ColumnOffset:=1).Value = OutputValues
End Sub
Worksheet Solution
The better solution would be to create a new worksheet Replacements as below:
This can easily be edited by anyone and you don't need to fiddle with the code later if you want to delete or add pairs.
Public Sub ImprovedCheckUsingWorksheet()
Dim RngToCheck As Range
Set RngToCheck = ThisWorkbook.Worksheets("Sheet1").Range("B2:B572") ' specify in which workbook and worksheet
Dim Replacements() As Variant 'read replacements from worksheet
Replacements = ThisWorkbook.Worksheets("Replacements").Range("A2", ThisWorkbook.Worksheets("Replacements").Cells(Rows.Count, "B").End(xlUp)).Value 'read input values into array
Dim InputValues() As Variant
InputValues = RngToCheck.Value 'read input values into array
Dim OutputValues() As Variant 'create an output array (same size as RngToCheck)
ReDim OutputValues(1 To RngToCheck.Rows.Count, 1 To 1)
Dim iRow As Long
For iRow = 1 To UBound(OutputValues, 1)
Dim rRow As Long
For rRow = 1 To UBound(Replacements, 1)
If UCase(InputValues(iRow, 1)) Like "*" & UCase(Replacements(rRow, 1)) & "*" Then
OutputValues(iRow, 1) = Replacements(rRow, 2)
Exit For 'we don't need to test for the others if we found a key
End If
Next rRow
Next iRow
'write output values from array next to input values in the cells
RngToCheck.Offset(ColumnOffset:=1).Value = OutputValues
End Sub
For a 3ʳᵈ column in your replacements worksheet you would need to adjust the following line to be until column "C":
Replacements = ThisWorkbook.Worksheets("Replacements").Range("A2", ThisWorkbook.Worksheets("Replacements").Cells(Rows.Count, "C").End(xlUp)).Value 'read input values into array
and the output values need another column too (second parameter needs to go 1 To 2):
ReDim OutputValues(1 To RngToCheck.Rows.Count, 1 To UBound(Replacements, 2) - 1) 'this works for any amount of columns as it reads the column count from the `Replacements`
the data needs to be written
OutputValues(iRow, 1) = Replacements(rRow, 2) 'first output column
OutputValues(iRow, 2) = Replacements(rRow, 3) 'second output column
and writing the output values needs to be adjusted too:
RngToCheck.Offset(ColumnOffset:=1).Resize(ColumnSize:=UBound(OutputValues, 2)).Value = OutputValues 'this works for any amount of columns as it reads the column count from `OutputValues`
Formula Solution
But if you have your data in a worksheet Replacements like above, and you don't rely an a partial match. Then you don't need VBA and can easily use a formula instead to look it up:
=IFERROR(INDEX(Replacements!B:B,MATCH(B:B,Replacements!A:A,0)),"")

What is the easiest way to take two columns of data and convert to dictionary?

I have a worksheet with data in columns A and B.
I am looking for a convenient way to take these columns and convert to dictionary where the cell in column A is the key and column B is the value, something like :
Dim dict as Dictionary
Set dict = CreateDictFromColumns("SheetName", "A", "B")
NOTE: I am already referencing the scripting dll.
You would need to loop, E.g.
Function CreateDictFromColumns(sheet As String, keyCol As String, valCol As String) As Dictionary
Set CreateDictFromColumns = New Dictionary
Dim rng As Range: Set rng = Sheets(sheet).Range(keyCol & ":" & valCol)
Dim i As Long
Dim lastCol As Long '// for non-adjacent ("A:ZZ")
lastCol = rng.Columns.Count
For i = 1 To rng.Rows.Count
If (rng(i, 1).Value = "") Then Exit Function
CreateDictFromColumns.Add rng(i, 1).Value, rng(i, lastCol).Value
Next
End Function
This breaks on the first empty key value cell.
I think it'd be best form to pass two ranges to a create dictionary function. This allows for the ranges to be completely separate, even different workbooks. It also allows for a 1D range to be mapped to a 2D range as demonstrated below.
Alternatively, you could also pass two arrays of range values. That may be cleaner for 1D ranges, but would result in slightly more code for 2D mapping. Notice that range elements can be looped through left to right top to bottom by index. You can use Application.Transpose(Range("A1:A5")) to effectively run top to bottom left to right.
Jagged Mapping
Sub Test()
RangeToDict Sheets(1).Range("A1:A5"), Sheets(2).Range("C1:E2")
End Sub
Function RangeToDict(ByVal KeyRng As Range, ByVal ValRng As Range) As Dictionary
Set RangeToDict = New Dictionary
For Each r In KeyRng
vi = vi + 1
'It may not be advisable to handle empty key values this way
'The handling of empty values and #N/A/Error values
'Depends on your exact usage
If r.Value2 <> "" Then
RangeToDict.Add r.Value2, ValRng(vi)
Debug.Print r.Value2 & ", " & ValRng(vi)
End If
Next
End Function
Side-By-Side (As Range)
If your target range is a single 2 column range side by side, you can simplify to passing a single range as shown below. Consequently, this also works for mapping every other element in a 1 dimensional range.
Sub Test()
RangeToDict2 Range("A1:B5")
End Sub
Function RangeToDict2(ByVal R As Range) As Dictionary
Set RangeToDict2 = New Dictionary
i = 1
Do Until i >= (R.Rows.Count * R.Columns.Count)
RangeToDict2.Add R(i), R(i + 1)
Debug.Print R(i) & ", " & R(i + 1)
i = i + 2
Loop
End Function
Two Columns (As Array)
Lastly, as an example of passing arrays as arguments, you could do something like the following. However, the following code will only work given the OP's specific scenario of mapping two columns. As is, it won't handle mapping rows or alternating elements.
Sub Test()
Dim Keys() As Variant: Keys = Range("E1:I1").Value2
Dim Values() As Variant: Values = Range("E3:I3").Value2
RangeToDict Keys, Values
End Sub
Function RangeToDict(Keys() As Variant, Values() As Variant) As Dictionary
Set RangeToDict = New Dictionary
For i = 1 To UBound(Keys)
RangeToDict.Add Keys(i, 1), Values(i, 1)
Debug.Print Keys(i, 1) & ", " & Values(i, 1)
Next
End Function
Use of Named Ranges
It may be convenient to used named ranges, in which case you can pass a Range as an argument likes this...
Sub Test()
RangeToDict Names("Keys").RefersToRange, Names("Values").RefersToRange
End Sub
The best approach to take, is to populate a variant array with the data from the worksheet. You can then loop through the array, assigning the elements of the first array column as the dictionary key; the elements of the second array column can then be used as the value.
The lrow function is used to find the last populated row from column A - allowing the code to create a dynamically sized array and dictionary.
To enable use of dictionaries within VBA, you will need to go to Tools -> References and then enable Microsoft Scripting Runtime.
Sub createDictionary()
Dim dict As Scripting.Dictionary
Dim arrData() As Variant
Dim i as Long
arrData = Range("A1", Cells(lrow(1), 2))
set dict = new Scripting.Dictionary
For i = LBound(arrData, 1) To UBound(arrData, 1)
dict(arrData(i, 1)) = arrData(i, 2)
Next i
End Sub
Function lrow(ByVal colNum As Long) As Long
lrow = Cells(Rows.Count, 1).End(xlUp).Row
End Function
This should do the trick :
Public Function test_leora(SheetName As String, _
KeyColumn As String, _
ValColumn As String) _
As Variant
Dim Dic, _
Val As String, _
Key As String, _
Ws As Worksheet, _
LastRow As Long
Set Ws = ThisWorkbook.Sheets(SheetName)
Set Dic = CreateObject("Scripting.Dictionary")
With Ws
LastRow = .Range(KeyColumn & .Rows.Count).End(xlUp).Row
For i = 1 To LastRow
Val = .Cells(i, ValColumn)
Key = .Cells(i, KeyColumn)
If Dic.exists(Key) Then
Else
Dic.Add Key, Val
End If
Next i
End With
test_leora = Dic
End Function

Compare the Sheet1 column A values with Sheet2 column B, if match then Sheet2.Col C=Sheet1.Col A and Sheet2.Col D=True

I want to compare the Sheet1 column A values with Sheet2 column B, if match then i want to put the Sheet1 Column A values in Sheet2 Column C.
and column D should be populated with 'True'
So i have written the below code:
Sub val()
Dim sheet1_last_rec_cnt As Long
Dim sheet2_last_rec_cnt As Long
Dim sheet1_col1_val As String
Dim cnt1 As Long
Dim cnt2 As Long
sheet1_last_rec_cnt = Sheet1.UsedRange.Rows.Count
sheet2_last_rec_cnt = Sheet2.UsedRange.Rows.Count
For cnt1 = 2 To sheet1_last_rec_cnt
sheet1_col1_val = Sheet1.Range("A" & cnt1).Value
For cnt2 = 2 To sheet2_last_rec_cnt
If sheet1_col1_val = Sheet2.Range("B" & cnt2).Value Then
Sheet2.Range("C" & cnt2).Value = sheet1_col1_val
Sheet2.Range("D" & cnt2).Value = "True"
Exit For
End If
Next
Next
End Sub
Problem is i have one millions of records in both the sheets.
if i use the above code then For loop is running (One million * One million) times. So excel is hanging like anything.
Can someone please help me to optimize the code?
For 1 million records I'm not sure Excel is the best place to be storing this data. If your code is designed to tidy up the data so that you can export it to a database then great ... if not, then, well, I fear rough seas lay ahead for you.
The code below will speed things up a touch as it only loops through each column once, and it populates a collection of unique values so that it only has to check against that instead of the whole column each time. If you sorted your rows then it could be made even quicker but I'll leave that one for you.
Public Sub RunMe()
Dim uniques As Collection
Dim sourceValues As Variant
Dim targetValues As Variant
Dim sourceItem As String
Dim targetItem As String
Dim sourceCount As Long
Dim targetCount As Long
Dim matches As Boolean
Dim output() As Variant
' Acquire the values to be compared.
With ThisWorkbook.Worksheets("Sheet1")
sourceValues = .Range(.Cells(2, "A"), .Cells(.Rows.Count, "A").End(xlUp)).Value2
End With
With ThisWorkbook.Worksheets("Sheet2")
targetValues = .Range(.Cells(2, "B"), .Cells(.Rows.Count, "B").End(xlUp)).Value2
End With
'Resize the output array to size of target values array.
ReDim output(1 To UBound(targetValues, 1), 1 To 2)
sourceCount = 1
Set uniques = New Collection
'Iterate through the target values to find a match in the source values
For targetCount = 1 To UBound(targetValues, 1)
targetItem = CStr(targetValues(targetCount, 1))
matches = Contains(uniques, targetItem)
If Not matches Then
'Continue down the source sheet to check the values.
Do While sourceCount <= UBound(sourceValues, 1)
sourceItem = CStr(sourceValues(sourceCount, 1))
sourceCount = sourceCount + 1
'Add any new values to the collection.
If Not Contains(uniques, sourceItem) Then uniques.Add True, sourceItem
'Check for a match and leave the loop if we found one.
If sourceItem = targetItem Then
matches = True
Exit Do
End If
Loop
End If
'Update the output array if there's a match.
If matches Then
output(targetCount, 1) = targetItem
output(targetCount, 2) = True
End If
Next
'Write output array to the target sheet.
ThisWorkbook.Worksheets("Sheet2").Range("C2").Resize(UBound(targetValues, 1), 2).value = output
End Sub
Private Function Contains(col As Collection, key As String) As Boolean
'Function to test if the key already exists.
Contains = False
On Error Resume Next
Contains = col(key)
On Error GoTo 0
End Function

Resources