Iterate over all rows and find the empty ones of active sheet VBA - excel

I have an Excel-worksheet with different "sections" separated by an empty row. What I want to do is to simple get the row numbers to work with them. Sadly the code is not executing the For-Loop (No failure, just not entering it) but the rowNumber variable is set properly. Did I miss something on the For-Loop?
Sub Foo()
Dim currentSheet As Worksheet
Set currentSheet = activeSheet
emptyRows = FindAllEmptyRows(currentSheet)
End Sub
Function FindAllEmptyRows(sheet As Worksheet) As Variant
Dim emptyRows() As Variant
Dim i As Long, rowNumber As Long
Dim rowCounter As Integer
rowCounter = 1
rowNumber = sheet.UsedRange.Rows.Count
For i = rowNumber To 1
If Cells(i, 1).End(xlToRight).Column = 16384 And Cells(i, 1) = "" Then
emptyRows(rowCounter) = i
rowCounter = rowCounter + 1
End If
Next
FindAllEmptyRows = emptyRows
End Function

If you want to iterate from last row to first you will need to add Step -1.
emptyRows() needs to be sized to fit the data using ReDim
.Column = 16384 should be changed to .Column = sheet.Columns.Count.
I prefer If WorksheetFunction.CountA(sheet.Rows(i)) = 0 Then
Cells needs to be qualified to sheet: sheet.Cells(i, 1)
Refactored Code
Function FindAllEmptyRows(sheet As Worksheet) As Variant
Dim emptyRows() As Variant
Dim i As Long, rowNumber As Long
Dim rowCounter As Integer
rowNumber = sheet.UsedRange.Rows.Count
For i = rowNumber To 1 Step -1
If sheet.Cells(i, 1).End(xlToRight).Column = sheet.Columns.Count And Cells(i, 1) = "" Then
If rowCounter = 0 Then
ReDim emptyRows(0)
Else
ReDim Preserve emptyRows(rowCounter)
End If
emptyRows(rowCounter) = i
rowCounter = rowCounter + 1
End If
Next
FindAllEmptyRows = emptyRows
End Function
SpecialCells
Range.SpecialCells() can be used to divide a Range into areas of cells that meet certain criteria.
MSDN - Range.SpecialCells Method (Excel)
Returns a Range object that represents all the cells that match the specified type and value
OZ Grid
One of the most beneficial Methods in Excel (in my experience) is the SpecialCells Method. When used, it returns a Range Object that represents only those type of cells we specify. For example, one can use the SpecialCells Method to return a Range Object that only contains formulae. In fact, we can, if we wish, even narrow it down further to have our Range Object (containing only formulae) to return only formulae with errors.
Examining the output of this code should give you a good ideas of how to use SpecialCells.
Sub SpecialFoo()
Dim rArea As Range, rBlanks As Range, rFormulas As Range, rConstants As Range, rUnion As Range
Dim sheet As Worksheet
Set sheet = ActiveSheet
On Error Resume Next
Set rBlanks = sheet.UsedRange.SpecialCells(xlCellTypeBlanks)
On Error GoTo 0
If Not rBlanks Is Nothing Then
For Each rArea In rBlanks.Areas
Debug.Print "rBlanks Areas: "; rArea.Address
Next
End If
On Error Resume Next
Set rFormulas = sheet.UsedRange.SpecialCells(xlCellTypeFormulas)
On Error GoTo 0
If Not rFormulas Is Nothing Then
For Each rArea In rFormulas.Areas
Debug.Print "rFormulas Areas: "; rArea.Address
Next
End If
On Error Resume Next
Set rConstants = sheet.UsedRange.SpecialCells(xlCellTypeConstants)
On Error GoTo 0
If Not rConstants Is Nothing Then
For Each rArea In rConstants.Areas
Debug.Print "rConstants Areas: "; rArea.Address
Next
End If
If Not rFormulas Is Nothing And Not rConstants Is Nothing Then
Set rFormulas = Union(rConstants, rFormulas)
For Each rArea In rFormulas.Areas
Debug.Print "rUnion Areas: "; rArea.Address
Next
End If
End Sub

you have to size emptyRows() before using it
furthermore you could use WorksheetFunction.Count() to check for any value in current row
finally
Function FindAllEmptyRows(sheet As Worksheet) As Variant
Dim emptyRows() As Variant
Dim i As Long, rowNumber As Long, rowCounter As Long
With sheet.UsedRange ' reference passed sheet UsedRange
rowNumber = .Rows.Count
ReDim emptyRows(0 To rowNumber - 1) ' dim the array to the maximum possible size
For i = rowNumber To 1 Step -1 ' step through reference range rows from the last baxkwards to the first
If WorksheetFunction.Count(.Rows(i)) = 0 Then
emptyRows(rowCounter) = i + .Rows(1).Row - 1 ' fill array in current index with current row index
rowCounter = rowCounter + 1 ' update array index
End If
Next
End With
ReDim Preserve emptyRows(0 To rowCounter) ' redim the array according to the actual number of found empty rows
FindAllEmptyRows = emptyRows
End Function
please note that:
emptyRows(rowCounter) = i + .Rows(1).Row - 1
is storing the absolute row index, i.e. the sheet row index, while
emptyRows(rowCounter) = i
would store the relative row index, i.e. the row index withing the UsedRange, which may start from a row different than row 1

Related

how to suppress efficiently all empty rows in selected columns in excel vba?

I have found some Q/A to delete rows with empty cells in a chosen column like here. My need is a bit different, the columns are selected by the user, but this is not important.
EDIT : what is important in my use case is to delete the rows where all the cells are empty for these columns i.e. the selected columns.
The following code is working, but can only process 1,000 lines per minute on my i5. In my use case, the datasheet contains several 100k lines which means hours to process. This is not acceptable. Is there a trick to perfom it quickly please?
Sub DeleteRowsOfEmptyColumn() 'sh As Worksheet, col As String)
Application.ScreenUpdating = False
Dim sh As Excel.Worksheet: Set sh = ActiveWorkbook.ActiveSheet
Dim col As Range: Set col = Selection.EntireColumn
Dim cell
Dim area As Range: Set area = Intersect(sh.UsedRange, col)
For i = area.Rows.Count To 1 Step -1 'For Each row In area.Rows
fKeep = False
For Each cell In area.Rows(i).Cells
If Not IsEmpty(cell) Then
fKeep = True
Exit For
End If
Next cell
If Not fKeep Then
sh.Rows(i).Delete 'rowsToDelete.Add i
End If
Next i
Application.ScreenUpdating = True
End Sub
Example:
Before:
After:
Delete Empty Row Ranges
This is a basic example. Your feedback regarding the efficiency is appreciated.
Option Explicit
Sub DeleteRowsOfEmptyColumn()
Application.ScreenUpdating = False
Dim ws As Worksheet: Set ws = ActiveSheet ' improve
Dim crg As Range: Set crg = Selection.EntireColumn ' Columns Range
Dim srg As Range: Set srg = Intersect(ws.UsedRange, crg) ' Source Range
Dim drg As Range ' Delete Range
Dim arg As Range ' Area Range
Dim rrg As Range ' Row Range
For Each arg In srg.Areas
For Each rrg In arg.Rows
If Application.CountA(rrg) = 0 Then
If drg Is Nothing Then
Set drg = rrg
Else
Set drg = Union(drg, rrg)
End If
End If
Next rrg
Next arg
If Not drg Is Nothing Then drg.Delete
Application.ScreenUpdating = True
MsgBox "Rows deleted.", vbInformation
End Sub
Please, try the next way. It will process selected columns or columns having at least a selected cell. It will delete entire rows of the sheet, for the cases of all selected columns empty rows. The code only selects the rows in discussion. If they are the appropriate ones, on the last code line, Select should be replaced with Delete. It should be very fast, even for larger ranges, iterating only between blank cells range:
Sub DeleteRowsOfEmptyColumnsCells()
Dim sh As Excel.Worksheet: Set sh = ActiveSheet
Dim col As Range: Set col = Selection.EntireColumn
Dim area As Range: Set area = Intersect(sh.UsedRange, col)
Dim firstCol As Long: firstCol = area.Column: Stop
Dim areaV As Range, arr, rngDel As Range, i As Long
On Error Resume Next 'only for the case of no any empty rows existence
Set areaV = area.SpecialCells(xlCellTypeBlanks) 'a range of only empty cells
On Error GoTo 0
arr = getRows(areaV) 'extract all rows and number of columns
For i = 0 To UBound(arr(0)) 'iterate between all existing rows
If Intersect(sh.rows(arr(0)(i)), areaV).cells.count = arr(1) Then
If rngDel Is Nothing Then
Set rngDel = sh.cells(arr(0)(i), firstCol)
Else
Set rngDel = Union(rngDel, sh.cells(arr(0)(i), firstCol))
End If
End If
Next i
If Not rngDel Is Nothing Then rngDel.EntireRow.Select 'if it looks OK, Select should be replaced with Delete
End Sub
Function getRows(rng As Range) As Variant
Dim A As Range, i As Long, countC As Long
Dim arrCol, arrR, k As Long, R As Long, mtchC, mtchR
ReDim arrCol(rng.cells.count): ReDim arrR(rng.cells.count)
For Each A In rng.Areas
For i = 1 To A.Columns.count
For j = 1 To A.rows.count
mtchC = Application.match(A.cells(j, i).Column, arrCol, 0)
mtchR = Application.match(A.cells(j, i).row, arrR, 0)
If IsError(mtchC) Then
arrCol(k) = A.cells(j, i).Column: k = k + 1
End If
If IsError(mtchR) Then
arrR(R) = A.cells(j, i).row: R = R + 1
End If
Next j
Next i
Next A
ReDim Preserve arrR(R - 1)
getRows = Array(arrR, k)
End Function
I am working on similar kind of project. I have chosen to read the data into an array, and then work with the data in the array which improves run time significantly. Here is a copy of the function that I have used to delete / transform the data set:
Option Explicit
Option Base 1
Public Function RemoveRowFromArray(Arr As Variant, Element As String, Col As Long) As Variant
Dim i, j, c, count As Long
Dim TempArr() As Variant
For i = LBound(Arr, 1) To UBound(Arr, 1) ' looping through the columns to get desired value
If Arr(i, Col) = Element Then
count = count + 1 ' Counting the number of Elements in array / matrix
For j = i To (UBound(Arr, 1) - 1) ' Looping from the row where Element is found
For c = LBound(Arr, 2) To UBound(Arr, 2) ' Moving all elements in row 1 row up
Arr(j, c) = Arr(j + 1, c)
Next c
Next j
End If
Next i
' Populating TempArr to delete the last rows
ReDim TempArr((UBound(Arr, 1) - count), UBound(Arr, 2))
For i = LBound(TempArr, 1) To UBound(TempArr, 1)
For j = LBound(TempArr, 2) To UBound(TempArr, 2)
TempArr(i, j) = Arr(i, j)
Next j
Next i
RemoveRowFromArray = TempArr
End Function
I tested this and seems to work perfectly. A few important matters to keep in mind
Option Base 1 - This is important, when you declare an arr in VBA it starts at Index 0, when you read the arr from a data set in Excel [arr = sheet1.Range("A:D")] then the arr starting index is 1, Option Base 1 will ensure that all arr start at Index 1.
The function variables are :
Arr - the array / matrix
Element - the string that you wish to search for (in your case it would be blank)
Col - is the column number in which Element is.

Getting the maximum value of a specific column in a 2d array [duplicate]

I use the code hereunder to calculate max values as described in this post (vba max value of group of values). The code works great but once I have more than 65k lines I get a data type mismatch when trying to pase the array:
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
Could somebody help me to slice the array in chunks. I have tried to get it working myself but without any luck.
Sub FillGroupsMax()
Dim lColumn As Long
Dim sht As Worksheet
Dim groupsArray As Variant 'array with all group infomation
Dim groupsSeen As Variant 'array with group infomation already seen
Application.ScreenUpdating = False 'stop screen updating makes vba perform better
Set sht = ThisWorkbook.Worksheets("import")
Set last = sht.Range("A:A").Find("*", Cells(1, 1), searchdirection:=xlPrevious) 'last cell with value in column A
lColumn = sht.Cells(1, Columns.Count).End(xlToLeft).Column
groupsArray = sht.Range(Cells(1, 1), Cells(last.Row, lColumn))
'collect all the information on the Sheet into an array
'Improves performance by not visiting the sheet
For dRow = 2 To last.Row 'for each of the rows skipping header
'check if group as already been seen
If inArrayValue(Cells(dRow, 1).Value, groupsSeen) > 0 Then
'if it has been seen/calculated attribute value
'Cells(dRow, 4).Value = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
groupsArray(dRow, lColumn) = inArrayValue(Cells(dRow, 1).Value, groupsSeen)
Else
'if it hasn't been seen then find max
'Cells(dRow, 4).Value = getMax(Cells(dRow, 1).Value, groupsArray)
groupsArray(dRow, lColumn) = getMax(Cells(dRow, 1).Value, groupsArray, lColumn)
'array construction from empty
If IsEmpty(groupsSeen) Then
ReDim groupsSeen(0)
'groupsSeen(0) = Array(Cells(dRow, 1).Value, Cells(dRow, 4).Value)
groupsSeen(0) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
'attribute value to array
Else
ReDim Preserve groupsSeen(0 To UBound(groupsSeen) + 1)
groupsSeen(UBound(groupsSeen)) = Array(groupsArray(dRow, 1), groupsArray(dRow, lColumn))
End If
End If
Next
sht.Range(Cells(1, lColumn), Cells(last.Row, lColumn)).Value = Application.Index(groupsArray, , lColumn)
'reactivate Screen updating
Application.ScreenUpdating = True
End Sub
Function getMax(group As String, groupsArray As Variant, lColumn As Long) As Double
'for each in array
For n = 1 To UBound(groupsArray)
'if its the same group the Max we seen so far the record
If groupsArray(n, 1) = group And groupsArray(n, lColumn - 1) > maxSoFar Then
maxSoFar = groupsArray(n, lColumn - 1)
End If
Next
'set function value
getMax = maxSoFar
End Function
Function inArrayValue(group As String, groupsSeen As Variant) As Double
'set function value
inArrayValue = 0
'if array is empty then exit
If IsEmpty(groupsSeen) Then Exit Function
'for each in array
For n = 0 To UBound(groupsSeen)
'if we find the group
If groupsSeen(n)(0) = group Then
'set function value to the Max value already seen
inArrayValue = groupsSeen(n)(1)
'exit function earlier
Exit Function
End If
Next
End Function
You can write a helper function to use instead of Application.Index
Bonus - it will be much faster than using Index (>5x)
Sub Tester()
Dim arr, arrCol
arr = Range("A2:J80000").Value
arrCol = GetColumn(arr, 5) '<< get the fifth column
Range("L2").Resize(UBound(arrCol, 1), 1).Value = arrCol
End Sub
'extract a single column from a 1-based 2-D array
Function GetColumn(arr, colNumber)
Dim arrRet, i As Long
ReDim arrRet(1 To UBound(arr, 1), 1 To 1)
For i = 1 To UBound(arr, 1)
arrRet(i, 1) = arr(i, colNumber)
Next i
GetColumn = arrRet
End Function
EDIT - since QHarr asked about timing here's a basic example
Sub Tester()
Dim arr, arrCol, t, i as long
arr = Range("A2:J80000").Value
t = Timer
For i = 1 to 100
arrCol = GetColumn(arr, 5) '<< get the fifth column
Next i
Debug.print Timer - t '<<# of seconds for execution
End Sub
Below, whilst not as tidy as could be, is a way to process an array in chunks and Index to access a column and write out to the sheet.
I populated two columns (A:B) with data. Both had 132,000 rows, populated incrementally, with values from 1 to 132,000 in each column for my test run.
You can fiddle with cutOff to get the chunk size just below the point where the fail happens.
The code below is simply to demonstrate the principle of looping in batches, upto the set cutoff in each batch, until all rows have been processed.
Option Explicit
Public Sub WriteArrayToSheet()
Dim wb As Workbook
Dim ws As Worksheet
Set wb = ThisWorkbook
Set sht = wb.Worksheets("Sheet1") 'change as appropriate
Dim myArr() 'dynamic array
myArr = sht.Range("A1").CurrentRegion.Value 'you may want a more robust method
Dim cutOff As Long 'the max value - what ever it is before error occurs
cutOff = 1000
Dim totalRows As Long 'total rows in array read in from sheet
totalRows = UBound(myArr, 1)
Dim totalArraysNeeded As Long
'Determine how many lots of cutOff chunks there are in the total number of array rows
totalArraysNeeded = Application.WorksheetFunction.Ceiling(totalRows / cutOff, 1)
Dim rotations As Long 'number of times to loop original array to handle all rows
Dim rowCountTotal As Long
Dim rowCount As Long
Dim tempArr() 'this will hold the chunk of the original array
Dim rowCounter As Long
Dim lastRow As Long
Dim nextRow As Long
Dim i As Long
Dim j As Long
Dim numRows As Long
rotations = 1
Do While rotations < totalArraysNeeded
If rotations < totalArraysNeeded - 1 Then
ReDim tempArr(1 To cutOff, 1 To UBound(myArr, 2)) 'size chunk array
numRows = cutOff
Else
numRows = totalRows - rowCountTotal
ReDim tempArr(1 To numRows, 1 To UBound(myArr, 2)) 'size chunk array
End If
For i = 1 To numRows
rowCount = 1 'rows in this chunk looped
rowCountTotal = rowCountTotal + 1 'rows in original array looped
For j = LBound(myArr, 2) To UBound(myArr, 2)
tempArr(i, j) = myArr(rowCountTotal, j)
Next j
rowCount = rowCount + 1
Next i
With sht
lastRow = .Cells(.Rows.Count, "E").End(xlUp).Row 'Column where I am writing the sliced column out to
End With
If lastRow = 1 Then
nextRow = 1
Else
nextRow = lastRow + 1
End If
sht.Range("E" & nextRow).Resize(UBound(tempArr, 1), 1) = Application.Index(tempArr, , 1) 'write out to sheet
rotations = rotations + 1
Loop
End Sub
As #Tim suggested, the best way to slice a large array is use a loop to copy the column.
Though in your case, most of the processing time is spent on computing the maximum since your code is using a nested loop.
If you want to reduce significantly the processing time, then use a dictionary:
Sub Usage
GetMaxByGroupTo _
sourceGroups := ThisWorkbook.Range("Sheet1!A2:A100"), _
sourceValues := ThisWorkbook.Range("Sheet1!B2:B100"), _
target := ThisWorkbook.Range("Sheet1!C2")
End Sub
Sub GetMaxByGroupTo(sourceGroups As Range, sourceValues As Range, target As Range)
Dim dict As Object, groups(), values(), r As Long, max
Set dict = CreateObject("Scripting.Dictionary")
groups = sourceGroups.Value2
values = sourceValues.Value2
' store the maximum value of each group in a dictionary for an efficient lookup '
For r = Lbound(groups) to Ubound(groups)
max = dict(groups(r, 1))
If VarType(max) And values(r, 1) <= max Then Else dict(groups(r, 1)) = values(r, 1)
Next
' build and copy the result array to the sheet '
For r = Lbound(groups) to Ubound(groups)
values(r, 1) = dict(groups(r, 1))
Next
target.Resize(Ubound(groups), 1).Value2 = values
End Sub

Compare two sheets and highlight unmatched rows using unique ID only

I want to match rows from two different sheets and highlight only in the first column of the unmatched row or better still copy the unmatched rows into a new sheet. The code should compare the rows of the two Sheets and color the new rows in the second sheet. Sheet2 (say Jan 2020) contains more rows than Sheet1 (Dec 2019) as its the recently updated sheet and they both contain rows of over 22k with both having unique ID as the first column.
My below code tries to highlight all the unmatching cells and takes longer time to finish. What I wish is for the code to just color the unmatched in column A (the vb.Red) only(since its the unique ID) while ignoring the rest of the column/cells (vb.Yellow) and or if possible copy the highlighted rows into a new sheet.
Sub RunCompare()
Call compareSheets("Sheet1", "Sheet2") 'compareSheets("2019-01 Database", "2019-02 Database")
End Sub
Sub compareSheets(shtSheet1 As String, shtSheet2 As String)
Dim c As Integer, j As Integer, i As Integer, mydiffs As Integer, cnt1 As Integer, cnt2 As Integer
Dim noexist As Integer
cnt2 = Worksheets("Sheet2").Cells.SpecialCells(xlCellTypeLastCell).Row
cnt1 = Worksheets("Sheet1").Cells.SpecialCells(xlCellTypeLastCell).Row
'For each cell in sheet2 that is not the same in Sheet1, color it yellow
For i = 1 To cnt2
For j = 1 To cnt1
If ActiveWorkbook.Worksheets(shtSheet2).Cells(i, 1).Value = ActiveWorkbook.Worksheets(shtSheet1).Cells(j, 1).Value Then
For c = 2 To 22
If Not ActiveWorkbook.Worksheets(shtSheet2).Cells(i, c).Value = ActiveWorkbook.Worksheets(shtSheet1).Cells(j, c).Value Then
ActiveWorkbook.Worksheets(shtSheet2).Cells(i, c).Interior.Color = vbYellow
mydiffs = mydiffs + 1
End If
Next
Exit For
End If
If j = cnt1 Then
ActiveWorkbook.Worksheets(shtSheet2).Cells(i, 1).Interior.Color = vbRed
End If
Next
Next
'Display a message box to demonstrate the differences and if there is a new entry on the second sheet
'MsgBox mydiffs & ":differences found, " & noexist & ":no exist", vbInformation
ActiveWorkbook.Sheets(shtSheet2).Select
End Sub
Let's simplify the task and do it step by step.
This is how the input in the two sheets can look like:
Then, we may consider reading these and saving them to an array:
Set rangeA = ThisWorkbook.Worksheets(1).Range("A1:Z1")
Set rangeB = ThisWorkbook.Worksheets(2).Range("A1:ZZ1")
Dim arrayA As Variant
Dim arrayB As Variant
With Application
arrayA = .Transpose(.Transpose(rangeA))
arrayB = .Transpose(.Transpose(rangeB))
End With
Looping between the data in the two arrays is quite fast in vba. The writing to the third worksheet is done only once the two values from the two arrays match:
Dim myValA As Variant
Dim myValB As Variant
Dim currentRow As Long: currentRow = 1
For Each myValA In arrayA
For Each myValB In arrayB
If myValA = myValB Then
ThisWorkbook.Worksheets(3).Cells(currentRow, 1) = myValA
currentRow = currentRow + 1
End If
Next
Next
This is the result in the third worksheet, all matching values are in a single row:
This is how the whole code looks like:
Sub CompareTwoRanges()
Dim rangeA As Range
Dim rangeB As Range
Set rangeA = ThisWorkbook.Worksheets(1).Range("A1:Z1")
Set rangeB = ThisWorkbook.Worksheets(2).Range("A1:ZZ1")
Dim arrayA As Variant
Dim arrayB As Variant
With Application
arrayA = .Transpose(.Transpose(rangeA))
arrayB = .Transpose(.Transpose(rangeB))
End With
Dim myValA As Variant
Dim myValB As Variant
Dim currentRow As Long: currentRow = 1
For Each myValA In arrayA
For Each myValB In arrayB
If myValA = myValB Then
ThisWorkbook.Worksheets(3).Cells(currentRow, 1) = myValA
currentRow = currentRow + 1
End If
Next
Next
End Sub
Note - there will be another performance bonus, if the results are written to an array and then written from the array to the worksheet. Thus the writing would happen only once. This is the change, that needs to be implemented in the code, after the array declarations:
Dim myValA As Variant
Dim myValB As Variant
Dim resultArray() As Variant
ReDim Preserve resultArray(2 ^ 20)
Dim i As Long: i = 0
For Each myValA In arrayA
For Each myValB In arrayB
If myValA = myValB Then
resultArray(i) = myValA
i = i + 1
End If
Next
Next
ReDim Preserve resultArray(i)
ThisWorkbook.Worksheets(3).Cells(1, 1).Resize(UBound(resultArray)) = Application.Transpose(resultArray)
when you get cell value, it spends time.
so, you can target Range transfer 2d Variant
Dim endRow AS Long
Dim olderRange AS Range
Dim olderVariant AS Variant
endRow = olderSheet.cells(rows.count,1).end(xlup).row
Set olderRange = olderSheet.Range(olderSheet.Cells(startRow, startCol), olderSheet.Cells(endRow, endCol))
'Transfer
olderVariant = olderRange
For currentRow = 1 to UBound(olderVariant, 1)
'Loop
'if you want change real Cell value Or interior
'add row Or Col weight
if olderVariant(currentRow, currentCol) = newerVariant(currentRow, currentCol) THen
newerSheet.Cells(currentRow+10,currentCol+10).interior.colorIndex = 3
End if
Next currentRow
In case anyone has the same kind of problem, I have found an easier way to do it. Providing your sheet2 is the comparison sheet:
Dim Ary1 As Variant, Ary2 As Variant
Dim r As Long
Ary1 = Sheets("Sheet1").UsedRange.Value2
Ary2 = Sheets("Sheet2").UsedRange.Value2
With CreateObject("scripting.dictionary")
For r = 1 To UBound(Ary1)
.Item(Ary1(r, 1)) = Empty
Next r
For r = 1 To UBound(Ary2)
If Not .Exists(Ary2(r, 1)) Then Sheets("Sheet2").Cells(r, 1).Interior.Color = vbRed
Next r
End With

Extract unique distinct list from two columns

So I am trying to create a combined list from two separate columns by omitting the duplicate items. I have searched and found a formula that combines the list this way by going through one column at a time.
But I want to combine the columns like this:
where it goes through each row first.
Is there a formula or VBA code that does that? Thank you.
EDIT: This is just a way to show my request. The color was added to show how the combined list is sorted, it is not part of the request. The actual lists are each about 500 rows long consisting of 9+ digit ID numbers.
This will put the unique words in the order you want.
Sub foo()
Dim rng As Range
Dim ws As Worksheet
Dim i&, j&, t&
Dim dict As Object
Dim iArr() As Variant
Dim oarr() As Variant
Dim itm As Variant
Set dict = CreateObject("Scripting.Dictionary")
Set ws = ActiveSheet
With ws
Set rng = .Range("A:B").Find("*", .Range("A1"), , , , xlPrevious)
If Not rng Is Nothing Then
iArr = .Range(.Cells(2, 1), .Cells(rng.Row, 2)).Value
For i = LBound(iArr, 1) To UBound(iArr, 1)
For j = LBound(iArr, 2) To UBound(iArr, 2)
If iArr(i, j) <> "" Then
On Error Resume Next
dict.Add iArr(i, j), iArr(i, j)
On Error GoTo 0
End If
Next j
Next i
End If
'If your dataset is not that large <30,000, then you can use it directly with transpose
.Range("C2").Resize(dict.Count) = Application.Transpose(dict.items)
'If your data is large then you will want to put it in a one dimensional array first
'just uncomment the below and comment the one line above
' ReDim oarr(1 To dict.Count, 1 To 1)
' t = 1
' For Each itm In dict.keys
' oarr(t, 1) = dict(itm)
' t = t + 1
' Next itm
' Range("C2").Resize(dict.Count) = oarr
End With
End Sub
UDF solution. Using your provided sample data, put this formula in cell I2 and copy down =UnqList(ROW(I1),$G$2:$H$6) or =UnqList(ROW(I1),$G$2:$G$6,$H$2:$H$6) (it can be either because the two or more lists might not be next to each other and the UDF accounts for that)
Public Function UnqList(ByVal lIndex As Long, ParamArray rLists() As Variant) As Variant
Dim i As Long, j As Long
Dim vList As Variant
Dim cUnq As Collection
Dim lMaxRow As Long, lMaxCol As Long
If lIndex <= 0 Then
UnqList = CVErr(xlErrRef)
Exit Function
End If
For Each vList In rLists
If TypeName(vList) <> "Range" Then
UnqList = CVErr(xlErrRef)
Exit Function
Else
If vList.Rows.Count > lMaxRow Then lMaxRow = vList.Rows.Count
If vList.Columns.Count > lMaxCol Then lMaxCol = vList.Columns.Count
End If
Next vList
Set cUnq = New Collection
For i = 1 To lMaxRow
For j = 1 To lMaxCol
For Each vList In rLists
If i <= vList.Rows.Count And j <= vList.Columns.Count Then
On Error Resume Next
cUnq.Add vList.Cells(i, j).Value, CStr(vList.Cells(i, j).Value)
On Error GoTo 0
If lIndex = cUnq.Count Then
UnqList = cUnq(cUnq.Count)
Set cUnq = Nothing
Exit Function
End If
End If
Next vList
Next j
Next i
UnqList = CVErr(xlErrRef)
Set cUnq = Nothing
End Function
You can use my Duplicate Master addin available via my profile.
Advantages are that the addin provides options to
ignore capitilisation
ignore whitespace
run RegExp replacements (advanced)
further options for deletinf, highlighting, selecting duplicates etc

Populate unique values into a VBA array from Excel

Can anyone give me VBA code that will take a range (row or column) from an Excel sheet and populate a list/array with the unique values,
i.e.:
table
table
chair
table
stool
stool
stool
chair
when the macro runs would create an array some thing like:
fur[0]=table
fur[1]=chair
fur[2]=stool
Sub GetUniqueAndCount()
Dim d As Object, c As Range, k, tmp As String
Set d = CreateObject("scripting.dictionary")
For Each c In Selection
tmp = Trim(c.Value)
If Len(tmp) > 0 Then d(tmp) = d(tmp) + 1
Next c
For Each k In d.keys
Debug.Print k, d(k)
Next k
End Sub
In this situation I always use code like this (just make sure delimeter you've chosen is not a part of search range)
Dim tmp As String
Dim arr() As String
If Not Selection Is Nothing Then
For Each cell In Selection
If (cell <> "") And (InStr(tmp, cell) = 0) Then
tmp = tmp & cell & "|"
End If
Next cell
End If
If Len(tmp) > 0 Then tmp = Left(tmp, Len(tmp) - 1)
arr = Split(tmp, "|")
Combining the Dictionary approach from Tim with the variant array from Jean_Francois below.
The array you want is in objDict.keys
Sub A_Unique_B()
Dim X
Dim objDict As Object
Dim lngRow As Long
Set objDict = CreateObject("Scripting.Dictionary")
X = Application.Transpose(Range([a1], Cells(Rows.Count, "A").End(xlUp)))
For lngRow = 1 To UBound(X, 1)
objDict(X(lngRow)) = 1
Next
Range("B1:B" & objDict.Count) = Application.Transpose(objDict.keys)
End Sub
This is the old-school way of doing it.
It will execute faster than looping through cells (e.g. For Each cell In Selection) and will be reliable no matter what, as long you have a rectangular selection (i.e. not Ctrl-selecting a bunch of random cells).
Sub FindUnique()
Dim varIn As Variant
Dim varUnique As Variant
Dim iInCol As Long
Dim iInRow As Long
Dim iUnique As Long
Dim nUnique As Long
Dim isUnique As Boolean
varIn = Selection
ReDim varUnique(1 To UBound(varIn, 1) * UBound(varIn, 2))
nUnique = 0
For iInRow = LBound(varIn, 1) To UBound(varIn, 1)
For iInCol = LBound(varIn, 2) To UBound(varIn, 2)
isUnique = True
For iUnique = 1 To nUnique
If varIn(iInRow, iInCol) = varUnique(iUnique) Then
isUnique = False
Exit For
End If
Next iUnique
If isUnique = True Then
nUnique = nUnique + 1
varUnique(nUnique) = varIn(iInRow, iInCol)
End If
Next iInCol
Next iInRow
'// varUnique now contains only the unique values.
'// Trim off the empty elements:
ReDim Preserve varUnique(1 To nUnique)
End Sub
Profiting from the MS Excel 365 function UNIQUE()
In order to enrich the valid solutions above:
Sub ExampleCall()
Dim rng As Range: Set rng = Sheet1.Range("A2:A11") ' << change to your sheet's Code(Name)
Dim a: a = rng
a = getUniques(a)
arrInfo a
End Sub
Function getUniques(a, Optional ZeroBased As Boolean = True)
Dim tmp: tmp = Application.Transpose(WorksheetFunction.Unique(a))
If ZeroBased Then ReDim Preserve tmp(0 To UBound(tmp) - 1)
getUniques = tmp
End Function
OK I did it finally:
Sub CountUniqueRecords()
Dim Array() as variant, UniqueArray() as variant, UniqueNo as Integer,
Dim i as integer, j as integer, k as integer
Redim UnquiArray(1)
k= Upbound(array)
For i = 1 To k
For j = 1 To UniqueNo + 1
If Array(i) = UniqueArray(j) Then GoTo Nx
Next j
UniqueNo = UniqueNo + 1
ReDim Preserve UniqueArray(UniqueNo + 1)
UniqueArray(UniqueNo) = Array(i)
Nx:
Next i
MsgBox UniqueNo
End Sub
one more way ...
Sub get_unique()
Dim unique_string As String
lr = Sheets("data").Cells(Sheets("data").Rows.Count, 1).End(xlUp).Row
Set range1 = Sheets("data").Range("A2:A" & lr)
For Each cel In range1
If Not InStr(output, cel.Value) > 0 Then
unique_string = unique_string & cel.Value & ","
End If
Next
End Sub
This VBA function returns an array of distinct values when passed either a range or a 2D array source
It defaults to processing the first column of the source, but you can optionally choose another column.
I wrote a LinkedIn article about it.
Function DistinctVals(a, Optional col = 1)
Dim i&, v: v = a
With CreateObject("Scripting.Dictionary")
For i = 1 To UBound(v): .Item(v(i, col)) = 1: Next
DistinctVals = Application.Transpose(.Keys)
End With
End Function
The old school method was my favourite option. Thank you. And it was indeed fast. But I didn't use redim. Here though is my real world example where I accumulate values for each unique "key" found in a column and move it into a array (say for an employee and values are hours worked per day). Then I put each key with its final values into a totals area on the active sheet. I've commented extensively for anyone who wants painful detail on what is happening here. Limited error checking is done by this code.
Sub GetActualTotals()
'
' GetActualTotals Macro
'
' This macro accumulates values for each unique employee from the active
' spreadsheet.
'
' History
' October 2016 - Version 1
'
' Invocation
' I created a button labeled "Get Totals" on the Active Sheet that invokes
' this macro.
'
Dim ResourceName As String
Dim TotalHours As Double
Dim TotalPercent As Double
Dim IsUnique As Boolean
Dim FirstRow, LastRow, LastColumn, LastResource, nUnique As Long
Dim CurResource, CurrentRow, i, j As Integer
Dim Resource(1000, 2) As Variant
Dim Rng, r As Range
'
' INITIALIZATIONS
'
' These are index numbers for the Resource array
'
Const RName = 0
Const TotHours = 1
Const TotPercent = 2
'
' Set the maximum number of resources we'll
' process.
'
Const ResourceLimit = 1000
'
' We are counting on there being no unintended data
' in the spreadsheet.
'
' It won't matter if the cells are empty though. It just
' may take longer to run the macro.
' But if there is data where this macro does not expect it,
' assume unpredictable results.
'
' There are some hardcoded values used.
' This macro just happens to expect the names to be in Column C (or 3).
'
' Get the last row in the spreadsheet:
'
LastRow = Cells.Find(What:="*", _
After:=Range("C1"), _
LookAt:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row
'
' Furthermore, this macro banks on the first actual name to be in C6.
' so if the last row is row 65, the range we'll work with
' will evaluate to "C6:C65"
'
FirstRow = 6
Rng = "C" & FirstRow & ":C" & LastRow
Set r = Range(Rng)
'
' Initialize the resource array to be empty (even though we don't really
' need to but I'm old school).
'
For CurResource = 0 To ResourceLimit
Resource(CurResource, RName) = ""
Resource(CurResource, TotHours) = 0
Resource(CurResource, TotPercent) = 0
Next CurResource
'
' Start the resource counter at 0. The counter will represent the number of
' unique entries.
'
nUnique = 0
'
' LET'S GO
'
' Loop from the first relative row and the last relative row
' to process all the cells in the spreadsheet we are interested in
'
For i = 1 To LastRow - FirstRow
'
' Loop here for all unique entries. For any
' new unique entry, that array element will be
' initialized in the second if statement.
'
IsUnique = True
For j = 1 To nUnique
'
' If the current row element has a resource name and is already
' in the resource array, then accumulate the totals for that
' Resource Name. We then have to set IsUnique to false and
' exit the for loop to make sure we don't populate
' a new array element in the next if statement.
'
If r.Cells(i, 1).Value = Resource(j, RName) Then
IsUnique = False
Resource(j, TotHours) = Resource(j, TotHours) + _
r.Cells(i, 4).Value
Resource(j, TotPercent) = Resource(j, TotPercent) + _
r.Cells(i,5).Value
Exit For
End If
Next j
'
' If the resource name is unique then copy the initial
' values we find into the next resource array element.
' I ignore any null cells. (If the cell has a blank you might
' want to add a Trim to the cell). Not much error checking for
' the numerical values either.
'
If ((IsUnique) And (r.Cells(i, 1).Value <> "")) Then
nUnique = nUnique + 1
Resource(nUnique, RName) = r.Cells(i, 1).Value
Resource(nUnique, TotHours) = Resource(nUnique, TotHours) + _
r.Cells(i, 4).Value
Resource(nUnique, TotPercent) = Resource(nUnique, TotPercent) + _
r.Cells(i, 5).Value
End If
Next i
'
' Done processing all rows
'
' (For readability) Set the last resource counter to the last value of
' nUnique.
' Set the current row to the first relative row in the range (r=the range).
'
LastResource = nUnique
CurrentRow = 1
'
' Populate the destination cells with the accumulated values for
' each unique resource name.
'
For CurResource = 1 To LastResource
r.Cells(CurrentRow, 7).Value = Resource(CurResource, RName)
r.Cells(CurrentRow, 8).Value = Resource(CurResource, TotHours)
r.Cells(CurrentRow, 9).Value = Resource(CurResource, TotPercent)
CurrentRow = CurrentRow + 1
Next CurResource
End Sub
The VBA script below looks for all unique values from cell B5 all the way down to the very last cell in column B… $B$1048576. Once it is found, they are stored in the array (objDict).
Private Const SHT_MASTER = “MASTER”
Private Const SHT_INST_INDEX = “InstrumentIndex”
Sub UniqueList()
Dim Xyber
Dim objDict As Object
Dim lngRow As Long
Sheets(SHT_MASTER).Activate
Xyber = Application.Transpose(Sheets(SHT_MASTER).Range([b5], Cells(Rows.count, “B”).End(xlUp)))
Sheets(SHT_INST_INDEX).Activate
Set objDict = CreateObject(“Scripting.Dictionary”)
For lngRow = 1 To UBound(Xyber, 1)
If Len(Xyber(lngRow)) > 0 Then objDict(Xyber(lngRow)) = 1
Next
Sheets(SHT_INST_INDEX).Range(“B1:B” & objDict.count) = Application.Transpose(objDict.keys)
End Sub
I have tested and documented with some screenshots of the this solution. Here is the link where you can find it....
http://xybernetics.com/techtalk/excelvba-getarrayofuniquevaluesfromspecificcolumn/
If you don't mind using the Variant data type, then you can use the in-built worksheet function Unique as shown.
sub unique_results_to_array()
dim rng_data as Range
set rng_data = activesheet.range("A1:A10") 'enter the range of data here
dim my_arr() as Variant
my_arr = WorksheetFunction.Unique(rng_data)
first_val = my_arr(1,1)
second_val = my_arr(2,1)
third_val = my_arr(3,1) 'etc...
end sub
If you are not interested in the count function, then you could simplify the dictionary approach by using empty quotes for the dictionary value instead of the counter. The following code assumes the first cell containing data is "A1". Alternatively, you could use the Selection (though I understand that is generally frowned upon) or the sheet's UsedRange attribute as your range.
Both of the following examples assume that you want to omit blank values from your array of unique values.
Note that to utilize dictionary objects as follows, you must have the Microsoft Scripting Runtime library active in your references. Also note that by declaring dict as a New Dictionary instead of a Dictionary in the beginning, you can forgo the step of setting it equal to a Scripting Dictionary later. Also, dictionary keys must be unique, and this method does not result in errors when setting the value corresponding to a given dictionary key, so there is no risk of having unique keys.
Sub GetUniqueValuesInRange()
Dim cll As Range
Dim rng As Range
Dim dict As New Dictionary
Dim vArray As Variant
Set rng = Range("A1").CurrentRegion.Columns(1)
For Each cll In rng.Cells
If Len(cll.Value) > 0 Then
dict(cll.Value) = ""
End If
Next cll
vArray = dict.Keys
End Sub
The prior example is a slower method, as it is generally preferred to move the values into an array in the beginning, so that all calculations can be performed in the memory. The following should work faster for larger data sets:
Sub GetUniqueValuesInRange2()
Dim vFullArray As Variant
Dim var As Variant
Dim dict As New Dictionary
Dim vUniqueArray As Variant
vFullArray = Range("A1").CurrentRegion.Columns(1).Value
For Each var In vFullArray
If Len(var) > 0 Then
dict(var) = ""
End If
Next var
vUniqueArray = dict.Keys
End Sub

Resources