Fastest way to merge duplicate cells in Excel without looping - excel

I have cells containing duplicate values that I want to merge quickly. The table looks like this:
Sub MergeCells()
Application.DisplayAlerts = False
Dim n As Name
Dim fc As FormatCondition
Dim Rng As Range, R As Range
Dim lRow As Long
Dim I&, J&
Dim arr As Variant
ReDim arr(1 To 1) As Variant
With ThisWorkbook.Sheets("tst")
Set Rng = .Range("A2:D11")
lRow = Rng.End(xlDown).Row
For J = 1 To 4
For I = lRow To 2 Step -1 'last row to 2nd row
If Trim(UCase(.Cells(I, J))) = Trim(UCase(.Cells(I - 1, J))) Then
Set R = .Range(.Cells(I, J), .Cells(I - 1, J))
arr(UBound(arr)) = R.Address
ReDim Preserve arr(1 To UBound(arr) + 1)
End If
Next I
Next J
ReDim Preserve arr(1 To UBound(arr) - 1)
Set R = .Range(Join(arr, ","))
'MsgBox R.Areas.Count
'R.Select
'R.MergeCells = True
With R
.Merge
.HorizontalAlignment = xlCenter
.VerticalAlignment = xlCenter
End With
Stop
End With
Application.DisplayAlerts = True
End Sub
The duplicate cell ranges could be disjoint or non-adjacent. I want a way to quickly identify such duplicate ranges and merge them without using a For loop. [I don't know for sure, but I think there could be a faster, more innovative way without loops, probably using some combination of Excel array formulae and VBA code, to select and merge duplicate cell ranges.]
BTW the above code works fine until it raises the following error at the line .Merge.
EDIT
This is a snapshot of the Watch window showing the arr content as well as R.Address.
OUTPUT:
Don't need any selections, this is just for demonstration purpose:
Output should look like this:
EDIT...
Suppose the duplicate values were the same across the rows? Then only duplicate values within a column should be merged. There has to be a quick, innovative way to do this merge.
Final Output Image:

The issue is that your code only ever pairs 2 adjacent cells and never looks for a third one, because of this line: Set R = .Range(.Cells(I, J), .Cells(I - 1, J))
After the first loop iteration it adds these 2 cells.
After the next iteration it adds the next 2 cells.
This results in overlapping ranges, which you can see in the darker shading of the selection.
I edited part of your code and added comments so you can see how it could be done. I'm sure there is still room for improvement.
Sub MergeCellsNew()
    Application.DisplayAlerts = False
    Dim Rng As Range, R As Range
    Dim lRow As Long
    Dim I&, J&
    Dim arr As Variant
    ReDim arr(1 To 1) As Variant
    With ThisWorkbook.Sheets("tst")
        Set Rng = .Range("A2:D11")
        lRow = Rng.End(xlDown).Row
        For J = 1 To 4
            I = 2 'I = Rng.Row to automatically start at the first row of Rng
            Do While I <= lRow
                Set R = .Cells(I, J) 'remember start cell
                'run this loop as long as duplicates found next to the start cell
                Do While Trim(UCase(.Cells(I, J))) = Trim(UCase(.Cells(I + 1, J)))
                    Set R = R.Resize(R.Rows.Count + 1) 'and resize R + 1
                    I = I + 1
                Loop
                'now if R is bigger than one cell there are duplicates we want to add to the arr
                'this way single cells are not added to the arr
                If R.Rows.Count > 1 Then
                    arr(UBound(arr)) = R.Address
                    ReDim Preserve arr(1 To UBound(arr) + 1)
                End If
                I = I + 1
            Loop
        Next J
        'only merge if at least one run of duplicates was found
        If UBound(arr) > 1 Then
            ReDim Preserve arr(1 To UBound(arr) - 1) 'drop the unused last slot
            Set R = .Range(Join(arr, ","))
            With R
                .Merge
                .HorizontalAlignment = xlCenter
                .VerticalAlignment = xlCenter
            End With
        End If
    End With
    Application.DisplayAlerts = True
End Sub
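As a possible variation on the answer above (not the answerer's code), the run detection can be done in memory on an array read from the sheet once, which keeps cell reads to a minimum. The sketch below is only an illustration: DuplicateRuns and MergeRunsInColumn are made-up names, and the sheet "tst" and range A2:A11 are taken from the question. Each run is merged on its own, so no long address string has to be built.

```vba
Option Explicit

' Hypothetical helper: vals is a one-column 2D array (as returned by
' Range.Value). Returns a Collection of Array(firstIndex, lastIndex),
' one entry per run of two or more equal adjacent values.
Function DuplicateRuns(vals As Variant) As Collection
    Dim runs As New Collection
    Dim i As Long, startIdx As Long

    i = LBound(vals, 1)
    Do While i <= UBound(vals, 1)
        startIdx = i
        ' Extend the run while the next value matches (trimmed, case-insensitive).
        Do While i < UBound(vals, 1)
            If Trim(UCase(vals(i, 1))) <> Trim(UCase(vals(i + 1, 1))) Then Exit Do
            i = i + 1
        Loop
        If i > startIdx Then runs.Add Array(startIdx, i)
        i = i + 1
    Loop
    Set DuplicateRuns = runs
End Function

' Usage sketch for column A of the question's sheet (assumed layout).
Sub MergeRunsInColumn()
    Dim ws As Worksheet, vals As Variant, run As Variant

    Set ws = ThisWorkbook.Sheets("tst")
    vals = ws.Range("A2:A11").Value
    Application.DisplayAlerts = False
    For Each run In DuplicateRuns(vals)
        ' Array index 1 corresponds to sheet row 2, hence the + 1 offset.
        With ws.Range(ws.Cells(run(0) + 1, 1), ws.Cells(run(1) + 1, 1))
            .Merge
            .HorizontalAlignment = xlCenter
            .VerticalAlignment = xlCenter
        End With
    Next run
    Application.DisplayAlerts = True
End Sub
```

This still loops, but only over values already held in memory; the sheet is touched once per run rather than once per cell.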

Related

Remove rows from a 2d array if value in column is empty

I have a large table of lab measurement logs, which I work with using arrays.
(I'm a chemist and lab technician, and I only started learning VBA last week, so please bear with me.)
I'm trying to figure out how to load the table into an array and then remove rows with an empty value in the 5th column, so that I can "export" the table without blanks in the 5th column via an array into a different sheet.
I first tested this with some code I found for a 1D array, where I would make 2 arrays: a placeholder array which I'd loop through, adding only non-blanks to a second array.
For Counter = LBound(TestArr) To UBound(TestArr)
If TestArr(Counter, 1) <> "" Then
NoBlankSize = NoBlankSize + 1
NoBlanksArr(UBound(NoBlanksArr)) = TestArr(Counter, 1)
ReDim Preserve NoBlanksArr(0 To UBound(NoBlanksArr) + 1)
End If
Next Counter
It works in 1D, but I can't seem to get it to work with 2 dimensions.
Here's the code I'm using for reading and outputting the data:
Sub ArrayTest()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim TestArray() As Variant
Dim Dimension1 As Long, Dimension2 As Long
Sheets("Tracker").Activate
Dimension1 = Range("A3", Range("A2").End(xlDown)).Cells.Count - 1
Dimension2 = Range("A2", Range("A2").End(xlToRight)).Cells.Count - 1
ReDim TestArray(0 To Dimension1, 0 To Dimension2)
'load into array
For Dimension1 = LBound(TestArray, 1) To UBound(TestArray, 1)
For Dimension2 = LBound(TestArray, 2) To UBound(TestArray, 2)
TestArray(Dimension1, Dimension2) = Range("A4").Offset(Dimension1, Dimension2).Value
Next Dimension2
Next Dimension1
Sheets("Output").Activate
ActiveSheet.Range("A2").Select
'read from array
For Dimension1 = LBound(TestArray, 1) To UBound(TestArray, 1)
For Dimension2 = LBound(TestArray, 2) To UBound(TestArray, 2)
ActiveCell.Offset(Dimension1, Dimension2).Value = TestArray(Dimension1, Dimension2)
Next Dimension2
Next Dimension1
Erase TestArray
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
Thank you for any help in advance.
ReDim Preserve can only resize the last dimension of a multi-dimensional array, so in a two-dimensional array it cannot change the number of records (rows).
You could load the range into an array, and then when you want to export the array to another range, loop through that array while skipping blank records.
An example:
Option Explicit
Sub ArrayTest()
Dim wb As Workbook, wsInput As Worksheet, wsOutput As Worksheet
Dim myArr As Variant
Dim i As Long, k As Long, LRow As Long
Set wb = ThisWorkbook
Set wsInput = wb.Sheets("Tracker")
Set wsOutput = wb.Sheets("Output")
LRow = wsOutput.Cells(wsOutput.Rows.Count, "A").End(xlUp).Row + 1
'Load a range into the array (example range)
myArr = wsInput.Range("A1:Z100")
'Fill another range with the array
For i = LBound(myArr) To UBound(myArr)
'Check if the first field of the current record is empty
If Not Len(myArr(i, 1)) = 0 Then
'Loop through the record and fill the row
For k = LBound(myArr, 2) To UBound(myArr, 2)
wsOutput.Cells(LRow, k) = myArr(i, k)
Next k
LRow = LRow + 1
End If
Next i
End Sub
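If the filtered data is needed as an array rather than written cell by cell, one way around the ReDim Preserve limitation mentioned above is to count the rows to keep first and then size the result array once. The following is a minimal sketch under that assumption; FilterRows is a hypothetical helper name, not something from the question or the answer.

```vba
Option Explicit

' Hypothetical helper: returns a new 2D array holding only the rows of src
' whose value in column keyCol is not blank. Returns Empty if no row qualifies.
Function FilterRows(src As Variant, keyCol As Long) As Variant
    Dim i As Long, j As Long, n As Long, outRow As Long
    Dim result() As Variant

    ' Pass 1: count the rows to keep so the result can be sized once.
    For i = LBound(src, 1) To UBound(src, 1)
        If Len(src(i, keyCol)) > 0 Then n = n + 1
    Next i
    If n = 0 Then Exit Function

    ReDim result(1 To n, LBound(src, 2) To UBound(src, 2))

    ' Pass 2: copy the kept rows into consecutive rows of the result.
    For i = LBound(src, 1) To UBound(src, 1)
        If Len(src(i, keyCol)) > 0 Then
            outRow = outRow + 1
            For j = LBound(src, 2) To UBound(src, 2)
                result(outRow, j) = src(i, j)
            Next j
        End If
    Next i
    FilterRows = result
End Function
```

A caller would check IsEmpty on the return value and then write the whole array in one assignment, for example to wsOutput.Range("A2").Resize(UBound(kept, 1), UBound(kept, 2)).Value when the source array came from Range.Value (and is therefore 1-based).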
From your code, it appears you want to:
- test a column of data on a worksheet to see if there are blanks
- if there are blanks in that column, exclude those rows
- copy the data, minus the excluded rows, to a new area
You can probably do that more easily (and more quickly) with a filter. The code below checks for blanks in column 2:
Option Explicit
Sub removeCol2BlankRows()
Dim wsSrc As Worksheet, wsRes As Worksheet
Dim rSrc As Range, rRes As Range
Set wsSrc = ThisWorkbook.Worksheets("sheet1")
Set rSrc = wsSrc.Cells(1, 1).CurrentRegion 'many ways to do this
Set wsRes = ThisWorkbook.Worksheets("sheet1")
Set rRes = wsRes.Cells(1, 10)
If wsSrc.AutoFilterMode = True Then wsSrc.AutoFilterMode = False
rSrc.AutoFilter field:=2, Criteria1:="<>"
rSrc.SpecialCells(xlCellTypeVisible).Copy rRes
wsRes.AutoFilterMode = False
End Sub
If you really just want to filter the VBA arrays in code, I'd store the non-blank rows in a dictionary, and then write it back to the new array:
Option Explicit
Sub removeCol2BlankRows()
Dim testArr As Variant
Dim noBlanksArr As Variant
Dim myDict As Object
Dim I As Long, J As Long, V
Dim rwData(1 To 4) As Variant
With ThisWorkbook.Worksheets("sheet1")
testArr = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=4)
End With
Set myDict = CreateObject("Scripting.Dictionary")
For I = 1 To UBound(testArr, 1)
If testArr(I, 2) <> "" Then
For J = 1 To UBound(testArr, 2)
rwData(J) = testArr(I, J)
Next J
myDict.Add Key:=I, Item:=rwData
End If
Next I
ReDim noBlanksArr(1 To myDict.Count, 1 To 4)
I = 0
For Each V In myDict.keys
I = I + 1
For J = 1 To 4
noBlanksArr(I, J) = myDict(V)(J)
Next J
Next V
End Sub

How to Make Cells "Blank," not "Empty" in VBA

I am using LabVIEW to generate an Excel report that essentially pastes an array into the spreadsheet. There are gaps in the spreadsheet, for example:
1
2
3
1
2
3
But because I am inserting an array into the spreadsheet, the gaps are empty, but they aren't blank.
When I run VBA code checking each cell using "IsEmpty," it returns true. But if I run an Excel formula using "ISBLANK," it returns false. I have tried the following, but it doesn't make the cell blank.
If IsEmpty(Cells(r,c)) Then
Cells(r,c).Value = ""
Cells(r,c).ClearContents
Cells(r,c) = ""
I want to make the cells blank without having to delete them. This is because I'm trying to use .End in my VBA code, but it doesn't stop at the gaps.
You don't need to check IsEmpty(), instead:
If Cells(r, c).Value = "" Then Cells(r, c).ClearContents
This will remove Nulls. By Nulls, I mean zero-length Strings.
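To make the comparison explicit, here is a small sketch of a test that treats Empty, zero-length strings, and whitespace-only strings alike. IsBlankLike is a made-up name for illustration, and counting whitespace-only values as blank is an assumption that may not suit every sheet.

```vba
Option Explicit

' Hypothetical helper: True when a cell value should be treated as a gap,
' i.e. it is Empty, a zero-length string, or whitespace only.
Function IsBlankLike(v As Variant) As Boolean
    If IsError(v) Then Exit Function   ' error values are not blank
    If IsNull(v) Then Exit Function    ' CStr(Null) would raise an error
    IsBlankLike = (Len(Trim(CStr(v))) = 0)
End Function
```

It could be used as If IsBlankLike(Cells(r, c).Value) Then Cells(r, c).ClearContents, clearing only the cell being tested.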
This might be overkill, but it'll work for you:
Sub tgr()
Dim ws As Worksheet
Dim rClear As Range
Dim aData As Variant
Dim lRowStart As Long
Dim lColStart As Long
Dim i As Long, j As Long
Set ws = ActiveWorkbook.ActiveSheet
With ws.UsedRange
If .Cells.Count = 1 Then
ReDim aData(1 To 1, 1 To 1)
aData(1, 1) = .Value
Else
aData = .Value
End If
lRowStart = .Row
lColStart = .Column
End With
For i = LBound(aData, 1) To UBound(aData, 1)
For j = LBound(aData, 2) To UBound(aData, 2)
If Len(Trim(aData(i, j))) = 0 Then
If rClear Is Nothing Then
Set rClear = ws.Cells(lRowStart + i - 1, lColStart + j - 1)
Else
Set rClear = Union(rClear, ws.Cells(lRowStart + i - 1, lColStart + j - 1))
End If
End If
Next j
Next i
If Not rClear Is Nothing Then rClear.ClearContents
End Sub

Delete rows sub very slow to process

I built a macro in Excel that stores input from multiple input tabs into a database (table format). As part of the macro I included a Sub to delete any previous entries for a given year (CYear) before writing new entries for that year.
This was working fine until the size of the workbook increased to about 10MB. The following part of the code now takes >1 hour to run. Is there any other method which might be faster?
Application.ScreenUpdating = False and Application.Calculation = xlCalculationManual are already set in the larger Sub. r will reach several thousand rows.
Dim r As Long
Sheets("Database").Activate
For r = ActiveSheet.UsedRange.Rows.Count To 1 Step -1
If Cells(r, "G") = Range("C5") Then
ActiveSheet.Rows(r).EntireRow.Delete
End If
Next
Deleting something in a Worksheet is a rather slow operation, and depending on how many rows you want to delete (and it seems to be a lot), you should collect everything that should be deleted in a Range-Variable and delete it all at once.
One additional aspect is that UsedRange is not always reliable, and if you are unlucky, the macro checks everything from the very last possible row (=1048576), which could also be an issue. The construct .Cells(.Rows.Count, "G").End(xlUp).row will get the row number of the last used row in Col 'G'.
Try the following code
Sub del()
Dim r As Long
Dim deleteRange As Range
Set deleteRange = Nothing
With ThisWorkbook.Sheets(1)
For r = .Cells(.Rows.Count, "G").End(xlUp).row To 1 Step -1
If .Cells(r, "G") = .Range("C5") Then
If deleteRange Is Nothing Then
Set deleteRange = .Cells(r, "G")
Else
Set deleteRange = Union(deleteRange, .Cells(r, "G"))
End If
End If
Next
End With
If Not deleteRange Is Nothing Then
deleteRange.EntireRow.Delete
End If
End Sub
Hey bob, I found that when you work with thousands or hundreds of thousands of rows, you may want to try arrays. They are far faster than doing the same work directly on the sheet.
Try this:
Sub DeleteRows()
Dim arr, arr1, yeartocheck As Integer, yearchecked As Integer, ws As Worksheet, i As Long, j As Long, x As Long
Set ws = ThisWorkbook.Sheets("DataBase")
yeartocheck = ws.Range("C5")
arr = ws.UsedRange.Value 'the whole sheet allocated on memory
ReDim arr1(1 To UBound(arr), 1 To UBound(arr, 2)) 'lets define another array as big as the first one
For i = 1 To UBound(arr1, 2) 'headers for the final array
arr1(1, i) = arr(1, i)
Next i
x = 2 'here starts the data on the final array (1 is for the headers)
For i = 2 To UBound(arr) 'loop the first array looking to match your condition
yearchecked = arr(i, 7)
If yearchecked <> yeartocheck Then 'if they don't match, the macro will store that row on the final array
For j = 1 To UBound(arr, 2)
arr1(x, j) = arr(i, j)
Next j
x = x + 1 'if we store a new row, we need to up the x
End If
Next i
With ws
.UsedRange.ClearContents 'clear what you have
.Range("A1", .Cells(UBound(arr1), UBound(arr, 2))).Value = arr1 'fill the sheet with all the data without the CYear
End With
End Sub

Countif on sheet with 700k rows freezes program

I currently have two lists: a list of "Grantors" in column A and the same list with duplicates removed in column B. I am trying to count how many times a given Grantor appears in column A using COUNTIF; however, my list in column A is over 700k rows. I am using 64-bit Excel, but every time I run code to do this, Excel freezes and crashes.
Is there a way to do this in Excel, or do I need to take another approach, like using a pivot table or creating tables in Access?
I have written a few subroutines, but this is the latest, taken from another post on this forum.
Sub Countif()
Dim lastrow As Long
Dim rRange As Range
Dim B As Long '< dummy variable to represent column B
B = 2
With Application
.ScreenUpdating = False 'speed up processing by turning off screen updating
.DisplayAlerts = False
End With
'set up a range to have formulas applied
With Sheets(2)
lastrow = Cells(Rows.Count, "A").End(xlUp).Row
Set rRange = .Range(.Cells(2, B), .Cells(lastrow, B))
End With
'apply the formula to the range
rRange.Formula = "=COUNTIF($A$2:$A$777363,C2)"
'write back just the value to the range
rRange.Value = rRange.Value
With Application
.ScreenUpdating = True
.DisplayAlerts = True
End With
End Sub
Something like this:
Sub Countif()
Dim allVals, uniqueVals, i As Long, dict, v, dOut(), r As Long
''creating dummy data
' With Sheet2.Range("A2:A700000")
' .Formula = "=""VAL_"" & round(RAND()*340000,0)"
' .Value = .Value
' End With
'
'get the raw data and unique values
With Sheet2
allVals = .Range("A2:A" & .Cells(.Rows.Count, "A").End(xlUp).Row).Value
uniqueVals = .Range("B2:B" & .Cells(.Rows.Count, "B").End(xlUp).Row).Value
End With
ReDim dOut(1 To UBound(uniqueVals, 1), 1 To 1) 'for counts...
Set dict = CreateObject("scripting.dictionary")
'map unique value to index
For i = 1 To UBound(uniqueVals, 1)
v = uniqueVals(i, 1)
If Len(v) > 0 Then dict(v) = i
Next i
'loop over the main list and count each unique value in colB
For i = 1 To UBound(allVals, 1)
v = allVals(i, 1)
If Len(v) > 0 Then
If dict.exists(v) Then
r = dict(v)
dOut(r, 1) = dOut(r, 1) + 1
End If
End If
Next i
'output the counts
Sheet2.Range("C2").Resize(UBound(dOut, 1), 1).Value = dOut
End Sub
Runs in ~30sec with 700k values in A and 300k uniques in B
... or maybe this.
Caution: this overwrites the de-duplicated values in column A of the target worksheet.
Option Explicit
Sub countUnique()
Dim arr As Variant, i As Long, dict As Object
Debug.Print Timer
Set dict = CreateObject("scripting.dictionary")
dict.comparemode = vbTextCompare
With Worksheets("sheet2")
arr = .Range(.Cells(2, "A"), .Cells(.Rows.Count, "A").End(xlUp)).Value2
End With
For i = LBound(arr, 1) To UBound(arr, 1)
dict.Item(arr(i, 1)) = dict.Item(arr(i, 1)) + 1
Next i
With Worksheets("sheet3")
.Cells(2, "A").Resize(dict.Count, 1) = bigTranspose(dict.keys)
.Cells(2, "B").Resize(dict.Count, 1) = bigTranspose(dict.items)
End With
Debug.Print Timer
End Sub
Function bigTranspose(arr1 As Variant)
Dim t As Long
ReDim arr2(LBound(arr1) To UBound(arr1), 1 To 1)
For t = LBound(arr1) To UBound(arr1)
arr2(t, 1) = arr1(t)
Next t
bigTranspose = arr2
End Function
42.64 seconds for 700K originals and 327K uniques on a Surface Pro tablet. This might be improved by turning off calculation and EnableEvents. ScreenUpdating really shouldn't be an issue.

How to adjust code for better performance

I am trying to build edge relationships from an Excel file whose data is organized in rows:
A,B,C,
D,E
The aim is to create relationships from each row:
A,B
A,C
B,C
I have the following code. It works when all rows are the same length, but for rows like the ones above it also creates the following edges (relationships):
D," "
E, " "
This creates a big problem for large data sets. Can somebody help me adjust the code so that it creates the edge list only up to the last filled cell in each row? If there is a more efficient way to do this, I would appreciate that too.
Thank you so much, this will be a great help.
My code:
Sub Transform()
Dim targetRowNumber As Long
targetRowNumber = Selection.Rows(Selection.Rows.Count).Row + 2
Dim col1 As Variant
Dim cell As Range
Dim colCounter As Long
Dim colCounter2 As Long
Dim sourceRow As Range: For Each sourceRow In Selection.Rows
For colCounter = 1 To Selection.Columns.Count - 1
col1 = sourceRow.Cells(colCounter).Value
For colCounter2 = colCounter + 1 To Selection.Columns.Count
Set cell = sourceRow.Cells(, colCounter2)
If Not cell.Column = Selection.Column Then
Selection.Worksheet.Cells(targetRowNumber, 1) = col1
Selection.Worksheet.Cells(targetRowNumber, 2) = cell.Value
targetRowNumber = targetRowNumber + 1
End If
Next colCounter2
Next colCounter
Next sourceRow
End Sub
I've played around with it - this should do the trick. We can probably speed it up by outputting to another variant array if needed, but this ran pretty quickly for me:
Sub Transform_New()
Dim rngSource As Range, rngDest As Range
Dim varArray As Variant
Dim i As Long, j As Long, k As Long 'Long avoids overflow beyond 32,767 rows
Set rngSource = Sheet1.Range("A1", Sheet1.Cells(WorksheetFunction.CountA(Sheet1.Columns(1)), 1)) 'Put all used rows into range
Set rngDest = Sheet1.Cells(WorksheetFunction.CountA(Sheet1.Columns(1)), 1).Offset(2, 0) 'Set target range to start 2 below source range
varArray = Range(rngSource, rngSource.Offset(0, Range("A1").SpecialCells(xlCellTypeLastCell).Column)).Value
For i = LBound(varArray, 1) To UBound(varArray, 1) 'Loop vertically through array
For j = LBound(varArray, 2) To UBound(varArray, 2) 'Loop horizontally through each line apart from last cell
k = j
Do Until varArray(i, k) = ""
k = k + 1
If varArray(i, k) <> "" Then
rngDest.Value = varArray(i, j)
rngDest.Offset(0, 1).Value = varArray(i, k)
Set rngDest = rngDest.Offset(1, 0)
End If
Loop
Next
Next
End Sub
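The speed-up mentioned above, writing to another variant array instead of cell by cell, might look roughly like the sketch below. EdgePairs is a hypothetical helper name. Unlike the loop above, which stops at the first blank in a row, this version simply skips blank cells wherever they occur; that is an assumption about the data, not something the question states.

```vba
Option Explicit

' Hypothetical helper: src is a 2D array of node labels, one source row per
' array row. Returns an n x 2 array containing every pair of non-blank labels
' taken from the same row, or Empty if there are no pairs.
Function EdgePairs(src As Variant) As Variant
    Dim i As Long, j As Long, k As Long, n As Long, filled As Long
    Dim result() As Variant

    ' Pass 1: count pairs. A row with m labels yields m * (m - 1) / 2 pairs.
    For i = LBound(src, 1) To UBound(src, 1)
        filled = 0
        For j = LBound(src, 2) To UBound(src, 2)
            If Len(src(i, j)) > 0 Then filled = filled + 1
        Next j
        n = n + filled * (filled - 1) \ 2
    Next i
    If n = 0 Then Exit Function

    ' Pass 2: fill the output array, reusing n as the output row counter.
    ReDim result(1 To n, 1 To 2)
    n = 0
    For i = LBound(src, 1) To UBound(src, 1)
        For j = LBound(src, 2) To UBound(src, 2) - 1
            If Len(src(i, j)) > 0 Then
                For k = j + 1 To UBound(src, 2)
                    If Len(src(i, k)) > 0 Then
                        n = n + 1
                        result(n, 1) = src(i, j)
                        result(n, 2) = src(i, k)
                    End If
                Next k
            End If
        Next j
    Next i
    EdgePairs = result
End Function
```

The result can then be written in a single assignment, for example rngDest.Resize(UBound(edges, 1), 2).Value = edges, after checking that the return value is not Empty.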
