Excel: Duplicate Merging - excel

So I've got an incredibly large contact list, originally an export from Outlook so it follows the same formatting. I've an incredibly large number of duplicates down to records holding the same name, but separate addresses / mobile numbers.
I'm looking for a macro that can help me merge these duplicates so I don't lose things like different addresses under the same name.
http://i.stack.imgur.com/EaI6e.png
In this case I'd love a macro to see that A3 is a duplicate of A2, so take J3 to O3 and paste these in Q2 to V2. Then repeat this process for any duplicate pair found.

This is a bit rudimentary, but it works. Note, since you didn't say what should happen where there are more than 2 duplicates, this macro won't really work if that's the case.
Sub moveDuplicates()
Dim i As Integer, lastRow As Integer
Dim primaryRange As Range, copyToRange As Range, cel As Range, cel2 As Range, rng As Range
Dim ws As Worksheet
Set ws = ActiveSheet
'First, sort by "Key" to get duplicates all in a row
With ws.Sort
.SortFields.Clear
.SortFields.Add Key:=Range( _
"A2:A" & ws.UsedRange.Rows.Count), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:= _
xlSortNormal
.SetRange Range("A1:P" & ws.UsedRange.Rows.Count)
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
lastRow = ws.UsedRange.Rows.Count
Set rng = ws.Range(ws.Cells(2, 1), ws.Cells(lastRow, 1))
Dim lastDupRow As Integer, startDupRow As Integer, copyRow As Integer
For Each cel In rng
'First, see if the cell has a duplicate anywhere, if not, then goto next cel
cel.Select
If WorksheetFunction.CountIf(rng, cel.Value) > 1 Then
Debug.Print "Duplicate exists"
'So, since we know a duplicate exists, we need to copy the duplicate rows' info.
startDupRow = cel.Row + 1
lastDupRow = ws.Columns(1).Find(cel.Value, searchDirection:=xlPrevious).Row
If lastDupRow - startDupRow = 0 Then
copyRow = lastDupRow
Else
End If
For Each cel2 In ws.Range(Cells(startDupRow, 1), ws.Cells(lastDupRow, 1))
' pasteRange(cel.Row).Select
' copyRange(cel2.Row).Select
pasteRange(cel.Row).Value = copyRange(cel2.Row).Value
copyRange(cel2.Row).EntireRow.Delete
Next cel2
End If
Next cel
End Sub
Private Function copyRange(ByVal iRow As Integer) As Range
Set copyRange = Range(Cells(iRow, 10), Cells(iRow, 15))
End Function
Private Function pasteRange(ByVal xRow As Integer) As Range
Set pasteRange = Range(Cells(xRow, 17), Cells(xRow, 22))
End Function

Related

How to compare rows in excel (vba) and insert an other column?

I need something like this.
This picture contains all what I need, I need to insert Automatically the Value column.
I'll start by sorting rows, to help me to affect the same value to the same rows.
And I need to compare row by row (row to the next) if are the same i'll affect them the same value(column value).
Sub SortMultipleColumns()
// I'll start by sorting rows, to help me to affect the same value to the same rows
With ActiveSheet.Sort
.SortFields.Clear
.SortFields.Add Key:=Range("A1"), Order:=xlAscending
.SortFields.Add Key:=Range("B1"), Order:=xlAscending
.SortFields.Add Key:=Range("C1"), Order:=xlAscending
.SortFields.Add Key:=Range("D1"), Order:=xlAscending
.SetRange Range("A1", Range("D1").End(xlDown))
.Header = xlYes
.Apply
End With
Dim bothrows As Range, i As Integer
Set bothrows = Selection
With bothrows
// here i need to compre rows and insert in the last column value start by 1++
For i = 1 To .Rows.Count
If Not StrComp(.Cells(1, i), .Cells(2, i), vbBinaryCompare) = 0 Then
// here I need to do something
End If
Next i
End With
End Sub
Try this code:
Sub SortMultipleColumns()
'Declarations.
Dim RngList As Range
Dim RngRow As Range
Dim RngCell As Range
Dim DblCounter As Double
Dim StrString01 As String
Dim StrString02 As String
'Setting RngList.
Set RngList = Range("A1")
Set RngList = Range(RngList, RngList.End(xlDown).End(xlToRight))
Set RngList = RngList.Resize(RngList.Rows.Count - 1).Offset(1, 0)
'Sorting RngList.
With ActiveSheet.Sort
.SortFields.Clear
For Each RngCell In RngList.Rows(1).Cells
.SortFields.Add Key:=RngCell, Order:=xlAscending
Next
.SetRange RngList
.Header = xlYes
.Apply
End With
'Covering each row of RngList.
For Each RngRow In RngList.Rows
'Setting variables.
StrString01 = ""
StrString02 = ""
'Covering each cell in the given row.
For Each RngCell In RngRow.Cells
'Setting variables to the contents of the given row and to its previous one.
StrString01 = StrString01 & RngCell.Value
StrString02 = StrString02 & RngCell.Offset(-1, 0).Value
Next
'Checking if the two rows differs.
If Not StrComp(StrString01, StrString02, vbBinaryCompare) = 0 Then
DblCounter = DblCounter + 1
End If
'Reporting DblCounter.
RngRow.Offset(0, RngRow.Columns.Count).Resize(1, 1) = DblCounter
Next
End Sub
Since it adapts to the width of the given list, it's meant to be used once on the list. You can just edit the setting of RngList to a list with a fixed number of column if you want to use it multiple times. Like this:
'Setting RngList.
Set RngList = Range("A1")
Set RngList = Range(RngList, RngList.End(xlDown).Offset(0,3)
Set RngList = RngList.Resize(RngList.Rows.Count - 1).Offset(1, 0)
You can also use a formula to obtain the same result. Something like this in cell E2 should do the trick:
=IF(A2&B2&C2&D2=A1&B1&C1&D1,MAX(E$1:E1),MAX(E$1:E1)+1)

Check if record exists and append records to bottom

I have a piece of modified code which I've been using but is very inefficient. The intention was to check if records in 'Database1' sheet exists in 'Log1' if so do nothing if not add the record to first available row. There are multiple iterations of a record in Log1. There should always only be one instance of the record in Database1.
Each time the code runs it replaces all records in Database1.
It seems to be cheking row1 database1 versus row1 Log1 and not the whole range so it copies in multiple entries for one record even though it already exists.
Can anyone help? Apologies if I don't articulate this clearly please ask and I will add more detail if needed.
Option Explicit
Sub Checkrecordthenaddifnotexists()
Dim Ws As Worksheet
Dim i As Long, j As Long
Dim k As Long
Dim sht As Worksheet
Dim LastRow As Long
Dim LastColumn As Long
Dim StartCell  As Range
Dim objTable   As ListObject
Application.Calculation = xlCalculationAutomatic
    Set sht = Worksheets("Database1")
Sheets("Database1").Select
Cells.Select
sht.Sort.SortFields.Clear
sht.Sort.SortFields.Add Key:=Range("A:A"), _
SortOn:=xlSortOnValues, Order:=xlAscending, _
DataOption:=xlSortTextAsNumbers
With ActiveWorkbook.Worksheets("Database1").Sort
.SetRange Range("A:AB")
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
Sheets("Log1").Select
Cells.Select
ActiveWorkbook.Worksheets("Log1").Sort.SortFields.Clear
ActiveWorkbook.Worksheets("Log1").Sort.SortFields.Add Key:=Range("B:B"), _
SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
With ActiveWorkbook.Worksheets("Log1").Sort
.SetRange Range("A:AJ")
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
sht.Activate
Set StartCell = Range("A2")
LastRow = sht.Cells(sht.Rows.Count, StartCell.Column).End(xlUp).Row
LastColumn = sht.Cells(StartCell.Row, sht.Columns.Count).End(xlToLeft).Column
On Error Resume Next
'Sheet2.ShowAllData
Sheet2.Select
Selection.AutoFilter
On Error GoTo 0
sht.Range(StartCell, sht.Cells(LastRow, LastColumn)).Select
Set objTable = ActiveSheet.ListObjects.Add(xlSrcRange, Selection, , xlYes)
With ActiveSheet
.ListObjects(1).Name = "Database_v0.1"
End With
Set Ws = Sheets("Database1")
Dim RowsMaster As Integer, Rows2 As Integer
RowsMaster = Ws.Cells(1048576, 1).End(xlUp).Row
Rows2 = Worksheets("Log1").Cells(1048576, 2).End(xlUp).Row
With Worksheets("Log1")
For i = 2 To Rows2
For j = 2 To RowsMaster + 1
If .Cells(i, 1) = Ws.Cells(j, 1) Then
Exit For
End If
Next j
If j = RowsMaster + 1 Then
RowsMaster = RowsMaster + 1
For k = 2 To 8
Ws.Cells(RowsMaster, k - 1) = .Cells(i, k)
Next
End If
Next i
End With
Sheets("Database1").Activate
ActiveSheet.ListObjects("Database_v0.1").Unlist
Range("A1").Select
Range(Selection, Selection.End(xlToRight)).Select
Range(Selection, Selection.End(xlDown)).Select
ActiveSheet.Range("A1:NT1048576").RemoveDuplicates Columns:=1, Header:=xlYes
Sheets("Database Repository").Columns("A").Select
Selection.NumberFormat = "0"
Sheet2.Select
Selection.AutoFilter
Application.Calculation = xlCalculationAutomatic
End Sub
Well this should help you, the whole explanation is in the code:
Option Explicit
Sub Checkrecordthenaddifnotexists()
Application.Calculation = xlCalculationAutomatic
'Try to declare your variables where you are using them
'You sort 2 times, different sheets but mostly same way so,
'write another procedure with variable and give them as you need
'the procedure below needs:
'sheet to be sorted, which range will be the one to sort, the starting cell
SortMySheet ThisWorkbook.Sheets("Database1"), "A:A", ThisWorkbook.Sheets("Database1").Range("A2")
SortMySheet ThisWorkbook.Sheets("Log1"), "A:A", ThisWorkbook.Sheets("Log1").Range("A2") 'change the starting cell
'Now we will change your approach to use 2 arrays and 1 dictionary
'For that you need to go to tools-References- and then check the Microsoft Scripting Runtime reference
'This is assuming you want to add the new entries from sheet Log1 to DataBase1 when they not exist in the later.
'The arrays:
With ThisWorkbook.Sheets("DataBase1")
Dim arrMaster As Variant: arrMaster = LoadArray(ThisWorkbook.Sheets("Database1"), .Range("A2")) 'change the starting cell
End With
With ThisWorkbook.Sheets("Log1")
Dim arrLog As Variant: arrLog = LoadArray(ThisWorkbook.Sheets("Log1"), .Range("A2")) 'change the starting cell
End With
'The dictionary:
Dim IdDictionary As Dictionary: Set IdDictionary = LoadDictionary(arrMaster)
'Now the hardwork, getting the new items to the sheet Log1
AddNewEntries arrMaster, arrLog, IdDictionary
' the next 6 lines of code are useless, we didn't need to make a table, we are not going to have duplicates
' Sheets("Database1").Activate
' ActiveSheet.ListObjects("Database_v0.1").Unlist
' Range("A1").Select
' Range(Selection, Selection.End(xlToRight)).Select
' Range(Selection, Selection.End(xlDown)).Select
' ActiveSheet.Range("A1:NT1048576").RemoveDuplicates Columns:=1, Header:=xlYes
' the next 4 lines of code I don't get
' Sheets("Database Repository").Columns("A").Select
' Selection.NumberFormat = "0"
' Sheet2.Select
' Selection.AutoFilter
Application.Calculation = xlCalculationAutomatic
End Sub
Private Sub SortMySheet(ws As Worksheet, KeyRange As String, StartCell As Range)
With ws
'Get the last row and column for your whole range
Dim LastRow As Long: LastRow = .Cells(.Rows.Count, StartCell.Column).End(xlUp).Row
Dim LastColumn As Long: LastColumn = .Cells(StartCell.Row, .Columns.Count).End(xlToLeft).Column
'Sort your whole range
.Sort.SortFields.Clear
.Sort.SortFields.Add .Range(KeyRange), xlSortOnValues, xlAscending, xlSortTextAsNumbers
With .Sort
.SetRange ws.Range(StartCell, ws.Cells(LastRow, LastColumn))
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
End With
End Sub
Private Function LoadArray(ws As Worksheet, StartCell As Range) As Variant
With ws
Dim LastRow As Long: LastRow = .Cells(.Rows.Count, StartCell.Column).End(xlUp).Row
Dim LastColumn As Long: LastColumn = .Cells(StartCell.Row, .Columns.Count).End(xlToLeft).Column
LoadArray = .Range(StartCell, .Cells(LastRow, LastColumn))
End With
End Function
Private Function LoadDictionary(arr As Variant) As Dictionary
Set LoadDictionary = New Dictionary
'By default dictionaries are Case sensitive, if you need to check without that then:
'LoadDictionary.CompareMode = TextCompare
'Uncheck the comment from the line above, by default I'll go with case Sensitive
Dim i As Long
For i = 1 To UBound(arr)
If Not LoadDictionary.Exists(arr(i, 1)) Then LoadDictionary.Add arr(i, 1), i
Next i
End Function
Private Sub AddNewEntries(arrMaster As Variant, arrLog As Variant, IdDictionary As Dictionary)
With ThisWorkbook.Sheets("DataBase1")
Dim i As Long, j As Long
Dim LastRow As Long
'Loop through all entries in arrLog
For i = 2 To UBound(arrLog)
'If the entry doesn't exist in the DataBase sheet then
If Not IdDictionary.Exists(arrLog(i, 1)) Then
'Calculate the first free row of data in column A for DataBase1
LastRow = .Cells(.Rows.Count, 1).End(xlUp).Row + 1
'Loop through first to last column in arrLog and paste it to DataBase1
For j = 1 To UBound(arrLog, 2)
.Cells(LastRow, j) = arrLog(i, j)
Next j
End If
Next i
End With
End Sub

Sorting within specified range and loop until done - VBA Excel Marcro

It's me again. I've been trying different alternatives to sort the rows based on Col D per collection.
This is the closest one, yet there are 2 bugs found.
1- Loop and not able to exit when it reaches the last used rows.
It keeps sorting until I press to force quitting
2- It's unable to sort where there is only one SKUs per collection
It sort the next collection as well. Sometimes 3 collections sorted.
e.g. Before Run - Row 9, 29, 32, 35, 45....
Here is my code. What's wrong with my code?
Sub SortingCollectionOnColD
With ActiveSheet.Range("A:A")
Set FindSubtotal = .Find("Subtotal", After:=.Range("A1"), LookIn:=xlValues)
If Not FindSubtotal Is Nothing Then
firstOne = FindSubtotal.Address
Do
With FindSubtotal
Range("A" & FindSubtotal.Row - 1).Select
Set SortRange = Range(Selection, Selection.End(xlUp)).EntireRow
ActiveSheet.Sort.SortFields.Clear
ActiveSheet.Sort.SortFields.Add Key:=Range("C" & FindSubtotal.Row) _
, SortOn:=xlSortOnValues, Order:=xlAscending
With ActiveSheet.Sort
.SetRange SortRange
.Header = xlNo
.Orientation = xlTopToBottom
.Apply
End With
End With
Set FindSubtotal = .FindNext(FindSubtotal)
Loop While Not FindSubtotal Is Nothing And FindSubtotal.Address <> firstOne
End If
End With End Sub
Before the run
Expected Result
After the run. highlighted the major failures
Please test the next code, please. I could not reproduce your sheet and I changed the reference to sheet instead of A:A range, where it looks logic to me, but not knowing very well what you wanted to do, it would not be impossible to not return what exactly you need. Please let me know if/how it fits your need.
Sub LoopSubtotal()
Dim FindSubtotal As Range, sh As Worksheet, firstOne As String
Dim SortRange As Range
Set sh = ActiveSheet
With sh.Range("A:A")
Set FindSubtotal = .Find("Subtotal", After:=.Range("A1"), LookIn:=xlValues)
If Not FindSubtotal Is Nothing Then
firstOne = FindSubtotal.Address
Do
sh.Range("A" & FindSubtotal.row - 1).Select
Set SortRange = Range(Selection, Selection.End(xlUp)).EntireRow
sh.Sort.SortFields.Clear
sh.Sort.SortFields.Add Key:=sh.Range("C" & FindSubtotal.row) _
, SortOn:=xlSortOnValues, Order:=xlAscending
With sh.Sort
.SetRange SortRange
.Header = xlNo
.Orientation = xlTopToBottom
.Apply
End With
Set FindSubtotal = .FindNext(FindSubtotal): Debug.Print FindSubtotal.Address
Loop While Not FindSubtotal Is Nothing And FindSubtotal.Address <> firstOne
End If
End With
End Sub
Try, please, running it line by line, pressing F8 and see what it does.

Sort on font color using vba

I am trying to implement a button that, when pressed sorts an array first alphabetically, then based on font color. The column that I am using to sort has 3 possible values (enrolled, waitlisted, and cancelled). The font color for 'cancelled' is grey. I want to get enrolled at the top of the list, then waitlisted, then cancelled at the bottom. Shouldn't be that difficult, but I can't get the code to work. Here's the code I wrote. Many thanks!
Private Sub btnSort_Click()
Dim SortArray As Range
Dim SortColumn As Range
Set SortArray = Range("A3").CurrentRegion
Set SortColumn = Range(Range("A3").End(xlToRight), Range("A3").End(xlToRight).End(xlDown))
SortArray.Sort Key1:=SortColumn, Header:=xlYes
With SortArray.Sort
.SortFields.Clear
.SortFields.Add Key:=SortColumn
.xlSortOnFontColor
.SortOnValue.Color = RGB(192, 192, 192)
.SortOrder = xlAscending
.Header = xlYes
.Apply
End With
Since there are only 3 values, we use a helper column and then assign values to it. We then sort and then finally delete the helper column.
Let's say, your data looks like this
Try this code. I have commneted the code so you should not have a problem in understanding it.
Sub Sample()
Dim ws As Worksheet
Dim lRow As Long, lCol As Long
Dim rng As Range
Dim ColName As String
'~~> Change this to the relevant sheet
Set ws = Sheet1
With ws
'~~> Insert a helper column in Col A
.Columns(1).Insert Shift:=xlToRight
.Cells(1, 1).Value = "TmpHeader"
'~~> Get Last Row and last Column
'~~> I am assuming that headers are in row 1
lRow = .Range("B" & .Rows.Count).End(xlUp).Row
lCol = .Cells(1, Columns.Count).End(xlToLeft).Column
ColName = Split(Cells(, lCol).Address, "$")(1)
'~~> Insert the formula in Col A
.Range("A2:A" & lRow).Formula = "=IF(RC[1]=""enrolled"",1,IF(RC[1]=""waitlisted"",2,3))"
'~~> Set your range
Set rng = .Range("A1:" & ColName & lRow)
'~~> Sort it
rng.Sort Key1:=.Range("A2"), Order1:=xlAscending, Header:=xlYes, _
OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
DataOption1:=xlSortNormal
'~~> Delete the helper column
.Columns(1).Delete
End With
End Sub
When you run the above code, it inserts a helper column and then inserts a formula =IF(B2="enrolled",1,IF(B2="waitlisted",2,3)) What this basically does is assigns a value of 1,2 and 3 based on the value whether it is enrolled, waitlisted or cancelled.
Once the formula is inserted, we sort on Col A in ascending order and then finally delete the helper column.
Figured it out:
ActiveSheet.Range("A3").CurrentRegion.Sort Key1:=Range("I3"), Header:=xlYes
ActiveSheet.Sort.SortFields.Clear
ActiveSheet.Sort.SortFields.Add(Range("I3"), _
xlSortOnFontColor, xlDescending, , _
xlSortNormal).SortOnValue.Color = RGB(192, 192, 192)
With ActiveSheet.Sort
.SetRange Range("A3").CurrentRegion
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With

My sort procedure runs on the first sheet but not on the other

A sorting procedure must be applied to several worksheets with different contents. So I opted for the solution "for each sh in worksheets...".
This sorting procedure runs perfectly for the first selected sheet.
On the second sheet, a message "Run time error 1004 The sort reference is not valid" is displayed on the Apply instruction despite the fact that the different variables display correct values.
Referring to "'1004': "The sort reference is not valid."", I changed "With sh.Sort" to "With sh.Range(startcell, lastcell).Sort" which generates the error "Unable to get the sort property of the range class".
Could a member of the forum help me solve this problem?
In advance, thank you
This sort procedure runs on the first sheet but not on the others.
Sub sortData()
Dim startcell As Range, lastcell As Range
Dim sh As Worksheet
Dim x_Birth As Long, lastcell_Birth As Long
For Each sh In Worksheets
With sh
If Left(sh.Name, 2) = "B_" Then
.Columns(5).Insert
.Cells(1, 5) = "Y_Birth"
lastcell_Birth = sh.Cells(Rows.count, "A").End(xlUp).Row
For x_Birth = 2 To lastcell_Birth
.Cells(x_Birth, 5) = Right(.Cells(x_Birth, 4), 4)
Next
Set startcell = Range(.Cells(1, 1), .Cells(1, 1))
Set lastcell = Range(.Cells(lastcell_Birth, 7), .Cells(lastcell_Birth, 7))
With sh.Sort
.SortFields.Add Key:=sh.Range("F1"), Order:=xlAscending
.SortFields.Add Key:=sh.Range("E1"), Order:=xlAscending
.SetRange Range(startcell, lastcell)
.Header = xlYes
.Apply
End With
sh.Columns("E:E").Select
Selection.Columns.EntireColumn.Delete
End If
End With
Next
End Sub
You could try:
Option Explicit
Sub sortData()
Dim startcell As Range, lastcell As Range
Dim sh As Worksheet
Dim x_Birth As Long, lastcell_Birth As Long
For Each sh In Worksheets
With sh
If Left(.Name, 2) = "B_" Then
.Columns(5).Insert
.Cells(1, 5) = "Y_Birth"
lastcell_Birth = .Cells(Rows.Count, "A").End(xlUp).Row
For x_Birth = 2 To lastcell_Birth
.Cells(x_Birth, 5).Value = Right(.Cells(x_Birth, 4).Value, 4)
Next
Set startcell = .Range(.Cells(1, 1), .Cells(1, 1))
Set lastcell = .Range(.Cells(lastcell_Birth, 7), .Cells(lastcell_Birth, 7))
.Sort.SortFields.Clear
With .Sort
.SortFields.Add Key:=sh.Range("E1"), Order:=xlAscending
.SortFields.Add Key:=sh.Range("F1"), Order:=xlAscending
.SetRange sh.Range(startcell, lastcell)
.Header = xlYes
.Apply
End With
.Columns("E:E").Columns.EntireColumn.Delete
End If
End With
Next
End Sub

Resources