Find Duplicate Entry - excel

I am using Excel 2010.
I have some VBA code which creates a unique key and then looks for duplicate unique key entries. Any duplicates are coloured in red.
I need to automate this a little further. If there is a duplicate unique key, copy the information from the newest entry, and paste it into the line where the original entry is. I then want the newest entry deleted.
The unique key is a concat of the customer name and the date the file was created. There will only ever be at most one duplicate entry per customer and that will be because the date the file was last updated has changed. I need the duplicate concat entry with the newest date to copy the info over the top of the entry with the oldest date on it then delete the original newest date entry. This is because we have other checks that have been completed further along the sheet that we need to keep intact.
Ideally I would like for the message box to still advise how many duplicate entries were found and for the entry to remain coloured red once the copy/paste/delete has taken place to highlight the entry that has been changed.
Private Sub CommandButton1_Click()
'Start of Concatenate Code
Dim i As Integer
Dim r As Range
On Error Resume Next
' Tells Excel to look in column 3 (Column C) for the last one with data in it
lRow = Cells(Rows.Count, 3).End(xlUp).Row
' Tell Excel to focus on cells 4 to 5000
For i = 4 To lRow
' Tell Excel to paste the contents of cell 4 (column D) followed by |
' then the contents of cell 8 (column H) into cell 2 (column B)
Cells(i, 2).Value = Cells(i, 11) & " | " & Cells(i, 7)
Next i
'End of Concatenate Code
'Start of Check for Duplicates code
Dim j As Integer
Dim myCell As Range
Dim myRange As Integer
myRange = Range("A4:A5000").Count
j = 0
' Select the Range
For Each myCell In Range("B4:B5000")
' Check that the cells in the range are not blank
If WorksheetFunction.CountIf(Range("B4:B5000"), myCell.Value) > 1 Then
' Colour the duplicate entries in red
myCell.EntireRow.Interior.ColorIndex = 3
j = j + 1
End If
Next
MsgBox "There are " & j & " duplicates found." & vbCrLf & vbCrLf & _
"Any duplicates have been highlighted in red.", vbInformation + vbOKOnly, _
"Duplicate Entry Checker"
' End of Check for Duplicates code
End Sub
Screenshot of spreadsheet

Thank you @rickmanalexander. I just tried your code (changing only the sheet name), but I get a "Subscript out of range" error, with the number 9 in the message box title. There must be something I have missed, but I am not sure what.
Here is the code I used:
Private Sub CommandButton1_Click()
On Error GoTo CleanFail
Dim wrkSht As Worksheet
Set wrkSht = Sheets("Raw Data")
Dim lRow As Long
lRow = wrkSht.Cells(wrkSht.Rows.Count, 3).End(xlUp).Row
Dim arrySheet As Variant
'get the worksheet data into an array
arrySheet = wrkSht.Range("D1:H" & lRow).Value2
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
Dim keyValue As Variant
Dim i As Long
Dim rowNum As Long
Dim dupCount As Long
For i = LBound(arrySheet, 1) To UBound(arrySheet, 1)
'a concatenated key consisting of the:
'row number
'customer's name
keyValue = Join(Array(i, arrySheet(i, 1)), "|")
If Not dict.Exists(keyValue) Then
dict(keyValue) = arrySheet(i, 8) 'save the date for this unique key
Else
'if we make it here, then this is a duplicate customer
'for which we want to check the date
'If the current row's date is greater than the previously saved date, then
'delete the current row
'determine the row number for the previously saved entry
'place the most recent date in place of the old date
'color it red
'increase the duplicate counter
If arrySheet(i, 8) > dict(keyValue) Then
wrkSht.Rows(i).EntireRow.Delete
rowNum = CLng(Split(keyValue, "|")(0))
wrkSht.Cells(rowNum, "B").Value = CDate(arrySheet(i, 8))
wrkSht.Rows(rowNum).EntireRow.Interior.ColorIndex = 3
dupCount = dupCount + 1
End If
End If
'clear variables
keyValue = vbNullString: rowNum = 0
Next i
MsgBox "There were " & dupCount & " duplicates found." & _
vbCrLf & vbCrLf & _
"Any duplicates have been highlighted in red.", _
vbInformation + vbOKOnly, "Duplicate Entry Checker"
CleanExit:
Exit Sub
CleanFail:
MsgBox Err.Description, vbCritical, Err.Number
Resume CleanExit
End Sub

Edit:
OP was getting error 9 (Subscript out of range) because I used arrySheet(i, 8) instead of arrySheet(i, 4). The array is read from D1:H, so it has only five columns; I had been thinking it started at column A. A simple mistake with an easy fix.
The Dictionary Object is the perfect candidate for duplicate checks, so that is what I went with. The code below is untested, but should work for your needs.
Option Explicit
Private Sub CommandButton1_Click()
On Error GoTo CleanFail
Dim wrkSht As Worksheet
Set wrkSht = Sheets("Raw Data")
Dim lRow As Long
lRow = wrkSht.Cells(wrkSht.Rows.Count, 3).End(xlUp).Row
Dim arrySheet As Variant
'get the worksheet data into an array
arrySheet = wrkSht.Range("D1:H" & lRow).Value2
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
Dim keyValue As Variant
Dim i As Long
Dim rowNum As Long
Dim dupCount As Long
For i = LBound(arrySheet, 1) To UBound(arrySheet, 1)
'a concatenated key consisting of the:
'row number
'customer's name
keyValue = Join(Array(i, arrySheet(i, 1)), "|")
If Not dict.Exists(keyValue) Then
dict(keyValue) = arrySheet(i, 4) 'save the date for this unique key
Else
'if we make it here, then this is a duplicate customer
'for which we want to check the date
'If the current row's date is greater than the previously saved date, then
'delete the current row
'determine the row number for the previously saved entry
'place the most recent date in place of the old date
'color it red
'increase the duplicate counter
If arrySheet(i,4) > dict(keyValue) Then
wrkSht.Rows(i).EntireRow.Delete
rowNum = CLng(Split(keyValue, "|")(0))
wrkSht.Cells(rowNum, "B").Value = CDate(arrySheet(i, 4))
wrkSht.Rows(rowNum).EntireRow.Interior.ColorIndex = 3
dupCount = dupCount + 1
End If
End If
'clear variables
keyValue = vbNullString: rowNum = 0
Next i
MsgBox "There were " & dupCount & " duplicates found." & _
vbCrLf & vbCrLf & _
"Any duplicates have been highlighted in red.", _
vbInformation + vbOKOnly, "Duplicate Entry Checker"
CleanExit:
Exit Sub
CleanFail:
MsgBox Err.Description, vbCritical, Err.Number
Resume CleanExit
End Sub
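One thing to watch in the code above: the dictionary key includes the row number, so it differs on every row and the Else branch would not be reached. The sketch below is one possible variation, not the answerer's method. It keys the dictionary on the existing column B key alone, copies the newer row's values over the older row, and deletes the newer rows only after the loop so that row numbers do not shift mid-iteration. The column positions for the "last updated" date and for the block to copy are assumptions (marked hypothetical), as is the procedure name; adjust them to the real layout.

```vba
Option Explicit

' Assumed layout; adjust to the real sheet.
Private Const SHEET_NAME As String = "Raw Data"
Private Const FIRST_ROW As Long = 4       ' first data row, as in the question
Private Const KEY_COL As Long = 2         ' column B holds the concatenated key
Private Const UPDATED_COL As Long = 8     ' hypothetical "last updated" date column
Private Const COPY_FIRST_COL As Long = 3  ' hypothetical first column to overwrite
Private Const COPY_LAST_COL As Long = 11  ' hypothetical last column to overwrite

Public Sub MergeDuplicateKeys()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Worksheets(SHEET_NAME)

    Dim lastRow As Long
    lastRow = ws.Cells(ws.Rows.Count, KEY_COL).End(xlUp).Row

    Dim firstSeen As Object               ' key -> row of its first occurrence
    Set firstSeen = CreateObject("Scripting.Dictionary")

    Dim rowsToDelete As Range
    Dim key As String
    Dim i As Long, otherRow As Long
    Dim newerRow As Long, olderRow As Long
    Dim dupCount As Long

    For i = FIRST_ROW To lastRow
        key = CStr(ws.Cells(i, KEY_COL).Value)
        If Len(key) > 0 Then
            If Not firstSeen.Exists(key) Then
                firstSeen.Add key, i
            Else
                otherRow = firstSeen(key)
                ' Value2 returns dates as numbers, so they compare directly.
                If ws.Cells(i, UPDATED_COL).Value2 > _
                   ws.Cells(otherRow, UPDATED_COL).Value2 Then
                    newerRow = i: olderRow = otherRow
                Else
                    newerRow = otherRow: olderRow = i
                End If

                ' Overwrite the older row with the newer row's values.
                ws.Range(ws.Cells(olderRow, COPY_FIRST_COL), _
                         ws.Cells(olderRow, COPY_LAST_COL)).Value = _
                    ws.Range(ws.Cells(newerRow, COPY_FIRST_COL), _
                             ws.Cells(newerRow, COPY_LAST_COL)).Value
                ws.Rows(olderRow).Interior.ColorIndex = 3   ' keep it red

                ' Queue the newer row for deletion after the loop.
                If rowsToDelete Is Nothing Then
                    Set rowsToDelete = ws.Rows(newerRow)
                Else
                    Set rowsToDelete = Union(rowsToDelete, ws.Rows(newerRow))
                End If
                dupCount = dupCount + 1
            End If
        End If
    Next i

    If Not rowsToDelete Is Nothing Then rowsToDelete.Delete

    MsgBox "There were " & dupCount & " duplicates found." & vbCrLf & vbCrLf & _
           "Updated entries have been highlighted in red.", _
           vbInformation + vbOKOnly, "Duplicate Entry Checker"
End Sub
```

This relies on the question's statement that a key has at most one duplicate; with three or more rows per key the bookkeeping would need to track the surviving row instead of the first one seen.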

Related

Excel VBA: What is the best way to sum a column in a dataset with variable amounts of lines?

I need to sum two columns (B and C) in a dataset. The number of rows with data will vary between 1 and 17. I need to add the sums two rows beneath the last row of data (end result example in image 1).
My code worked beautifully for one dataset, but I am getting an error
Run-time error '6': Overflow
for a different dataset. What am I doing wrong?
'Units total
Windows("Final_Files.xlsb").Activate
Sheets("Revenue Summary").Select
lastrow = Worksheets("Revenue Summary").Cells(Rows.Count, 2).End(xlUp).Row
Dim a As Integer
a = 10000
For i = lastrow To 2 Step -1
a = a + Worksheets("Revenue Summary").Cells(i, 2).Value
Next
Worksheets("Revenue Summary").Cells(lastrow + 2, 2).Value = a
Correct End Result
You can try the sub below:
Sub SumBC()
Dim sh As Worksheet
Dim lRowB As Long, lRowC As Long
Dim bSum As Double, cSum As Double
Windows("Final_Files.xlsb").Activate
Set sh = Worksheets("Revenue Summary")
lRowB = sh.Cells(sh.Rows.Count, 2).End(xlUp).Row
lRowC = sh.Cells(sh.Rows.Count, 3).End(xlUp).Row
bSum = WorksheetFunction.Sum(sh.Range("B2:B" & lRowB))
cSum = WorksheetFunction.Sum(sh.Range("C2:C" & lRowC))
sh.Cells(lRowB + 2, 2) = bSum
sh.Cells(lRowC + 2, 3) = cSum
sh.Activate
Set sh = Nothing
End Sub
Remember: if you run the same sub more than once, clear the previous totals first; otherwise it will write a new set of totals below the old ones.
Your loop is fine apart from the starting value. You initialized the variable a with 10000; change it to 0:
a = 0
Then the loop sums only the cell values.
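The overflow itself most likely comes from the accumulator's type: a VBA Integer is 16-bit and cannot hold more than 32,767, so a column whose total exceeds that raises run-time error 6. A minimal sketch of the question's loop with the accumulator declared as Double follows; the procedure name is made up for illustration, and the sheet name is taken from the question.

```vba
Option Explicit

Sub SumUnitsColumn()
    Dim ws As Worksheet
    Set ws = Worksheets("Revenue Summary")

    Dim lastRow As Long
    lastRow = ws.Cells(ws.Rows.Count, 2).End(xlUp).Row

    Dim total As Double     ' Double (or Long) instead of Integer
    Dim i As Long

    For i = 2 To lastRow
        ' Skip text and error values; only numeric cells are added.
        If IsNumeric(ws.Cells(i, 2).Value) Then
            total = total + ws.Cells(i, 2).Value
        End If
    Next i

    ' Write the total two rows below the last data row.
    ws.Cells(lastRow + 2, 2).Value = total
End Sub
```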
Add Totals to Multiple Columns
If you're not OP: it is easy to test the code. Open a new workbook and insert a module. Copy the code into the module. Uncomment the Sheet1 line and comment out the Revenue Summary line. In worksheet Sheet1, add some numbers in columns 2 and 3 and you're ready.
Run only the insertTotals procedure. The calculateSumOfRange is called when needed.
Play with the constants in insertTotals and change the values in the columns. Add text, error values, booleans to see how the code doesn't break.
The issue with Application.Sum or WorksheetFunction.Sum is that it fails when there are error values in the range. That's what the calculateSumOfRange is preventing. If there is an error value, the loop approach is used. If not, then Application.Sum is the result.
You can use the calculateSumOfRange in Excel as a UDF. Just don't include the cell where the formula is and you're OK, e.g. =calculateSumOfRange(A1:B10).
The Code
Option Explicit
Sub insertTotals()
Const FirstRow As Long = 2 ' First Row of Data
Const LastRowCol As Long = 2 ' The column where the Last Row is calculated.
Const TotalsOffset As Long = 2 ' 2 means: 'data - one empty row - totals'
Dim Cols As Variant
Cols = Array(2, 3) ' add more
'With ThisWorkbook.Worksheets("Sheet1")
With Workbooks("Final_Files.xlsb").Worksheets("Revenue Summary")
' Define Last Row ('LastRow') in Last Row Column ('LastRowCol').
Dim LastRow As Long
LastRow = .Cells(.Rows.Count, LastRowCol).End(xlUp).Row
' Define Last Row Column Range ('rng').
Dim rng As Range
Set rng = .Range(.Cells(FirstRow, LastRowCol), _
.Cells(LastRow, LastRowCol))
Dim j As Long
' Validate Columns Array ('Cols').
If LBound(Cols) <= UBound(Cols) Then
' Iterate columns in Columns Array.
For j = LBound(Cols) To UBound(Cols)
' Use 'Offset' to define the current Column Range and write
' its calculated total below it.
.Cells(LastRow + TotalsOffset, Cols(j)).Value = _
calculateSumOfRange(rng.Offset(, Cols(j) - LastRowCol))
Next j
End If
End With
End Sub
Function calculateSumOfRange(SourceRange As Range) _
As Double
' Initialize error handling.
Const ProcName As String = "calculateSumOfRange"
On Error GoTo clearError ' Turn on error trapping.
' Validate Source Range.
If SourceRange Is Nothing Then
GoTo NoRange
End If
' Calculate Sum of Range.
Dim CurrentValue As Variant
CurrentValue = Application.Sum(SourceRange)
Dim Result As Double
If Not IsError(CurrentValue) Then
Result = CurrentValue
Else
Dim Data As Variant
If SourceRange.Rows.Count > 1 Or SourceRange.Columns.Count > 1 Then
Data = SourceRange.Value
Else
ReDim Data(1, 1)
Data(1, 1) = SourceRange.Value
End If
Dim i As Long
Dim j As Long
For i = 1 To UBound(Data, 1)
For j = 1 To UBound(Data, 2)
CurrentValue = Data(i, j)
If IsNumeric(CurrentValue) And _
Not VarType(CurrentValue) = vbBoolean Then
Result = Result + CurrentValue
End If
Next j
Next i
End If
' Write result and exit.
calculateSumOfRange = Result
GoTo ProcExit
' Labels
NoRange:
Debug.Print "'" & ProcName & "': No range (Nothing)."
GoTo ProcExit
clearError:
Debug.Print "'" & ProcName & "': " & vbLf _
& " " & "Run-time error '" & Err.Number & "':" & vbLf _
& " " & Err.Description
On Error GoTo 0 ' Turn off error trapping.
GoTo ProcExit
ProcExit:
End Function
The following code sums all the rows from "B2" and "C2" downward. Adapt it to your needs.
' Keep a reference to the worksheet
Dim ws As Worksheet
Set ws = Worksheets("Revenue Summary")
' This is how many rows there are.
Dim rowCount As Long
rowCount = ws.Cells(ws.Rows.Count, 2).End(xlUp).Row - 1
' This is the summation operation over each column
Dim b As Double, c As Double
b = WorksheetFunction.Sum(ws.Range("B2").Resize(rowCount, 1))
c = WorksheetFunction.Sum(ws.Range("C2").Resize(rowCount, 1))
' This writes the sum two cells under the last row.
ws.Range("B2").Cells(rowCount + 2, 1).Value = b
ws.Range("C2").Cells(rowCount + 2, 1).Value = c

Excel VBA: How do I add text to a blank cell in a specific column then loop to the next blank cell and add text?

I need a macro to add text to blank cells in Column A. The macro needs to skip cells that have text. The macro needs to stop looping at the end of the data set.
I am trying to use an If Else statement, but I think I'm on the wrong track. My current, non-working code is below. Thank you so much - I'm still new to VBA
Sub ElseIfi()
For i = 2 To 100
If Worksheets("RawPayrollDump").Cells(2, 1).Value = "" Then
Worksheets("RawPayrollDump").Cells(2, 1).Value = "Administration"
Else if(not(worksheets("RawPayrollDump").cells(2,1).value="")) then 'go to next cell
End If
Next
End Sub
To find the last row of data, use the End(xlUp) function.
Try this code. It replaces all empty cells in column A with Administration.
Sub ElseIfi()
Set ws = Worksheets("RawPayrollDump")
lastrow = ws.Cells(Rows.Count, 1).End(xlUp).Row ' last data row
For i = 2 To lastrow ' all rows until last data row
If ws.Cells(i, 1).Value = "" Then ' column A, check if blank
ws.Cells(i, 1).Value = "Administration" ' set text
End If
Next
End Sub
There is no need to loop. Please try this code.
Sub FillBlanks()
Dim Rng As Range
With Worksheets("RawPayrollDump")
Set Rng = .Range(.Cells(2, "A"), .Cells(.Rows.Count, "A").End(xlUp))
End With
On Error Resume Next
Set Rng = Rng.SpecialCells(xlCellTypeBlanks)
If Err Then
MsgBox "There are no blank cells" & vbCr & _
"in the specified range.", _
vbInformation, "Range " & Rng.Address(0, 0)
Else
Rng.Value = "Administration"
End If
End Sub
Replace Blanks feat. CurrentRegion
Range.CurrentRegion
Since OP asked for "... stop looping at the end of the data set. ",
I've written this CurrentRegion version.
As I understand it, the end of the data set doesn't mean that there
cannot be blank cells below the last cell containing data in column
A.
Use the 1st Sub to test the 2nd, the main Sub (replaceBlanks).
Adjust the constants including the workbook (in the 1st Sub) to fit your needs.
Criteria is declared as Variant to allow other data types not just strings.
The Code
Option Explicit
Sub testReplaceBlanks()
Const wsName As String = "RawPayrollDump"
Const FirstCellAddress As String = "A2"
Const Criteria As Variant = "Administration"
Dim wb As Workbook: Set wb = ThisWorkbook
Dim ws As Worksheet: Set ws = wb.Worksheets(wsName)
replaceBlanks ws, FirstCellAddress, Criteria
End Sub
Sub replaceBlanks(Sheet As Worksheet, _
FirstCellAddress As String, _
Criteria As Variant)
' Define column range.
Dim ColumnRange As Range
Set ColumnRange = Intersect(Sheet.Range(FirstCellAddress).CurrentRegion, _
Sheet.Columns(Sheet.Range(FirstCellAddress) _
.Column))
' To remove the possibly included cells above the first cell:
Set ColumnRange = Sheet.Range(Sheet.Range(FirstCellAddress), _
ColumnRange.Cells(ColumnRange.Cells.Count))
' Note that you can also use the addresses instead of the cell range
' objects in the previous line...
'Set ColumnRange = sheet.Range(FirstCellAddress, _
ColumnRange.Cells(ColumnRange.Cells.Count) _
.Address)
' or a mixture of them.
' Write values from column range to array.
Dim Data As Variant
If ColumnRange.Cells.Count > 1 Then
Data = ColumnRange.Value
Else
ReDim Data(1 To 1, 1 To 1): Data(1, 1) = ColumnRange.Value
End If
' Modify array.
Dim i As Long, k As Long
For i = 1 To UBound(Data)
If IsEmpty(Data(i, 1)) Then Data(i, 1) = Criteria: k = k + 1
Next i
' Write modified array to column range.
' The following line is used when only the first cell is known...
'Sheet.Range(FirstCellAddress).Resize(UBound(Data)).Value = Data
' ...but since the range is known and is the same size as the array,
' the following will do:
ColumnRange.Value = Data
' Inform user.
If k > 0 Then GoSub Success Else GoSub Fail
Exit Sub
' Subroutines
Success:
MsgBox "Wrote '" & Criteria & "' to " & k & " previously " _
& "empty cell(s) in range '" & ColumnRange.Address & "'.", _
vbInformation, "Success"
Return
Fail:
MsgBox "No empty cells in range '" & ColumnRange.Address & "'.", _
vbExclamation, "Nothing Written"
Return
End Sub

Check for a specific column name given a string then highlight values in the column that doesn't match given value in VBA

I need to look for a given column name for example in the picture look if the column name "FileNumber" exists. If it does exist, I want to look in the column to see if the numbers all are a given number (for example it has to be "101"); if incorrect I want to highlight that number (here, highlight "102")
How can I achieve this in VBA?
Sub FindColumns()
Dim rngToSearch As Range
Dim lookToFind As Variant
Dim iCtr As Long
Set rngToSearch = ThisWorkbook.Worksheets("Sheet").Range("A1:C1")
lookToFind = Array("Filename", "FileNumber", "Author") 'add all Column header that you want to check
With rngToSearch
For iCtr = LBound(lookToFind) To UBound(lookToFind)
If WorksheetFunction.CountIf(rngToSearch, lookToFind(iCtr)) > 0 Then ' Check if column is preset or not
MsgBox lookToFind(iCtr) & " Column Found" ' Pop-up msg if column is exist
Else
MsgBox lookToFind(iCtr) & " Column Not Found" ' Pop-up msg if column is Not Found
End If
Next
End With
End Sub
Use Application.WorksheetFunction.Match to find the column number of the header you are looking for, then run your checks on that column.
Here is an example:
Option Explicit
Public Sub ValidateData()
Dim ws As Worksheet
Set ws = ThisWorkbook.Worksheets("Sheet")
Dim ColumnNames() As Variant
ColumnNames = Array("Filename", "FileNumber", "Author") 'add all Column header that you want to check
Dim Headers As Variant 'read all headers into an array
Headers = ws.Range("A1", ws.Cells(1, ws.Columns.Count).End(xlToLeft)).Value
Dim HeaderColumn As Long 'this is the column number where the header was found
Dim ColName As Variant
For Each ColName In ColumnNames 'loop through your list of names
HeaderColumn = 0 'initialize
On Error Resume Next 'next line throws error if it does not match
HeaderColumn = Application.WorksheetFunction.Match(ColName, Headers, 0)
On Error GoTo 0 're-activate error reporting
If HeaderColumn <> 0 Then
'header name was found
MsgBox ColName & " Column found"
'perform different checks on each column
Select Case ColName
Case "FileNumber"
CheckFileNumberColumn ws.Range(ws.Cells(2, HeaderColumn), ws.Cells(ws.Rows.Count, HeaderColumn).End(xlUp))
'Case "Author" 'add other cases as needed
'CheckAuthorColumn ws.Range(ws.Cells(2, HeaderColumn), ws.Cells(ws.Rows.Count, HeaderColumn).End(xlUp))
End Select
Else
'header name was not found
MsgBox ColName & " Column not found"
End If
Next ColName
End Sub
'this is the procedure to check the FileNumber column
Private Sub CheckFileNumberColumn(DataToValidate As Range)
Dim iRow As Long
For iRow = 1 To DataToValidate.Rows.Count
If DataToValidate.Cells(iRow, 1).Value <> 101 Then
DataToValidate.Cells(iRow, 1).Interior.Color = RGB(255, 0, 0)
End If
Next iRow
End Sub
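As an alternative to wrapping WorksheetFunction.Match in On Error Resume Next, Application.Match returns an error value instead of raising a run-time error, so the result can be tested with IsError. The sketch below assumes the headers are in row 1; the function and demo procedure names are hypothetical.

```vba
Option Explicit

' Returns the column number of headerName in row 1, or 0 when it is absent.
Public Function FindHeaderColumn(ByVal ws As Worksheet, _
                                 ByVal headerName As String) As Long
    Dim result As Variant
    ' Application.Match yields an error Variant on no match rather than raising.
    result = Application.Match(headerName, ws.Rows(1), 0)
    If Not IsError(result) Then FindHeaderColumn = CLng(result)
End Function

Public Sub DemoFindHeader()
    Dim col As Long
    col = FindHeaderColumn(ThisWorkbook.Worksheets("Sheet"), "FileNumber")
    If col = 0 Then
        MsgBox "FileNumber column not found"
    Else
        MsgBox "FileNumber is in column " & col
    End If
End Sub
```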

Excel VBA Columns B and C OR Column D required

I have a spreadsheet which is used enterprise-wide. I am trying to put in checks such that certain fields are required. Specifically, columns B (last name) AND C (first name) are required, OR Column D (Organization) is required. However, B, C, and D cannot all three be filled in. If the row has any data at all in it, B and C or D are required.
My idea is to put in a button to run this macro. That I can do.
I've tried many things at this point. I can include the spreadsheet in case anyone can offer any insight. I had a macro that worked on a test sheet, but does not work on this sheet, if that will help at all.
Here is the macro
Sub CheckVal2()
Dim ws As Worksheet
Dim wsCurr As Worksheet
Dim cel As Range
Dim lngLastRow As Long
Dim lngRow As Long
For Each ws In Worksheets
If Left$(ws.Name, 7) = "Current" Then
Set wsCurr = ws
Exit For
End If
Next
With wsCurr
lngLastRow = .Range("B5000").End(xlUp).Row
For lngRow = 2 To lngLastRow
For Each cel In .Range("B" & lngRow & ":E" & lngRow)
If cel = "" Then
MsgBox "First and Last Name or HCO must be populated."
Cancel = True
Exit Sub
End If
If cel <> "" Then
If .Cells(lngRow, "D") = "" Then
If .Cells(lngRow, "B") = "" Or _
.Cells(lngRow, "C") = "" Then
MsgBox "First and Last Name or HCO must be populated."
Cancel = True
Exit Sub
End If
End If
End If
Next
Next
End With
'
End Sub
Once you get past whatever is causing the error trying to access wsCurr (which I suspect is just a case of the worksheet not existing), you should modify your code as follows:
With wsCurr
lngLastRow = .Range("E5000").End(xlUp).Row
For lngRow = 2 To lngLastRow
'First check whether first/last name has been consistently advised
If (.Cells(lngRow, 2) = "") <> _
(.Cells(lngRow, 3) = "") Then
MsgBox "Row " & lngRow & " - First Name and Last Name must both be advised or both be blank"
Cancel = True ' I assume this is a global variable?
Exit Sub
End If
'Now check that last name has not been advised if HCO has been, and vice-versa
If (.Cells(lngRow, 2) = "") = _
(.Cells(lngRow, 4) = "") Then
MsgBox "Row " & lngRow & " - First and Last Name, or HCO, must be populated but not both."
Cancel = True
Exit Sub
End If
Next
End With
This will get around the existing problem with your tests, which (as far as I can tell) aren't allowing for the case where all three columns have been advised.
I also changed the column used to set lngLastRow: if it were based on column B and the last row(s) of your data contained values only in column C and/or D, those final rows would not be tested.
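The rule itself (B and C together, or D alone, but never all three) can also be written as one Boolean function, which may make the loop easier to read. This is only a sketch: the function name is hypothetical, and treating whitespace-only cells as blank is an assumption that differs slightly from the = "" tests above.

```vba
Option Explicit

' Returns True when exactly one of these holds:
'   - last name and first name are both filled and organization is blank
'   - both names are blank and organization is filled
Public Function RowIsValid(ByVal lastName As String, _
                           ByVal firstName As String, _
                           ByVal organization As String) As Boolean
    Dim hasLast As Boolean, hasFirst As Boolean, hasOrg As Boolean
    hasLast = Len(Trim$(lastName)) > 0
    hasFirst = Len(Trim$(firstName)) > 0
    hasOrg = Len(Trim$(organization)) > 0

    If hasOrg Then
        RowIsValid = Not hasLast And Not hasFirst
    Else
        RowIsValid = hasLast And hasFirst
    End If
End Function
```

Inside the existing loop it could be called as RowIsValid(CStr(.Cells(lngRow, "B").Value), CStr(.Cells(lngRow, "C").Value), CStr(.Cells(lngRow, "D").Value)). A completely empty row returns False, so the caller has to decide whether such rows should be skipped first.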

Excel VBA logic: get range between two cells using loops

Forgive me, as this may be very simple. I am trying to create a VBA macro that quickly gets statistics from raw data and puts them in a table. The raw data comes in this format:
(They will not always be in groups of three)
How would I get the range for all of a category, and then use that same range for Columns B and C to get the statistics I need?
The code below gets the row numbers of each category; it assumes there is no break in the content of column B. Your question was to get the content of columns C:D by category, and having these row values will let you write the code that reads C:D.
Public Sub Sample()
Dim WkSht As Worksheet
Dim StrCategory As String
Dim LngRow As Long
Dim LngRowStart As Long
Set WkSht = ThisWorkbook.Worksheets("RawData")
'Take note of the category we are on
StrCategory = WkSht.Range("A" & 2).Value
'Take note of the row the category started on
LngRowStart = 2
'Look to the next row
LngRow = 3
'Loop through the data until column B has no value, signifying the end of the dataset
Do Until WkSht.Range("B" & LngRow) = ""
'Go to the next row until we are given a new category or make it to the end of the dataset
Do Until (WkSht.Range("A" & LngRow) <> "") Or (WkSht.Range("B" & LngRow) = "")
LngRow = LngRow + 1
Loop
'Print to the Immediate window
Debug.Print StrCategory & " is on rows " & LngRowStart & " to " & LngRow - 1
'Get the next values
StrCategory = WkSht.Range("A" & LngRow)
LngRowStart = LngRow
'Move on
LngRow = LngRow + 1
Loop
Set WkSht = Nothing
End Sub
Below is the input data I gave it: -
Below is the output from the code: -
You could use some If statements and pull this all into an array, but it seems more direct to just fill in the blanks:
Sub FillColA()
Dim LastRow As Long
LastRow = Application.WorksheetFunction.CountA(Range("B:B"))
Range("A2:A" & LastRow).SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=R[-1]C"
End Sub
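A slightly more defensive variant of the same idea is sketched below. It assumes the sheet is named "RawData" (as in the earlier answer) and that column B has no gaps; the procedure name is hypothetical. It guards against the case where column A has no blanks, and converts the fill-down formulas to plain values afterwards so the categories survive later sorting.

```vba
Option Explicit

Sub FillCategoriesAsValues()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("RawData")   ' assumed sheet name

    Dim lastRow As Long
    lastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row
    ' With a single-cell range SpecialCells searches the whole used range,
    ' so require at least two data rows.
    If lastRow < 3 Then Exit Sub

    Dim blanks As Range
    On Error Resume Next    ' SpecialCells raises an error when nothing matches
    Set blanks = ws.Range("A2:A" & lastRow).SpecialCells(xlCellTypeBlanks)
    On Error GoTo 0
    If blanks Is Nothing Then Exit Sub

    ' Each blank cell takes the value of the cell directly above it.
    blanks.FormulaR1C1 = "=R[-1]C"

    ' Replace the formulas with their results.
    With ws.Range("A2:A" & lastRow)
        .Value = .Value
    End With
End Sub
```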

Resources