Comparing Data from Column A and Column B Once - excel

I am currently in the process of analysing data from Excel, and would like to make comparisons between data in Column A and Column B, identifying duplicate data. I am using the following code:
Sub Find_Matches()
Dim CompareRange As Variant, x As Variant, y As Variant
Set CompareRange = Range("c2", Range("c2").End(xlUp))
For Each x In Selection
For Each y In CompareRange
If x = y Then x.Offset(0, 1) = x
Next y
Next x
End Sub
This code was taken from MSDN. When it finds a value in Column C that matches one in Column A, it writes the matched number to Column B. For the most part it does what I need, but I would like to modify it so that each number in the list is matched only once.
Example of what the code currently does:
A2 B2 C2
1 1 1
1 1 2
1 1 3
So essentially, because the number 1 appears once in Column C, Column A keeps finding a match.
What I would like it to do is:
A2  B2  C2
1   1   1
1       2
1       3
So because the number 1 only appears in Column C once, it should only be matched once against the numbers in Column A.
I'm assuming this is probably something simple, but I can't seem to determine the logic. Could someone point me in the right direction please?

Testing for duplicates can be simple or complicated depending on how fast you want your procedure to be and how large the data sets are.
I personally favour the Collection object because its keys are unique and testing for the existence of a key is very fast, especially if the dataset is large. The uniqueness test works by checking whether the code throws an error when you ask the Collection for a particular key. Some are philosophically opposed to testing for errors. I have to say that I'm one of them, so I actually prefer the Dictionary object, but for a task this mundane I won't go through the steps to reference it.
You'll also see that the code below works with arrays rather than cells on the worksheet itself. That is partly personal taste, but it is also quicker.
Sub MatchEachValueOnce()
    'Sub name is illustrative; adjust the constants and sheet name to suit
    Const SOURCE_COL As String = "A"
    Const SOURCE_START_ROW As Long = 2
    Const COMPARE_COL As String = "C"
    Const COMPARE_START_ROW As Long = 2
    Const OUTPUT_COL As String = "B"
    Dim ws As Worksheet
    Dim sourceValues As Variant
    Dim compareValues As Variant
    Dim outputValues() As Variant
    Dim sourceIndex As Long
    Dim compareIndex As Long
    Dim uniques As Collection
    Dim val As Variant
    Dim key As String
    Dim exists As Variant
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    'Read both columns into arrays (assumes each column holds at least two values,
    'so that Value2 returns a 2-D array rather than a single value)
    sourceValues = ws.Range(ws.Cells(SOURCE_START_ROW, SOURCE_COL), _
                            ws.Cells(ws.Rows.Count, SOURCE_COL).End(xlUp)).Value2
    compareValues = ws.Range(ws.Cells(COMPARE_START_ROW, COMPARE_COL), _
                             ws.Cells(ws.Rows.Count, COMPARE_COL).End(xlUp)).Value2
    Set uniques = New Collection
    ReDim outputValues(1 To UBound(sourceValues, 1), 1 To 1)
    For sourceIndex = 1 To UBound(sourceValues, 1)
        val = sourceValues(sourceIndex, 1)
        key = CStr(val)
        'Ask the Collection for the key; an error means it has not been matched yet
        exists = Empty
        On Error Resume Next
        exists = uniques(key)
        On Error GoTo 0
        If IsEmpty(exists) Then
            'Not matched before: look for the value in the compare column
            For compareIndex = 1 To UBound(compareValues, 1)
                If val = compareValues(compareIndex, 1) Then
                    outputValues(sourceIndex, 1) = val
                    uniques.Add val, key 'remember it so later rows are skipped
                    Exit For
                End If
            Next
        End If
    Next
    'Write all results to the output column in one go
    ws.Cells(SOURCE_START_ROW, OUTPUT_COL).Resize(UBound(outputValues, 1)).Value = outputValues
End Sub
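For anyone who would rather avoid the error-trapping test, here is a rough sketch of the same idea using the Dictionary object mentioned above. It is late bound via CreateObject, so no reference needs to be set, although Scripting.Dictionary is only available on Windows. The function name MatchOnce is made up for illustration. It expects the same 1-based, single-column arrays that Value2 returns. Unlike the loop above, it compares values by their string form, so the text "1" and the number 1 count as a match.

```vba
'Sketch only: MatchOnce is a hypothetical helper, not part of the code above.
'Returns a 1-column array the same height as sourceValues, holding each value
'the first time it appears in sourceValues, provided it also exists in compareValues.
Function MatchOnce(sourceValues As Variant, compareValues As Variant) As Variant
    Dim compareSet As Object
    Dim seen As Object
    Dim result() As Variant
    Dim i As Long
    Dim key As String

    Set compareSet = CreateObject("Scripting.Dictionary")
    Set seen = CreateObject("Scripting.Dictionary")

    'Index the compare column once so each lookup is a key test, not a loop
    For i = 1 To UBound(compareValues, 1)
        compareSet(CStr(compareValues(i, 1))) = True
    Next i

    ReDim result(1 To UBound(sourceValues, 1), 1 To 1)
    For i = 1 To UBound(sourceValues, 1)
        key = CStr(sourceValues(i, 1))
        If compareSet.Exists(key) And Not seen.Exists(key) Then
            result(i, 1) = sourceValues(i, 1)
            seen(key) = True 'later occurrences of this value are skipped
        End If
    Next i

    MatchOnce = result
End Function
```

It would be called with the two arrays read from the sheet, and the returned array written to the output column exactly as in the last line of the code above.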

Related

Assign a variable to cells to compare multiple numbers

I have a data set where I need to compare the first number in each transect against each other.
For example, in the below data set I need to compare cells D2, D7, D12 and D17 and assign a value based on which one is the smallest number, the next smallest and so on. This will be used to assign the transect numbers in column A.
My issue is that the number of sections (in this example 4) and the number of transects (also 4 in this example) will vary. So the cells I need to compare will change.
I have written the code that calculates the number of transects, which is:
Dim tlength As Worksheet
Dim tb As Long 'tb=transect break
Sub tlength_start_stop_coords()
Dim ws As Worksheet
Set ws = ThisWorkbook.Worksheets("sheet1")
Set tlength = ThisWorkbook.Worksheets("transect lengths") 'assigns the sheet to a variable for efficient coding
tb = 0 'counter to calculate the number of transects
j = 2 'counter for row
Lastrow = Lastrow + 1 'add a row to last row so that the last row includes a blank line.
'the following for loop is used to calculate the number of transects
For i = 2 To Lastrow
If tlength.Range("D" & i) = vbNullString Then
If tlength.Range("D" & i - 1) <> vbNullString Then
tb = tb + 1 'updates the counter
End If
End If
Next i
tbtotal = tb 'stores the total counter in variable tbtotal for later use
I think I may need to use a loop. But I am stuck trying to figure out how to manage the unknown number of comparisons in changing cell locations.
The desired result is in the below screenshot of the expected outcome, with results in column A. To begin with, I only need to get the number for the first row of each transect. Once I have that, I can copy using xldown.
Expected outcome: [screenshot]
Another possible data set might be: [screenshot]
with an expected outcome of: [screenshot]
Worked for me using your second set of sample data:
Sub Tester()
Dim tlength As Worksheet, i As Long, tNum As Long, v, vPrev, arr
Dim col As New Collection, e, c As Range, rng As Range
Set tlength = ThisWorkbook.Worksheets("transect lengths")
'collect all the Section 1 Latitudes and row numbers
For i = 2 To tlength.Cells(Rows.Count, "B").End(xlUp).Row
If tlength.Cells(i, "B") = 1 Then
col.Add Array(i, tlength.Cells(i, "D").Value) 'store start row and first Latitude
End If
Next i
SortCollection col, 2 'sort collection by second element in each array
tNum = 0
'loop over the sorted collection and assign the order of the transects
For Each e In col
tNum = tNum + 1
Set c = tlength.Cells(e(0), "B")
'following assumes all transects have at least 2 sections...
tlength.Range(c, c.End(xlDown)).Offset(0, -1).Value = tNum
Next e
End Sub
'Sort a collection of 0-based arrays by the n'th element of each array.
'As written (<) the result is in descending order; change to > for ascending.
'Adapted from https://stackoverflow.com/a/3588073/478884
Sub SortCollection(col As Collection, n As Long)
Dim i As Long, j As Long, vTemp As Variant
For i = 1 To col.Count - 1 'Two loops to bubble sort
For j = i + 1 To col.Count
If col(i)(n - 1) < col(j)(n - 1) Then 'change to > for ascending sort
vTemp = col(j) 'store the greater item
col.Remove j 'remove the greater item
col.Add Item:=vTemp, before:=i 're-add the greater item before the lesser item
End If
Next j
Next i
End Sub
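To check which way the sort runs before relying on it, a small test like the following can be run from the Immediate window. The sub name and the sample numbers are made up for illustration. It assumes SortCollection is in the same module, with the comparison left as `<`.

```vba
'Illustrative check only: the row numbers and latitudes are invented sample values
Sub DemoSortCollection()
    Dim col As New Collection
    Dim e As Variant

    'Each item is Array(startRow, latitude), as built in Tester
    col.Add Array(2, 5.1)
    col.Add Array(7, 3.4)
    col.Add Array(12, 9.8)

    SortCollection col, 2 'order by the second element (latitude)

    For Each e In col
        Debug.Print e(0), e(1)
    Next e
End Sub
```

With `<` in the comparison the rows print as 12, 2, 7, meaning the largest latitude comes first. Flipping the operator to `>` reverses the order.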

How to count the duplicate value in output in new worksheet using VBA Module

I need to count duplicate values from the source worksheet "Rawdata" and write the results to a new worksheet using a VBA module. That is, the number of times each value is repeated in a particular column, shown next to its ID on a different sheet.
I have done this with a Pivot Table, but I would like to do it with a VBA module on a button click event.
Original Source Page
Expected Output worksheet as below:
I have tried the code below, updating it with the correct sheet name, but I am getting a type mismatch error:
I am using the code below, but I had to manually copy the whole column of data from column "Y" (25) to column "A" (1) by inserting a new column.
modified worksheet
Option Explicit
Sub CountDuplicates()
Dim a As Variant, b As Variant, e As Variant
Dim d As Object
Dim i As Long
Set d = CreateObject("Scripting.Dictionary")
d.CompareMode = 1
b = Sheets("Rawdata").Range("A3").CurrentRegion.Resize(, 1).Value
ReDim Preserve b(1 To UBound(b), 1 To 2)
a = Sheets("Rawdata").UsedRange.Value
For Each e In a
If Len(e) > 0 Then
d(e) = d(e) + 1
End If
Next e
For i = 1 To UBound(b)
b(i, 2) = d(b(i, 1))
Next i
Sheets("DuplicateCount").Range("A1:B1").Resize(UBound(b)).Value = b
End Sub
Now my output worksheet has repeated counts (not distinct). I mean the ID column is not unique: the numeric values are duplicated in the result, each with its count, even though I changed the format of the source column to text.
This is how I accomplished it. Change the sheet names to match yours.
Option Explicit
Sub CountDuplicates()
Dim a As Variant, b As Variant, e As Variant
Dim d As Object
Dim i As Long
Set d = CreateObject("Scripting.Dictionary")
d.CompareMode = 1
b = Sheets("Rawdata").Range("Y3:Y10000").Value
ReDim Preserve b(1 To UBound(b), 1 To 2)
a = Sheets("Rawdata").UsedRange.Value
For Each e In a
If Len(e) > 0 Then
d(e) = d(e) + 1
End If
Next e
For i = 1 To UBound(b)
b(i, 2) = d(b(i, 1))
Next i
Sheets("DuplicateCount").Range("A1:B1").Resize(UBound(b)).Value = b
End Sub
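If the goal is one row per distinct ID, one option is to write out the dictionary's own keys and counts rather than the original column. The following is a rough sketch under some assumptions. The sub name CountDistinctIDs is made up, the IDs are taken to be in Rawdata!Y3:Y10000 as in the code above, and occurrences are counted within that column only rather than across the whole used range.

```vba
'Sketch only: assumes the IDs are in Rawdata!Y3:Y10000 and the results go to
'DuplicateCount!A1. Counts occurrences within that one column.
Sub CountDistinctIDs()
    Dim data As Variant, out() As Variant
    Dim d As Object
    Dim k As Variant
    Dim key As String
    Dim i As Long

    Set d = CreateObject("Scripting.Dictionary")
    d.CompareMode = 1 'text compare, as in the code above

    data = Sheets("Rawdata").Range("Y3:Y10000").Value
    For i = 1 To UBound(data, 1)
        If Not IsError(data(i, 1)) Then 'skip cells holding #N/A and the like
            key = Trim$(CStr(data(i, 1)))
            If Len(key) > 0 Then d(key) = d(key) + 1
        End If
    Next i
    If d.Count = 0 Then Exit Sub

    'One output row per distinct ID: column 1 = ID, column 2 = count
    ReDim out(1 To d.Count, 1 To 2)
    i = 0
    For Each k In d.Keys
        i = i + 1
        out(i, 1) = k
        out(i, 2) = d(k)
    Next k
    Sheets("DuplicateCount").Range("A1").Resize(d.Count, 2).Value = out
End Sub
```

Because every ID is converted to a trimmed string before it is used as a key, a number stored as text and the same number stored as a number land in the same bucket.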

Indexing by ROW on a Column and finding partial matches in a string

I am having issues developing a for loop that incorporates a partial match feature.
To breakdown the problem:
I have two lists that are going to be compared: one is in Sheet1 column B and the other in Sheet2 column C.
The for loop will iterate through column B on Sheet1 and extract the current string at each row. This string is then compared (I tried VLookup) against the ENTIRE column C on Sheet2 to find a match. If there is a match, it will return the value from the column ADJACENT to the right of column C and deposit it in the column ADJACENT to the right of column B.
I have currently tried implementing a For loop with an If statement that iterates through column B and, if the current string of column B is equal to a VLookup of column C for that string, returns the value.
Sub JoinGroupOnPN()
Dim PartGroupSheet As Worksheet
Dim OEEPartSheet As Worksheet
Dim OEERowRange As Long
Dim OEEColumnRange As Long
Dim PGRowRange As Long
Dim PGColumnRange As Long
Dim OEEArray As Variant
Dim PGArray As Variant
Dim i As Long, j As Long
Set PartGroupSheet = ActiveWorkbook.Worksheets("PartGroup")
Set OEEPartSheet = ActiveWorkbook.Worksheets("OEE Report")
OEERowRange = OEEPartSheet.Cells(Rows.Count, 1).End(xlUp).Row
OEEColumnRange = OEEPartSheet.Cells(1,Columns.Count).End(xlToLeft).Row
PGRowRange = PartGroupSheet.Cells(Columns.Count, 1).End(xlUp).Row
PGColumnRange = PartGroupSheet.Cells(1,Columns.Count).End(xlToLeft).Row
ReDim OEEArray(OEERowRange, OEEColumnRange)
ReDim PGArray(PGRowRange, PGColumnRange)
Dim StringToMatch As String
Dim MatchingString As String
For i = 2 To OEERowRange
StringToMatch = OEEPartSheet.Cells(i, 1).Text
MatchingString = Application.WorksheetFunction.VLookup(Arg1:=StringToMatch, Arg2:=PartGroupSheet.Range(Cell1:=2, Cell2:=1), Arg3:=2, Arg4:=True)
For j = 2 To PGRowRange
If StringToMatch = MatchingString Then
Debug.Print StringToMatch
End If
Next j
Next i
End Sub
I keep getting an error that says the Range method failed. I have tried converting the argument to a Range type, but I still get the same error.
The error happens at
MatchingString = Application.WorksheetFunction.VLookup(Arg1:=StringToMatch, Arg2:=PartGroupSheet.Range(Cell1:=2, Cell2:=1), Arg3:=2, Arg4:=True)
and the error message is: Method 'Range' of object '_Worksheet' failed
I can't post any pictures yet
Sheet 1
Sheet 2
Any help would be appreciated thanks !!
Something like this should work:
Sub JoinGroupOnPN()
Dim PartGroupSheet As Worksheet
Dim v, c As Range, rngSrch As Range
Set PartGroupSheet = ActiveWorkbook.Worksheets("PartGroup")
With ActiveWorkbook.Worksheets("OEE Report")
Set rngSrch = .Range(.Range("B2"), .Cells(.Rows.Count, "B").End(xlUp))
End With
For Each c In rngSrch.Cells
If Len(c.Value) > 0 Then
'do not use WorksheetFunction here
v = Application.VLookup(c.Value, PartGroupSheet.Range("B:C"), 2, False)
c.Offset(0, 1).Value = IIf(IsError(v), "No match", v)
End If
Next c
End Sub
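The comment about not using WorksheetFunction comes down to how the two forms report a failed lookup. Here is a small sketch of the difference. The sub name is made up, and it assumes the search string "no-such-part" does not appear in column B of the PartGroup sheet.

```vba
'Illustration only: assumes "no-such-part" is absent from PartGroup column B
Sub CompareVLookupForms()
    Dim v As Variant
    Dim tbl As Range

    Set tbl = ActiveWorkbook.Worksheets("PartGroup").Range("B:C")

    'Application.VLookup hands back an error value that can be tested with IsError
    v = Application.VLookup("no-such-part", tbl, 2, False)
    If IsError(v) Then Debug.Print "Application.VLookup: no match, IsError is True"

    'WorksheetFunction.VLookup raises a run-time error instead, so it has to be trapped
    On Error Resume Next
    v = Application.WorksheetFunction.VLookup("no-such-part", tbl, 2, False)
    If Err.Number <> 0 Then
        Debug.Print "WorksheetFunction.VLookup raised error " & Err.Number
    End If
    On Error GoTo 0
End Sub
```

Storing the result in a Variant is what makes the IsError test possible. A String variable could not hold the error value.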

Trying to create a key for spatially related records using a VBA loop

So what I have on my hands is an Excel file with several hundred thousand rows, created by a programmer with no concept of analysis or relational databases. This file essentially contains three separate tables (A, B, and C), each with its own set of fields, all mashed into one gargantuan worksheet.
Rather than assigning a key, all records for each customer are grouped together by position, and the table each row belongs to is tagged in the "Table" column as either A, B, or C. Each record will basically look like :
Table Field 1 Field 2 .......
A X X
B X X
B X X
B X X
C X X
There is always a max of one A and one C per record. There is not a row for A for every record, but there is always a C. A new record always begins after each C row. VB might not be the best choice for the task but I just want to get an interim solution for the time being. This has been my attempt at the code so far (table indicator is in column B, I want to write ID to column O) :
Sub CreateID()
Dim rng As Range
Dim cell As Range
Dim lastcol As Range
Dim firstcol As Range
Dim ID As Integer
ID = 1
Set rng = Range("B4:O100")
Set firstcol = Range("B4:B100")
Set lastcol = Range("O4:O100")
For Each cell In rng
For Each c In firstcol
With lastcol
lastcol.Value = ID
End With
If c.Value = "C" Then
ID = ID + 1
End If
Next c
Next cell
End Sub
As you can tell if you know VBA better than I do, this sets the value of every cell in column O to ID, and then changes the value of every cell again each time ID increments by one. Additionally, the loop does not seem to end (I pressed Ctrl+Break after ID shot up over 1000).
Back to answer my own question as promised. I am very new to VBA so I'm sure there's a more elegant / faster running way to do this and would still be interested to see what that is.
Sub CreateID()
    Dim rng As Range
    Dim c As Range
    Dim firstcol As Range
    Dim ID As Long 'Long rather than Integer, which overflows past 32,767
    Dim counter As Long
    ID = 1
    counter = 1
    Set rng = Range("B4:O100")
    Set firstcol = Range("B4:B100")
    For Each c In firstcol
        rng(counter, 14) = ID 'the 14th column of B:O is column O
        counter = counter + 1
        If c = "C" Then 'a C row closes the current record
            ID = ID + 1
        End If
    Next c
End Sub
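On the question of a faster way: with several hundred thousand rows, writing to cells one at a time is usually the slow part. Here is a rough sketch of the same logic done on arrays. The sub name is made up, and it assumes the table tags start in B4 on the active sheet and that the IDs go in column O.

```vba
'Sketch only: assumes tags start at B4 on the active sheet and IDs go in column O
Sub CreateIDFromArray()
    Dim ws As Worksheet
    Dim tags As Variant
    Dim ids() As Variant
    Dim lastRow As Long, i As Long, id As Long

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row
    If lastRow < 5 Then Exit Sub 'need at least two rows so .Value returns an array

    'Read column B once, build the IDs in memory, then write column O once
    tags = ws.Range("B4:B" & lastRow).Value
    ReDim ids(1 To UBound(tags, 1), 1 To 1)

    id = 1
    For i = 1 To UBound(tags, 1)
        ids(i, 1) = id
        If tags(i, 1) = "C" Then id = id + 1 'a C row closes the current record
    Next i

    ws.Range("O4").Resize(UBound(ids, 1), 1).Value = ids
End Sub
```

The comparison with "C" will fail with a type mismatch if column B contains error values, so those would need to be filtered out first.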

Copy unique list based on criteria in another column vba excel

I want to use VBA to extract a unique, ordered list, subject to conditions in another column. So, I have two columns A, B.
A B
========
a FALSE
b FALSE
c TRUE
a TRUE
b FALSE
c TRUE
Should result in a list
C
==
a
c
I'm very, very new to VBA, so any help would be appreciated.
Oh, and the second list will be updated with every change to the first, so needs to be wiped to ensure there are no leftovers, if, for example, the second "a" is set to FALSE.
Here's a formula-only approach. Whether it's practical depends on your circumstances, but I have tested it with a sample of data similar to the one in the original question:
Insert a blank row at the top of the spreadsheet to serve as a header row.
Create a new formula column that will list the elements of column "A" only if column "B" is true. For example, place the following formula in cell D2, then copy it down: =IF(B2,A2,"").
Now you can apply the technique described in the second page linked by t.thielemans above.
One potential disadvantage of this approach is that the blank cells returned by the formula when column B is "FALSE" don't disappear--you'll still have a blank result in your filtered view.
I'll copy the reference here for convenience:
Getting unique values in Excel by using formulas only
What do you think of this?
Add a reference to the Microsoft Scripting Runtime library first (Tools > References in the VBA editor).
Option Explicit
Sub Test()
Dim oRange As Range
Dim dict As Dictionary
Dim vArray As Variant
Dim vItem As Variant
Dim sKey As String
Dim sValue As String
Dim iCompare_TRUE As Integer
Dim lCnt As Long
Dim lCnt_Rows As Long
Set dict = New Dictionary
Set oRange = ThisWorkbook.Sheets(1).Range("A1:B6")
For lCnt = 1 To oRange.Rows.Count
sKey = oRange(lCnt, 1)
sValue = oRange(lCnt, 2)
iCompare_TRUE = StrComp(sValue, "True")
If Not dict.exists(sKey) And iCompare_TRUE = 0 Then
With dict
.Add sKey, sValue
End With
End If
Next lCnt
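'Wipe any previous output so stale entries do not linger
ThisWorkbook.Sheets(1).Columns(3).ClearContents
If dict.Count = 0 Then Exit Sub 'nothing to write
vArray = dict.Keys 'zero-based array of the unique keys
lCnt_Rows = UBound(vArray) + 1
With ThisWorkbook.Sheets(1)
Set oRange = .Range(.Cells(1, 3), .Cells(lCnt_Rows, 3))
End With
oRange.Value = Application.Transpose(vArray)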
End Sub

Resources