Extract Unique Values From Multiple Columns With Macro - excel

I have a list of codes in A and the image links in B and C.
What i want to do is remove the duplicates and arrange the unique links in a single column and give them a series name with incrementing no code_1 before image link 1 and code_2 before link 2 as shown in the picture.
I am trying this code to delete the duplicates but clueless about how to put the name before the link.
Sub tgr()
Dim wb As Workbook
Dim wsDest As Worksheet
Dim rData As Range
Dim rArea As Range
Dim aData As Variant
Dim i As Long, j As Long
Dim hUnq As Object
'Prompt to select range. Uniques will be extracted from the range selected.
'Can select a non-contiguous range by holding CTRL
On Error Resume Next
Set rData = Application.InputBox("Select range of names where unique names will be extracted:", "Data Selection", Selection.Address, Type:=8)
On Error GoTo 0
If rData Is Nothing Then Exit Sub 'Pressed cancel
Set hUnq = CreateObject("Scripting.Dictionary")
For Each rArea In rData.Areas
If rArea.Cells.Count = 1 Then
ReDim aData(1 To 1, 1 To 1)
aData(1, 1) = rArea.Value
Else
aData = rArea.Value
End If
For i = 1 To UBound(aData, 1)
For j = 1 To UBound(aData, 2)
If Not hUnq.Exists(aData(i, j)) And Len(Trim(aData(i, j))) > 0 Then hUnq(Trim(aData(i, j))) = Trim(aData(i, j))
Next j
Next i
Next rArea
Set wb = rData.Parent.Parent 'First parent is the range's worksheet, second parent is the worksheet's workbook
Set wsDest = wb.Sheets.Add(After:=wb.Sheets(wb.Sheets.Count))
wsDest.Range("A1").Resize(hUnq.Count).Value = Application.Transpose(hUnq.Items)
End Sub

This custom VBA function would create the desired result of getting the SKU code. I broke it up to show how to get each position.
Function Drop_Bucks(inputText As String) As String
Dim beginSpot As Long, endSpot As Long
'Finds last /
beginSpot = InStrRev(inputText, "/", -1, vbTextCompare) + 1
'Finds jpg
endSpot = InStrRev(inputText, ".jpg", -1, vbTextCompare)
Drop_Bucks = Replace(Mid(inputText, beginSpot, endSpot - beginSpot), "-", "_")
End Function
As a followup, you could also create the sku without VBA. If you put this formula in cell c4 with a sku in d4. It should do without macro.
=SUBSTITUTE(SUBSTITUTE(LEFT(SUBSTITUTE(SUBSTITUTE(RIGHT(SUBSTITUTE(d4, "/",REPT("?", 999)), 999),"?",""), ".jpg",REPT("?", 999)), 999),"?",""),"-","_")

This may helps you:
Option Explicit
Sub TEST()
Dim LastRow As Long, i As Long, LastRow2 As Long
Dim arr As Variant
With ThisWorkbook.Worksheets("Sheet1")
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
.Range("$A$2:$C$" & LastRow).RemoveDuplicates Columns:=Array(1, 2, 3), Header:=xlNo
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
arr = .Range("A2:C" & LastRow)
For i = LBound(arr) To UBound(arr)
LastRow2 = .Cells(.Rows.Count, "E").End(xlUp).Row
.Range("E" & LastRow2 + 1).Value = arr(i, 1) & "_1"
.Range("F" & LastRow2 + 1).Value = arr(i, 2)
Next i
For i = LBound(arr) To UBound(arr)
LastRow2 = .Cells(.Rows.Count, "E").End(xlUp).Row
.Range("E" & LastRow2 + 1).Value = arr(i, 1) & "_2"
.Range("F" & LastRow2 + 1).Value = arr(i, 3)
Next i
End With
End Sub

This builds a list of all the item duplicates and all. It will then uses the function Range.RemoveDuplicates to remove duplicates of the SKU Code in combination with the URL within the range.
Option Explicit
Sub Test()
Dim oCurSourceSheet As Worksheet
Set oCurSourceSheet = Sheet1 ' What sheet is your Source Data on?
Dim oSourceRow As Long ' Which Row/Column does your data start on?
oSourceRow = 2 ' First Row of First "Link"
Dim oSourceCol As Long
oSourceCol = 2 ' First Column of First "Link"
Dim oOutputRange As Range
Set oOutputRange = Sheet1.Range("A10") ' What Sheet/Cell do you want the output to start on/in?
Dim oCurRow As Long ' Row counter for Output
oCurRow = 1
Dim oCurSourceRow As Long
Dim oCurSourceCol As Long
For oCurSourceRow = oSourceRow To oCurSourceSheet.UsedRange.Rows.Count
For oCurSourceCol = oSourceCol To oCurSourceSheet.UsedRange.Columns.Count
oOutputRange.Cells(oCurRow, 1) = oCurSourceSheet.Cells(oCurSourceRow, 1) & "_" & oCurSourceCol - 1
oOutputRange.Cells(oCurRow, 2) = oCurSourceSheet.Cells(oCurSourceRow, oCurSourceCol)
oCurRow = oCurRow + 1
Next
Next
'Reize range from output's starting cell & remove duplicates
Set oOutputRange = oOutputRange.Resize(oCurRow - 1, 2)
oOutputRange.RemoveDuplicates Columns:=Array(1, 2)
End Sub

Try this, please: I adapted your code. The Dictionary is used just like a tool for avoiding duplicate values (due to the fact it exists...). Everything works in memory and should be very fast:
Option Base 1
Sub tgr_bis()
Dim wb As Workbook, rData As Range, wsDest As Worksheet, rArea As Range
Dim aData As Variant, aDataSorted() As String
Dim i As Long, hUnq As Scripting.Dictionary, nrColumns As Long
On Error Resume Next
Set rData = Application.InputBox("Select range of names where unique names will be extracted:", "Data Selection", Selection.Address, Type:=8)
On Error GoTo 0
If rData Is Nothing Then Exit Sub 'Pressed cancel
'Debug.Print rData.Columns.Count: Stop
If rData.Columns.Count > 6 Then MsgBox "More then 6 columns..." & vbCrLf & _
"Please select only six columns and run the procedure again", vbInformation, _
"Too many columns": Exit Sub
nrColumns = rData.Columns.Count
Set hUnq = CreateObject("Scripting.Dictionary")
For Each rArea In rData.Areas
If rArea.Cells.Count = 1 Then
ReDim aData(1 To 1, 1 To 1)
aData(1, 1) = rArea.value
Else
aData = rArea.value
End If
ReDim aDataSorted(nrColumns, 1)
Dim k As Long
k = 1
For i = 1 To UBound(aData, 1)
If Not hUnq.Exists(aData(i, 1)) And Len(Trim(aData(i, 1))) > 0 Then
aDataSorted(1, k) = aData(i, 1): aDataSorted(2, k) = aData(i, 2): aDataSorted(3, k) = aData(i, 3)
Select Case nrColumns
Case 4
If aData(i, 4) <> "" Then aDataSorted(4, k) = aData(i, 4)
Case 5
If aData(i, 4) <> "" Then aDataSorted(4, k) = aData(i, 4)
If aData(i, 5) <> "" Then aDataSorted(5, k) = aData(i, 5)
Case 6
If aData(i, 4) <> "" Then aDataSorted(4, k) = aData(i, 4)
If aData(i, 5) <> "" Then aDataSorted(5, k) = aData(i, 5)
If aData(i, 6) <> "" Then aDataSorted(6, k) = aData(i, 6)
Case > 6
MsgBox "Too many selected columns!": Exit Sub
End Select
k = k + 1
ReDim Preserve aDataSorted(nrColumns, k)
hUnq(Trim(aData(i, 1))) = Trim(aData(i, 1))
End If
Next i
Next rArea
'Process the new array in order to be tansformed in what is needed:
Dim finalCol() As String
k = k - 1: Z = 1
ReDim finalCol(2, Z)
Dim lngIndex As Long
Dim totalRows As Long
For i = 1 To k
lngIndex = 1
finalCol(1, Z) = aDataSorted(1, i) & "_" & lngIndex: lngIndex = lngIndex + 1: _
finalCol(2, Z) = aDataSorted(2, i): totalRows = totalRows + 1
Z = Z + 1: ReDim Preserve finalCol(2, Z)
finalCol(1, Z) = aDataSorted(1, i) & "_" & lngIndex: lngIndex = lngIndex + 1: _
finalCol(2, Z) = aDataSorted(3, i): totalRows = totalRows + 1
Z = Z + 1: ReDim Preserve finalCol(2, Z)
If nrColumns < 4 Then GoTo EndLoop
If aDataSorted(4, i) <> "" Then finalCol(1, Z) = aDataSorted(1, i) & "_" & lngIndex: _
lngIndex = lngIndex + 1: finalCol(2, Z) = aDataSorted(4, i): totalRows = totalRows + 1: _
Z = Z + 1: ReDim Preserve finalCol(2, Z)
If nrColumns < 5 Then GoTo EndLoop
If aDataSorted(5, i) <> "" Then finalCol(1, Z) = aDataSorted(1, i) & "_" & lngIndex: _
lngIndex = lngIndex + 1: finalCol(2, Z) = aDataSorted(5, i): totalRows = totalRows + 1: _
Z = Z + 1: ReDim Preserve finalCol(2, Z)
If nrColumns < 6 Then GoTo EndLoop
If aDataSorted(6, i) <> "" Then finalCol(1, Z) = aDataSorted(1, i) & "_" & lngIndex: _
lngIndex = lngIndex + 1: finalCol(2, Z) = aDataSorted(6, i): totalRows = totalRows + 1: _
Z = Z + 1: ReDim Preserve finalCol(2, Z)
EndLoop:
Next i
Set wb = rData.Parent.Parent
Set wsDest = wb.Sheets.Add(After:=wb.Sheets(wb.Sheets.Count))
wsDest.Range("A1:B" & totalRows) = Application.Transpose(finalCol)
End Sub
'A reference to "Microsoft Scripting Runtime" must be added. Otherwise, you can declare hUnq As Object...
And do not forget to have Option Base on tot of the module where this code exists. It is necessary to work with the way you built your initial code.
Edited: I adapted the code to accept up to six columns as you suggested. Please give it a try.
But it only check the unique SKU Code and select the first occurrence. If the other occurrences appear, the will not be considered even if they have different strings on its row. The code can be adapted to work also from this point of view, but now I think is your turn to make some tests...

Related

Excel VBA Passing Variables

I need to pass the variables max, min, and their respective locations to another sub where it will format each max and min in their respective column. I am trying to create an array that will store the locations and the values but its not working.
I was told to first identify the number of columns used and the number of rows, which is the beginning.
Rows = wsData.UsedRange.Rows.Count
Columns = wsData.UsedRange.Col.Count
j = 1
ReDim Min(j)
With wsData.Range("A3:A19")
For j = 1 To 19 'colum
Min(j) = WorksheetFunction.Min(Range(.Offset(1, j), .Offset(Row, j)))
Max = WorksheetFunction.Max(Range(.Offset(1, j), .Offset(Row, j)))
Min(j) = Min
j = j + 1
ReDim Preserve Min(j) 'saves variables
Next 'next column
End With
The code below uses the ActiveSheet which you need to change to reference the worksheet for your data. Additionally, it assumes that your data starts with Row 1. The code looks at each column in the range and stores the minimum/maximum (it does not account for multiple cells which may share the min or max value) value found in the column as well as the cell's address, in an array and then passes the array to two different subs, one which simply displays the information in a message and one which formats the the background color of the cells. This code does not perform any kind of error handling, but should get you where you want to go.
the line Option Explicit requires that all of the variables be defined using a Dim statement
the line Option Base 1 makes the default starting point for arrays 1 instead of 0
Option Explicit
Option Base 1
Sub GatherData()
Dim iRows As Long
Dim iCols As Long
Dim j As Long
Dim iMin() As Variant
Dim iMax() As Variant
Dim R As Range
iRows = ActiveSheet.UsedRange.Rows.Count
iCols = ActiveSheet.UsedRange.Columns.Count
ReDim iMin(iCols, 2)
ReDim iMax(iCols, 2)
For j = 1 To iCols
Set R = Range(Cells(1, j), Cells(iRows, j)).Find(WorksheetFunction.Min(Range(Cells(1, j), Cells(iRows, j))), LookIn:=xlValues)
iMin(j, 1) = R.Value
iMin(j, 2) = R.Address
Set R = Range(Cells(1, j), Cells(iRows, j)).Find(WorksheetFunction.Max(Range(Cells(1, j), Cells(iRows, j))), LookIn:=xlValues)
iMax(j, 1) = R.Value
iMax(j, 2) = R.Address
Next j
ListMinMax iMax(), True
ListMinMax iMin(), False
FormatMinMax iMax, "green"
FormatMinMax iMin, "yellow"
Set R = Nothing
End Sub
Sub ListMinMax(ByRef Arr() As Variant, ByVal MinMax As Boolean)
Dim strOutput As String
Dim i As Long
If MinMax = True Then
strOutput = "Maximums:" & vbCrLf & vbCrLf
Else
strOutput = "Minimums:" & vbCrLf & vbCrLf
End If
For i = 1 To UBound(Arr, 1)
strOutput = strOutput & "Cell: " & Arr(i, 2) & " = " & Arr(i, 1) & vbCrLf
Next i
MsgBox strOutput, vbOKOnly
End Sub
Sub FormatMinMax(ByRef Arr() As Variant, ByVal BGColor As String)
Dim i As Long
Select Case UCase(BGColor)
Case "GREEN"
For i = 1 To UBound(Arr, 1)
ActiveSheet.Range(Arr(i, 2)).Interior.Color = vbGreen
Next i
Case "YELLOW"
For i = 1 To UBound(Arr, 1)
ActiveSheet.Range(Arr(i, 2)).Interior.Color = vbYellow
Next i
Case Else
MsgBox "Invalid Option", vbCritical
End Select
End Sub
======================================================================
The code below does away with the need for the arrays and formats the color of the min/max values as it finds them
Sub GatherData2()
Dim iRows As Long
Dim iCols As Long
Dim j As Long
Dim R As Range
iRows = ActiveSheet.UsedRange.Rows.Count
iCols = ActiveSheet.UsedRange.Columns.Count
For j = 1 To iCols
Set R = Range(Cells(1, j), Cells(iRows, j)).Find(WorksheetFunction.Min(Range(Cells(1, j), Cells(iRows, j))), LookIn:=xlValues)
R.Interior.Color = vbYellow
Set R = Range(Cells(1, j), Cells(iRows, j)).Find(WorksheetFunction.Max(Range(Cells(1, j), Cells(iRows, j))), LookIn:=xlValues)
R.Interior.Color = vbGreen
Next j
Set R = Nothing
End Sub

How to split cell contents from multiple columns into rows by delimeter?

The code I have takes cells containing the delimiter (; ) from a column, and creates new rows (everything except the column is duplicated) to separate those values.
What I have
I need this for multiple columns in my data, but I don't want the data to overlap (ex: for 3 columns, I want there to be only one value per row in those 3 columns). It would be ideal if I could select multiple columns instead of only one as my code does now.
What I want
Sub splitByCol()
Dim r As Range, i As Long, ar
Set r = Worksheets("Sheet").Range("J2000").End(xlUp)
Do While r.Row > 1
ar = Split(r.Value, "; ")
If UBound(ar) >= 0 Then r.Value = ar(0)
For i = UBound(ar) To 1 Step -1
r.EntireRow.Copy
r.Offset(1).EntireRow.Insert
r.Offset(1).Value = ar(i)
Next
Set r = r.Offset(-1)
Loop
End Sub
Try this code
Sub Test()
Dim a, x, e, i As Long, ii As Long, iii As Long, k As Long
a = Range("A1").CurrentRegion.Value
ReDim b(1 To 1000, 1 To UBound(a, 2))
For i = LBound(a) To UBound(a)
For ii = 2 To 3
x = Split(a(i, ii), "; ")
For Each e In x
k = k + 1
b(k, 1) = k
b(k, 2) = IIf(ii = 2, e, Empty)
b(k, 3) = IIf(ii = 3, e, Empty)
b(k, 4) = a(i, 4)
Next e
Next ii
Next i
Range("A5").Resize(UBound(b, 1), UBound(b, 2)).Value = b
End Sub
I'd go this way
Sub SplitByCol()
With Worksheets("Sheet")
With .Range("B2", .Cells(.Rows.Count, "B").End(xlUp))
Dim firstColValues As Variant
firstColValues = .Value
Dim secondColValues As Variant
secondColValues = .Offset(, 1).Value
Dim thirdColValues As Variant
thirdColValues = .Offset(, 2).Value
.Offset(, -1).Resize(, 4).ClearContents
End With
Dim iRow As Long
For iRow = LBound(firstColValues) To UBound(firstColValues)
Dim currFirstColValues As Variant
currFirstColValues = Split(firstColValues(iRow, 1), "; ")
Dim currSecondColValues As Variant
currSecondColValues = Split(secondColValues(iRow, 1), "; ")
With .Cells(.Rows.Count, "C").End(xlUp).Offset(1, -1)
With .Resize(UBound(currFirstColValues) + 1)
.Value = currFirstColValues
.Offset(, 2).Value = thirdColValues(iRow, 1)
End With
End With
With .Cells(.Rows.Count, "B").End(xlUp).Offset(1, 1)
With .Resize(UBound(currSecondColValues) + 1)
.Value = currSecondColValues
.Offset(, 1).Value = thirdColValues(iRow, 1)
End With
End With
Next
End With
End Sub
Follow the code step by step by pressing F8 while the cursor is in any code line in the VBA IDE and watch what happens in the Excel user interface
EDIT
adding edited code for a more "parametric" handling by means of a helper function
Sub SplitByCol()
With Worksheets("Sheet")
With .Range("B2", .Cells(.Rows.Count, "B").End(xlUp))
Dim firstColValues As Variant
firstColValues = .Value
Dim secondColValues As Variant
secondColValues = .Offset(, 1).Value
Dim thirdColValues As Variant
thirdColValues = .Offset(, 2).Value
.Offset(, -1).Resize(, 4).ClearContents
End With
Dim iRow As Long
For iRow = LBound(firstColValues) To UBound(firstColValues)
Dim currFirstColValues As Variant
currFirstColValues = Split(firstColValues(iRow, 1), "; ")
Dim currSecondColValues As Variant
currSecondColValues = Split(secondColValues(iRow, 1), "; ")
WriteOne .Cells(.Rows.Count, "C").End(xlUp).Offset(1), _
currFirstColValues, thirdColValues(iRow, 1), _
-1, 2
WriteOne .Cells(.Rows.Count, "B").End(xlUp).Offset(1), _
currSecondColValues, thirdColValues(iRow, 1), _
1, 1
Next
End With
End Sub
Sub WriteOne(refCel As Range, _
currMainColValues As Variant, thirdColValue As Variant, _
mainValuesOffsetFromRefCel As Long, thirdColValuesOffsetFromRefCel As Long)
With refCel.Offset(, mainValuesOffsetFromRefCel)
With .Resize(UBound(currMainColValues) + 1)
.Value = currMainColValues
.Offset(, thirdColValuesOffsetFromRefCel).Value = thirdColValue
End With
End With
End Sub
Please, use the next code. It uses arrays and should be very fast for big ranges to be processed, working mostly in memory:
Sub testSplitInsert()
Dim sh As Worksheet, lastR As Long, arr, arrSp, arrFin, i As Long, j As Long, k As Long
Set sh = ActiveSheet
lastR = sh.Range("B" & sh.rows.count).End(xlUp).row
arr = sh.Range("B1:D" & lastR).Value
ReDim arrFin(1 To UBound(arr) * 10, 1 To 3) 'maximum to keep max 10 rows per each case
k = 1 'initialize the variable to load the final array
For i = 1 To UBound(arr)
arrSp = Split(Replace(arr(i, 1)," ",""), ";") 'trim for the case when somebody used Red;Blue, instead of Red; Blue
For j = 0 To UBound(arrSp)
arrFin(k, 1) = arrSp(j): arrFin(k, 3) = arr(i, 3): k = k + 1
Next j
arrSp = Split(Replace(arr(i, 1)," ",""), ";")
For j = 0 To UBound(arrSp)
arrFin(k, 2) = arrSp(j): arrFin(k, 3) = arr(i, 3): k = k + 1
Next j
Next
sh.Range("G1").Resize(k - 1, 3).Value = arrFin
End Sub
It processes the range in columns "B:D" and returns the result in columns "G:I". It can be easily adapted to process any columns range and return even overwriting the existing range, but this should be done only after checking that it return what you need...

Multiple Criteria Evaluate Match Function Prohibitively Slow?

The following code successfully executes for small data sets:
Option Explicit
Option Base 1
Sub Left()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim ws1, _
ws2 As Worksheet, _
wb As Workbook
Set wb = ThisWorkbook
Set ws1 = wb.Worksheets("Adj")
Set ws2 = wb.Worksheets("Deleted")
Dim a, _
b, _
i, _
j, _
k As Long
a = 957
b = 290150
Dim Item1, _
Item2, _
Arr() As Variant
With ws2
For i = 2 To a
.Cells(i, 6) = Left(.Cells(i, 1), 11)
.Cells(i, 7) = Right(.Cells(i, 1), 4)
Next i
End With
With ws1
For j = 2 To b
ReDim Preserve Arr(j - 1)
Item1 = Chr(34) & .Cells(j, 7) & Chr(34)
Item2 = Chr(34) & .Cells(j, 9) & Chr(34)
On Error Resume Next
k = Evaluate("=MATCH(1,('Deleted'!F:F = " & Item1 & ")*('Deleted'!G:G = " & Item2 & "),0)")
If Err.Number = 13 Then
Arr(j - 1) = ""
Else: Arr(j - 1) = k
End If
On Error GoTo 0
Next j
.Range(.Cells(2, 15), .Cells(b, 15)) = WorksheetFunction.Transpose(Arr())
End With
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
However, for large data sets - such as 290,150 rows - the macro spins its wheels. It's known that Evaluate is expensive to run and I have tried running for sample sizes of 30 (success) and 1,000 (unsuccessful) and debugged carefully. Obviously in-cell array formulation drag-and-drop is not a practical alternative. So, the problem reduces to resolving endless spinning for the given multiple criteria match function required.
How do I bypass this constraint?
Try this approach using a dictionary as a lookup:
Sub Left()
Dim wsAdj As Worksheet, wsDel As Worksheet, wb As Workbook
Dim lrDel As Long, lrAdj As Long, r As Long
Dim dict, t, arr, arrG, arrI, arrRes, k
Set wb = ThisWorkbook
Set wsAdj = wb.Worksheets("Adj")
Set wsDel = wb.Worksheets("Deleted")
lrAdj = 290150
lrDel = 957
t = Timer
'load a dictionary with lookup values constructed from wsDel ColA
Set dict = CreateObject("scripting.dictionary")
arr = wsDel.Range("A2:A" & lrDel).Value
For r = 1 To UBound(arr, 1)
k = Left(arr(r, 1), 11) & Chr(0) & Right(arr(r, 1), 4)
dict(k) = r + 1 '+1 to adjust for starting at row 2
Next r
arrG = wsAdj.Range("G2:G" & lrAdj).Value 'get the match columns as arrays
arrI = wsAdj.Range("I2:I" & lrAdj).Value
ReDim arrRes(1 To UBound(arrG, 1), 1 To 1) 'resize the "result" array
'loop the values from wsAdj
For r = 1 To UBound(arrG, 1)
k = arrG(r, 1) & Chr(0) & arrI(r, 1) 'build the "key"
If dict.exists(k) Then
arrRes(r, 1) = dict(k) 'get the matched row
End If
Next r
wsAdj.Cells(2, 15).Resize(UBound(arrRes, 1), 1).Value = arrRes 'put the array on the sheet
Debug.Print "done", Timer - t ' <1 sec
End Sub
Stating Ranges instead of Columns and removing ReDim on loop helped.

Grouping two columns to shrink row count by comparing | code optimization

I try to find a vba solution for the following problem:
I have two columns and try to group column1 in a comma separate way to have less rows.
e.g.
example:
I tried this, and it worked - but It take too long (about 300.000 Rows). Is there any better solution that task?
*Its just one part of my macro
For Each r In fr
If st = "" Then
st = Application.WorksheetFunction.Clean(Trim(ws.Cells(r.row, "L").Value))
Else
If Not IsInArray(Split(st, ","), ws.Cells(r.row, "L").Value) Then
st = st & ", " & Application.WorksheetFunction.Clean(Trim(ws.Cells(r.row, "L").Value))
End If
End If
If usrCheck = True Then
If str = "" Then
str = Application.WorksheetFunction.Clean(Trim(ws.Cells(r.row, "A").Value))
Else
If Not IsInArray(Split(str, ","), ws.Cells(r.row, "A").Value) Then
str = str & ", " & Application.WorksheetFunction.Clean(Trim(ws.Cells(r.row, "A").Value))
End If
End If
End If
Next
Maybe using Dictionary would be fast. What about:
Sub Test()
Dim x As Long, lr As Long, arr As Variant
Dim dict1 As Object: Set dict1 = CreateObject("Scripting.Dictionary")
Dim dict2 As Object: Set dict2 = CreateObject("Scripting.Dictionary")
With Sheet1 'Change accordingly
'Return your last row from column A
lr = .Cells(.Rows.Count, 1).End(xlUp).Row
'Get array and loop through it
arr = .Range("A2:B" & lr).Value
For x = LBound(arr) To UBound(arr)
dict1(arr(x, 2)) = arr(x, 2)
Next
'Loop through dictionary filling a second one
For Each Key In dict1.keys
For x = LBound(arr) To UBound(arr)
If arr(x, 2) = Key Then dict2(arr(x, 1)) = arr(x, 1)
Next x
.Range("F" & .Cells(.Rows.Count, 6).End(xlUp).Row + 1) = Key
.Range("G" & .Cells(.Rows.Count, 7).End(xlUp).Row + 1) = Join(dict2.Items, ", ")
dict2.RemoveAll
Next
End With
End Sub
This will get you all unique items from column A though, so if there can be duplicates and you want to keep them, this is not for you =)
Try also this, please. It works only in memory and on my computer takes less then 3 seconds for 300000 rows. The range must be filtered, like in your picture. If not, the filtering can also be easily automated.
Private Sub CondensData()
Dim sh As Worksheet, arrInit As Variant, arrIn As Variant, i As Long
Dim arrFinal() As Variant, lastRow As Long, Nr As Long, El As Variant
Dim strTemp As String, k As Long
Set sh = ActiveSheet
lastRow = sh.Cells(sh.Rows.count, "A").End(xlUp).Row
arrIn = sh.Range("B2:B" & lastRow + 1).Value
'Determine the number of the same accurrences:
For Each El In arrIn
i = i + 1
If i >= 2 Then
If arrIn(i, 1) <> arrIn(i - 1, 1) Then Nr = Nr + 1
End If
Next
ReDim arrFinal(Nr, 1)
arrInit = sh.Range("A2:B" & lastRow).Value
For i = 2 To UBound(arrInit, 1)
If i = 1 Then
strTemp = arrInit(1, 1)
Else
If arrInit(i, 2) = arrInit(i - 1, 2) Then
If strTemp = "" Then
strTemp = arrInit(i, 1)
Else
strTemp = strTemp & ", " & arrInit(i, 1)
End If
Else
arrFinal(k, 0) = arrInit(i - 1, 2)
arrFinal(k, 1) = strTemp
k = k + 1: strTemp = ""
End If
End If
Next i
sh.Range("C2:D" & lastRow).Clear
sh.Range("C2:D" & k - 1).Value = arrFinal
sh.Range("C:D").EntireColumn.AutoFit
MsgBox "Solved..."
End Sub
It will return the result in columns C:D

Count string within string using VBA

I have product codes: (they are in C column of active sheet)
DO-001
DO-002
DO-003
DO-004
And I have big list of data: (they are in C column of "Sheet1")
41300100_DO-001_14215171
41300104_DO-001_14215173
K1_ISK_41300661_DO-002_13190369
NP_41533258_DO-003_14910884
DO-003_DD_44_ddd
And I want to count how many times do the product codes appear in the list of data. So the result for this case'd be: (result is H column of active sheet)
DO-001 2
DO-002 1
DO-003 2
DO-004
I have done this with this code:
Sub CountcodesPLC()
Dim i, j As Integer, icount As Integer
Dim ldata, lcodes As Long
icount = 0
lcodes = Cells(Rows.Count, 3).End(xlUp).Row
ldata = Worksheets("Sheet1").Cells(Rows.Count, 3).End(xlUp).Row
For i = 10 To lcodes
For j = 2 To ldata
If InStr(Worksheets("Sheet1").Range("C" & j), Range("C" & i)) <> 0 Then
icount = icount + 1
End If
Next j
If icount <> 0 Then
Range("H" & i).Value = icount
End If
icount = 0
Next i
End Sub
But I want to change it, so if the list of data contains some key words like "NP", "ISK", then not to count them, or if the first part of the data is the code then also not to count them, so the result for this example would be:
DO-001 2
DO-002
DO-003
DO-004
Also, I'll have around 1.000 product codes, and around 60.000 strings of data.
Will my code crash?
Seems your code is OK. But if you want to match only the first part of string (a'ka StartsWith), i'd change only this line:
If InStr(Worksheets("Sheet1").Range("C" & j), Range("C" & i)) <> 0 Then
to:
If Worksheets("Sheet1").Range("C" & j) Like Range("C" & i) & "*" Then
For further details, please see: Wildcard Characters used in String Comparisons
Use Dictionnary
Dim Dict As Scripting.Dictionary
Set Dict = New Scripting.Dictionary
Arr = Split("refer your text here", "_")
For I = LBound(Arr) To UBound(Arr)
If Dict.Exists(Arr(I)) Then
Dict(Arr(I)) = Dict(Arr(I)) + 1 'increment
Else
Dict.Add Arr(I), 1
End If
Next I
This may be OTT for the requirement but should work quite quickly.
Public Sub Sample()
Dim WkSht As Worksheet
Dim LngRow As Long
Dim AryLookup() As String
Dim VntItem As Variant
'We put what we want to search into an array, this makes it a lot quicker to search
Set WkSht = ThisWorkbook.Worksheets("Sheet1")
ReDim AryLookup(0)
LngRow = 1
Do Until WkSht.Range("A" & LngRow) = ""
If AryLookup(UBound(AryLookup, 1)) <> "" Then ReDim Preserve AryLookup(UBound(AryLookup, 1) + 1)
AryLookup(UBound(AryLookup, 1)) = Trim(UCase(WkSht.Range("A" & LngRow)))
LngRow = LngRow + 1
Loop
Set WkSht = Nothing
'Then we go down the list and check the array against each item
Set WkSht = ActiveSheet
LngRow = 1
Do Until WkSht.Range("A" & LngRow) = ""
WkSht.Range("B" & LngRow) = 0
For Each VntItem In AryLookup()
'This looks for the match without any of the exclusion items
If (InStr(1, VntItem, Trim(UCase(WkSht.Range("A" & LngRow)))) <> 0) And _
(InStr(1, VntItem, "NP") = 0) And _
(InStr(1, VntItem, "ISK") = 0) Then
WkSht.Range("B" & LngRow) = WkSht.Range("B" & LngRow) + 1
End If
Next
LngRow = LngRow + 1
Loop
Set WkSht = Nothing
MsgBox "Done"
End Sub
Basically, the 60,000 data strings will go into an array in memory, then the array will be searched against the 1,000 products. Searching in memory should be quick.
One thing I would raise is the exclusion method may produce false positives.
For example, excluding NP will exclude: -
NP_41533258_DO-003_14910884
NPA_41533258_DO-003_14910884
41533258_ANP_DO-003_14910884
You may want to think about the method overall.
Have you considered an array formula, not sure how it will perform vs code, but, you could do something along these lines, where list is in A and prod numbers in B
=SUM(IF(NOT(ISERR(SEARCH(B1 & "_",$A$1:$A$5,1)))*(ISERR(SEARCH("NP_",$A$1:$A$5,1))),1,0))
Where "NP" would be replaced by a range containing the exclusions, I've left as NP to show what's happening.
The code would be like this. But I don't know the speed.
Sub test()
Dim vDB, vLook, vSum(), Sum As Long
Dim Ws As Worksheet, dbWs As Worksheet
Dim s As String, sF As String, sCode As String
Dim i As Long, j As Long, n As Long
Set dbWs = Sheets("Sheet1")
Set Ws = ActiveSheet
With Ws
vLook = .Range("a1", .Range("a" & Rows.Count).End(xlUp))
End With
With dbWs
vDB = .Range("c1", .Range("c" & Rows.Count).End(xlUp))
End With
n = UBound(vLook, 1)
ReDim vSum(1 To n, 1 To 1)
For i = 1 To n
sF = Split(vLook(i, 1), "-")(0)
sCode = Replace(vLook(i, 1), sF, "")
Sum = 0
For j = 1 To UBound(vDB, 1)
s = vDB(j, 1)
If Left(s, Len(sF)) = sF Or InStr(s, "NP") Or InStr(s, "ISK") Then
Else
If InStr(s, sCode) Then
Sum = Sum + 1
End If
End If
Next j
If Sum > 0 Then
vSum(i, 1) = Sum
End If
Next i
Ws.Range("h1").Resize(n) = vSum
End Sub

Resources