Excel How To Count Dynamic values for Duplicates - excel

I seem to be having issues finding a solution.
I want to count duplications in a row; the row has 100 columns. I just want to count how many duplications there are across the row.
For example,
1,2,3,1,4,9,2,9,1,4
I just want to see how many times the same set of numbers show up.
1 = 3
2 = 2
3 = 0
4 = 2
9 = 2
For example, 3 + 2 + 0 + 2 + 2 = 9
This row has 9 duplications, i.e. the same value is displayed more than once. However, the values are dynamic.

The VBA function below is a UDF (user-defined function), meaning it works like a normal Excel worksheet function but is designed to do precisely what you want. Install it in a standard code module.
Function CountDuplicates(Rng As Range) As Integer
    ' set a Reference to "Microsoft Scripting Runtime"
    Dim Fun As Integer                  ' function return value
    Dim Uniques As Scripting.Dictionary ' list of occurrences
    Dim Arr As Variant                  ' array of all values
    Dim C As Long
    Set Uniques = CreateObject("Scripting.Dictionary")
    Arr = Rng.Value
    With Uniques
        For C = 1 To UBound(Arr, 2)
            If Not IsEmpty(Arr(1, C)) Then
                If .Exists(Arr(1, C)) Then
                    .Item(Arr(1, C)) = .Item(Arr(1, C)) + 1
                Else
                    .Add Arr(1, C), 0
                End If
            End If
        Next C
        For C = 0 To .Count - 1
            Fun = Fun + .Items(C)
        Next C
    End With
    CountDuplicates = Fun
End Function
A standard code module is one that you must add to your project. Its default name will be something like Module1, but you can change it to anything you like (invalid names will be rejected). Call the function from the worksheet by entering its call in any cell, for example:
= CountDuplicates(A2:DD2)
This function returns the number of repeats in the given range, excluding the first occurrence of each value. When an item is found for the first time, a value of 0 is recorded against it. Thereafter, each time it is found again, 1 is added to the number of recurrences already found. In the end all values are added up to return the total count. This ensures that all first occurrences are counted as 0 (meaning not counted); only repeats are included in the returned total. For the sample row in the question that is 2 + 1 + 0 + 1 + 1 = 5 rather than 9, because the first 1, 2, 4 and 9 are not counted.
As with other Excel functions, the result will appear in the cell containing the formula. You can copy that formula down as you do with any other, meaning the original above must be in row 2. If you paste it elsewhere consider the use of absolute addressing to define the action range.
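The counting rule used by the UDF can be checked outside Excel. The following is a minimal Python sketch of that rule, not a translation of the VBA; the function name count_repeats is hypothetical, and None stands in for an empty cell.

```python
def count_repeats(values):
    """Count occurrences beyond the first for each value, skipping blanks (None)."""
    repeats = {}  # value -> number of times it has been seen again
    for v in values:
        if v is None:        # plays the role of the IsEmpty check in the UDF
            continue
        if v in repeats:
            repeats[v] += 1  # a repeat: add one to this value's tally
        else:
            repeats[v] = 0   # first occurrence: recorded but not counted
    return sum(repeats.values())


row = [1, 2, 3, 1, 4, 9, 2, 9, 1, 4]
print(count_repeats(row))  # 5
```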

If you have O365 with the UNIQUE function, you can use:
=COUNT(A1:J1)-COUNT(UNIQUE(A1:J1,TRUE,TRUE))

Another way
=COUNT(A1:J1)-SUMPRODUCT(--(FREQUENCY(A1:J1,A1:J1)=1))
or
=SUMPRODUCT(--(COUNTIF(A1:J1,A1:J1)>1))
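For a row that contains only numbers, the three formulas above count every cell whose value occurs more than once, which is the total the question asks for. A rough Python sketch of that rule, with a hypothetical function name:

```python
from collections import Counter


def count_cells_with_repeated_values(values):
    """Count every cell whose value appears more than once in the row."""
    counts = Counter(values)
    # A value that occurs n > 1 times contributes all n of its cells.
    return sum(n for n in counts.values() if n > 1)


row = [1, 2, 3, 1, 4, 9, 2, 9, 1, 4]
print(count_cells_with_repeated_values(row))  # 9
```

This is the same as the total cell count minus the number of values that occur exactly once (10 - 1 for the sample row).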

Related

VBA nesting loops to calculate multiple employee hours

I have been working on a timesheet macro that will take a data dump and do a few things.
I am not familiar with VBA syntax; I have gotten close but need help getting this finished.
Below will be my code and comments where I am working on code as well as a screenshot for reference of the spreadsheet.
My question is how do I properly write the syntax using vars?
For instance in this line of code: If IsNumeric(Cells("Fr").Value) Then
I am getting errors and am unsure how I would enter the r value from the loop.
This applies to a few of the other lines I was getting errors for but didn't know
how to use r to identify a row.
Sub sum()
Dim r As Integer, c As Integer, s As Double, t As Integer, g As Integer
r = 2 'looping var
c = 3 'looping var
s = 0 'var for sum
g = 0
t = ActiveSheet.UsedRange.Rows.Count 'var for total rows
Do Until r = t
If Not IsEmpty(Range("Ar").Value) = True Then 'check if user name is present then
'Detect the next cell that contains data in the user name column
'Use that number between the two as a var (g) that will be used to run the embedded looping
'essentially redefining the other loop each time to account for the different number of clock ins per user
Do Until c = g 'Loop for until the next name was detected via var (g)
If IsNumeric(Cells("Fr").Value) Then 'check if Billable has a number then
s = s + Range("r, F").Value 'adds cell value (numbers only) to sum
c = c + 1 'add 1 to the value of c
Loop 'closes embedded loop once values have been added up
Range("Fr") = s 'Replace Cell (Fr) with the sum value
s = 0 'reset the value of the sum
r = r + 1
Loop
End Sub

Check for at least one identical value in different column ranges based on ID

I'm trying to solve a problem in VBA and after a long time of browsing the web for solutions, I really hope someone is able to help me.
It's actually not a very hard task, but with very little programming and VBA knowledge as a new learner, I hope I can find a useful tip or solution with the help of the community.
So my problem is as follows:
I have a table with 3 columns. The first is filled with a number used as an ID; columns 2 and 3 hold different values that need to be compared.
What I'd like to do is select the range of column rows of column 2 and 3 based on the same ID. Once I have selected the relevant ranges of the columns, I want to compare if one name of column 2 matches one name of column 3.
So there is no need to have all names of the desired column ranges to match. One name match is enough. If a name matches, it should automatically fill in a new column "result" with 1 for match (0 for no match).
Do you have an idea, how I can select specific cells of a column based on an identifier?
Dim ID_counter As Long
ID_counter = 1
If Cell.Value = ID_counter IN Range("Column1")
Then Range("Column2").Select
AND Range("Column3").Select
WHERE ID_counter is the same
In Column4 (If one Cell.Value IN Range("Column2-X:Column2-Y")
IS IDENTICAL TO Range("Column3-X:Column3-Y"), return 1, else return 0
End Sub
Many thanks in advance for your help!
This works for your example so perhaps you can generalise it. The formula in D2 is
=IF(A2=A1,"",MAX(IF($A$2:$A$10=A2,COUNTIF($B$2:$B$10,$C$2:$C$10))))
and is an array formula, so it must be confirmed with Ctrl+Shift+Enter.
Array alternative via Match() function
This approach compares the string items of columns B and C by passing two arrays (named b,c) as arguments (c.f. section [1]):
chk = Application.Match(b, c, 0)
The resulting chk array reflects all findings of the first array's items via (1-based) position indices of corresponding items in the second array.
Non-findings return an Error 2042 value (c.f. section [2]b)); assumption is made that data are grouped by id.
Sub OneFindingPerId()
    '[0]get data
    Dim data: data = Sheet1.Range("A1:D10")    ' << project's sheet Code(Name)
    Dim b: b = Application.Index(data, 0, 2)   ' 2nd column (B)
    Dim c: c = Application.Index(data, 0, 3)   ' 3rd column (C)
    '[1]get position indices of identical strings via Match() function
    Dim chk: chk = Application.Match(b, c, 0)  ' row numbers in c where each b item is found
    '[2]loop found position indices (i.e. no error 2042)
    Dim i As Long
    For i = 2 To UBound(chk)                   ' omit header row
        'a) define start index of new id and initialize result with 0
        If data(i, 1) <> data(i - 1, 1) Then
            Dim newId As Long: newId = i
            data(newId, 4) = 0
        End If
        'b) check if found row index corresponds to same id
        If Not IsError(chk(i, 1)) Then         ' omit error 2042 values
            If data(chk(i, 1), 1) = data(i, 1) Then   ' same ids?
                If data(newId, 4) = 0 Then data(newId, 4) = 1   ' ~> result One if first occurrence
            End If
        End If
    Next i
    '[3]write results
    Sheet1.Range("A1").Resize(UBound(data), UBound(data, 2)) = data
End Sub
First enter this user defined function in a standard module:
Public Function zool(r1, r2, r3) As Integer
    Dim i As Long, j As Long, v1 As Long, v2 As String
    Dim top As Long, bottom As Long
    zool = 0
    v1 = r1.Value
    top = r1.Row
    ' determine limits to check
    For i = top To 9999
        If v1 <> r1.Offset(i - top, 0).Value Then
            Exit For
        End If
    Next i
    bottom = i - 1
    ' compare every non-empty B value of this ID block with every C value of the block
    For i = top To bottom
        v2 = Cells(i, "B").Value
        If v2 <> "" Then
            For j = top To bottom
                If v2 = Cells(j, "C").Value Then zool = 1
            Next j
        End If
    Next i
End Function
Then in D2 enter:
=IF(OR(A2="",A2=A1),"",zool(A2,B2,C2))
and copy downwards:
(this assumes that the data has been sorted or organized by ID first)
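The per-ID check itself is small enough to sketch outside Excel. The Python below is an illustration only and uses a different technique from the answers above (set intersection instead of Match or nested loops). The function name and the sample names are hypothetical, and unlike the answers it does not need the rows to be grouped by ID.

```python
from collections import defaultdict


def match_per_id(rows):
    """rows: list of (id, name_b, name_c). Returns {id: 1 if any B name equals any C name, else 0}."""
    names_b = defaultdict(set)
    names_c = defaultdict(set)
    result = {}
    for id_, b, c in rows:
        result.setdefault(id_, 0)  # remember each id in first-seen order
        if b:
            names_b[id_].add(b)
        if c:
            names_c[id_].add(c)
    for id_ in result:
        # One shared name within the same id is enough for a match.
        if names_b[id_] & names_c[id_]:
            result[id_] = 1
    return result


rows = [
    (1, "Ann", "Bob"),
    (1, "Bob", "Cid"),
    (2, "Dan", "Eve"),
    (2, "Fay", "Gus"),
]
print(match_per_id(rows))  # {1: 1, 2: 0}
```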

Find the smallest sequence of move needed to re-order rows according to an array

I am working on a VBA script that sorts rows according to a couple of custom criteria. Since manipulating Excel rows is slow (big rows with various styles), I am doing the sorting through an object in memory:
1. Generate a jagged array representing the worksheet (containing only the relevant information used in the sorting process).
2. Sort the jagged array by applying a combination of quick-sort algorithms.
3. Regenerate the worksheet by using the sorted jagged array as a reference.
Steps 1 and 2 take only 0.84 s (for my biggest worksheet). But the last step, regenerating the Excel worksheet, takes a very long time: 129.11 s in total!
Here is a simplified example of my code to regenerate the sheet:
Dim WS As Worksheet: Set WS = Worksheets("MySheet")
Dim EndRowIndex As Integer: EndRowIndex = WS.UsedRange.Rows.Count
Dim Destination As Integer: Destination = EndRowIndex + 1
Dim rowIndex As Integer
Dim i As Integer
For i = 1 To EndRowIndex
    rowIndex = new_order_array(i)
    WS.Rows(rowIndex).Copy
    WS.Rows(Destination).Insert Shift:=xlDown 'Copying the rows in the correct order at the bottom
    Destination = Destination + 1 'incrementing the destination row (so it stays at the bottom)
Next
Application.CutCopyMode = False
WS.Rows("1:" & EndRowIndex).Delete 'Deleting the old unordered rows from the sheet
(new_order_array was generated in step 2; it has as many elements as there are rows in the worksheet. It indicates which row needs to be moved where: new_order_array(1) = 3 means that row 3 needs to become row 1.)
As you can see, this is a simple but naive re-ordering. I copy every row in the correct order at the bottom, then delete every unordered row at the top.
In order to fully optimize the process, I would need to re-order the worksheet using the minimal number of moves. Currently, regenerating a worksheet of N rows requires N copy-pastes, while moving rows cleverly would require at most N-1 moves. How can I find the smallest sequence of moves needed to re-order rows according to an array?
I don't know where to begin my research for this task. Are there existing algorithms on this subject? Does this problem have a name (useful for keywords)? Did I miss something else that might improve performance (I have already disabled visual updates during the process)? Any help would be greatly appreciated.
Here's a fairly quick sorting algorithm in n steps.
The scrambled data:
'demo
Cells.Clear
Dim arr(1 To 100)
For i = 1 To 100
    arr(i) = i
Next i
'scramble
Randomize
Dim rarr(1 To 100)
x = 100
While x > 0
    r = Int(Rnd * x) + 1
    rarr(101 - x) = arr(r)
    arr(r) = arr(x)
    x = x - 1
Wend
For i = 1 To 100
    Cells(i, 1) = rarr(i)
Next i
The sort:
'sort
sp = 1 'start position
While sp < 101
    If rarr(sp) = sp Then
        WS.Rows(sp).Copy
        WS.Rows(destination).Insert Shift:=xlDown
        destination = destination + 1
        sp = sp + 1
    Else
        d = rarr(rarr(sp))
        rarr(rarr(sp)) = rarr(sp)
        rarr(sp) = d
    End If
Wend
For i = 1 To 100
    Cells(i, 2) = rarr(i)
Next i
End Sub
The rarr array has been restored.
It works by swapping the first element with the element at the first element's position, and repeats this until the correct element is in position, copy/pastes it, and then moves onto processing element 2, and continues like this through the whole array.
It is guaranteed to work (on a contiguous set of integers 1..k) because once an element is in its correct position, it is not referenced again.
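The swapping step described above can be tried on a plain list. The following Python sketch covers only the array part (no rows are copied). The function name is hypothetical, and it assumes the input is a permutation of 1..n. Each swap puts at least one element into its final position, so at most n - 1 swaps are recorded.

```python
def sort_by_swaps(perm):
    """Sort a permutation of 1..n by repeated swaps.

    Returns (sorted_list, swaps), where swaps is a list of 1-based
    position pairs in the order they were exchanged.
    """
    a = [0] + list(perm)  # pad index 0 so positions are 1-based, as in the VBA
    swaps = []
    sp = 1                # start position
    while sp < len(a):
        if a[sp] == sp:
            sp += 1       # element already in place: move on
        else:
            target = a[sp]                       # where the current element belongs
            a[sp], a[target] = a[target], a[sp]  # send it there, pull the other one back
            swaps.append((sp, target))
    return a[1:], swaps


result, swaps = sort_by_swaps([3, 1, 2])
print(result)  # [1, 2, 3]
print(swaps)   # [(1, 3), (1, 2)]
```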

Count number of X+ occurrences of value in range

I am working on a project and was wondering if there might be a faster way of doing something that seems easy, but is fairly time consuming.
Pretend I have a 10 cell column filled with random integers from 1-10:
1
1
1
5
5
8
8
8
9
9
I want to get a count of x+ occurrence of this column. Func(1)=4 [since there are 4 unique values with at least 1 occurrence]; Func(2) =4; func(3)=2 [since only 2 unique values occur at least 3 times]
Right now I filter through each possible integer, then count its occurrences. If occurrences >= x, then count += 1. Then I cycle through to the next integer. It works, but on larger ranges of cells with a greater range of integers it is a bit slow. Given Excel's flexibility and the power of VBA, I'm wondering if anyone has an idea that is more efficient.
One approach might be using a function like the below (but you'll need to add a reference by doing: Open VB Editor > Click Tools > References > Scroll down to "Microsoft Scripting Runtime" > Tick it > Click OK)
Option Explicit
Public Function CountNumericOccurrences(ByVal someRange As Range, ByVal minimumOccurrenceCount As Long) As Long
    ' "someRange" can be a contiguous or non-contiguous range of cells
    ' "minimumOccurrenceCount" is how many occurrences must be present before that value is counted.
    ' This function will only count numbers (strings, blanks, etc are ignored).
    Dim uniqueCounts As Scripting.Dictionary
    Set uniqueCounts = New Scripting.Dictionary
    Dim contiguousArea As Range
    For Each contiguousArea In someRange.Areas
        If contiguousArea.Cells.Count > 1 Then ' Unlikely that range would contain any single-cell areas
            Dim inputToCheck As Variant
            inputToCheck = contiguousArea.Value
            Dim rowIndex As Long
            Dim columnIndex As Long
            Dim currentKey As String
            For rowIndex = LBound(inputToCheck, 1) To UBound(inputToCheck, 1)
                For columnIndex = LBound(inputToCheck, 2) To UBound(inputToCheck, 2)
                    If Application.IsNumber(inputToCheck(rowIndex, columnIndex)) Then ' IsNumeric returns True for vbEmpty, so isNumber is used instead.
                        currentKey = CStr(inputToCheck(rowIndex, columnIndex))
                        If Not uniqueCounts.Exists(currentKey) Then uniqueCounts.Add currentKey, 0 ' Default value
                        uniqueCounts(currentKey) = uniqueCounts(currentKey) + 1
                    End If
                Next columnIndex
            Next rowIndex
        ElseIf Application.IsNumber(contiguousArea) Then ' Handle single-cell edge case
            currentKey = CStr(contiguousArea) ' We repeat ourselves here. Could create a "default dictionary" class, but only 3 lines repeated.
            If Not uniqueCounts.Exists(currentKey) Then uniqueCounts.Add currentKey, 0 ' Default value
            uniqueCounts(currentKey) = uniqueCounts(currentKey) + 1
        End If
    Next contiguousArea
    For rowIndex = 0 To (uniqueCounts.Count - 1)
        If uniqueCounts.Items(rowIndex) >= minimumOccurrenceCount Then
            CountNumericOccurrences = CountNumericOccurrences + 1
        End If
    Next rowIndex
End Function
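If you put it into a new module, you can call it from the worksheet like a built-in function, for example =CountNumericOccurrences(A1:A10, 3), assuming the sample column sits in A1:A10.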
I tested it with a range consisting of 200k cells, and it took ~4 seconds (quite slow). Maybe using a collection would be a better approach.
You could also just call it as part of a regular procedure e.g.:
Option Explicit
Private Sub SomeProcedure()
    Dim someValue As Long
    someValue = CountNumericOccurrences(ThisWorkbook.Worksheets("Sheet1").Range("A1:A200000"), 3)
    MsgBox someValue
End Sub
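The same one-pass idea (tally every value once, then count the tallies that reach the threshold) can be sketched in Python to check the expected results from the question. The function name is hypothetical, and non-numeric entries are skipped, as in the VBA function.

```python
from collections import Counter


def count_values_with_min_occurrences(values, minimum):
    """Count distinct numeric values that occur at least `minimum` times."""
    numeric = (
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )
    counts = Counter(numeric)  # one pass over the data
    return sum(1 for n in counts.values() if n >= minimum)


column = [1, 1, 1, 5, 5, 8, 8, 8, 9, 9]
print(count_values_with_min_occurrences(column, 1))  # 4
print(count_values_with_min_occurrences(column, 2))  # 4
print(count_values_with_min_occurrences(column, 3))  # 2
```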

Excel: filter/delete certain numbers in one cell

Please help,
I have an Excel sheet with numbers in two columns, for example:
10 10
20 2010, 2011
30 30100, 30200,30500
40 40
And the result I want to have is as follows:
1,2,3,4,5,6,7,8,9
10,11
100,200,500
1,2,3,4,5,6,7,8,9
So if columns 1 and 2 are equal (nothing is left after removing column 1), the result should be the numbers from 1 to 9.
Is there a way to do this??
If you want to remove the characters in column one from any part of the values in column two, then try:
=IF(A1=B1,"1,2,3,4,5,6,7,8,9",SUBSTITUTE(SUBSTITUTE(B1," ",""),A1,""))
If you only want to remove the column one values from the beginning of each column two value, then try the following (note that this version expects the two columns in B and C rather than A and B):
=IF(B1=C1,"1,2,3,4,5,6,7,8,9",MID(SUBSTITUTE(SUBSTITUTE(","&C1," ",""),"," &B1,","),2,99))
Note that, for the final result, I removed the spaces as you showed in your example.
Here is a VBA approach:
Function AdHoc(x As Variant, y As Variant) As String
    Dim v As Variant
    Dim i As Long
    Dim s As String
    v = Split(y, ",")
    If Val(x) = Val(v(0)) Then
        s = "1,2,3,4,5,6,7,8,9"
    Else
        For i = 0 To UBound(v)
            v(i) = Mid(Trim(v(i)), 1 + Len(Trim(Str(x))))
        Next i
        s = Join(v, ",")
    End If
    AdHoc = s
End Function
If this is entered into a standard code module, then AdHoc can be used directly in the worksheet:
In the above I simply entered =adhoc(A1,B1) in cell C1 and then copied down.
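To check the expected results from the question, here is a small Python sketch of the "remove only from the beginning" behaviour. It is not a translation of AdHoc: the function name is hypothetical, and it leaves an entry unchanged when it does not start with the prefix, whereas AdHoc always drops the first Len(x) characters.

```python
DIGITS = "1,2,3,4,5,6,7,8,9"


def strip_prefix(prefix, cell):
    """Remove `prefix` from the start of each comma-separated entry in `cell`."""
    p = str(prefix).strip()
    parts = [s.strip() for s in str(cell).split(",")]
    if parts[0] == p:
        # The first entry equals the prefix: return the digits 1 to 9.
        return DIGITS
    return ",".join(s[len(p):] if s.startswith(p) else s for s in parts)


print(strip_prefix(10, "10"))                   # 1,2,3,4,5,6,7,8,9
print(strip_prefix(20, "2010, 2011"))           # 10,11
print(strip_prefix(30, "30100, 30200,30500"))   # 100,200,500
```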
