Count number of X+ occurrences of value in range - excel

I am working on a project and was wondering if there might be a faster way of doing something that seems easy, but is fairly time consuming.
Pretend I have a 10 cell column filled with random integers from 1-10:
1
1
1
5
5
8
8
8
9
9
I want to get a count of x+ occurrence of this column. Func(1)=4 [since there are 4 unique values with at least 1 occurrence]; Func(2) =4; func(3)=2 [since only 2 unique values occur at least 3 times]
Right now I filter through each possible integer, then count occurrences. If occurrences >=x then count +=1. Then cycle through through each integer. It work, but on larger ranges of cells with greater range of integers, it is a bit slow. Given Excel's flexibility and the power of VBA, I'm wondering if anyone has an idea that is more efficient.

One approach might be using a function like the below (but you'll need to add a reference by doing: Open VB Editor > Click Tools > References > Scroll down to "Microsoft Scripting Runtime" > Tick it > Click OK)
Option Explicit
Public Function CountNumericOccurrences(ByVal someRange As Range, ByVal minimumOccurrenceCount As Long) As Long
' "someRange" can be a contiguous or non-contiguous range of cells
' "minimumOccurrenceCount" is how many occurrences must be present before that value is counted.
' This function will only count numbers (strings, blanks, etc are ignored).
Dim uniqueCounts As Scripting.Dictionary
Set uniqueCounts = New Scripting.Dictionary
Dim contiguousArea As Range
For Each contiguousArea In someRange.Areas
If contiguousArea.Cells.Count > 1 Then ' Unlikely that range would contain any single-cell areas
Dim inputToCheck As Variant
inputToCheck = contiguousArea.Value
Dim rowIndex As Long
Dim columnIndex As Long
Dim currentKey As String
For rowIndex = LBound(inputToCheck, 1) To UBound(inputToCheck, 1)
For columnIndex = LBound(inputToCheck, 2) To UBound(inputToCheck, 2)
If Application.IsNumber(inputToCheck(rowIndex, columnIndex)) Then ' IsNumeric returns True for vbEmpty, so isNumber is used instead.
currentKey = CStr(inputToCheck(rowIndex, columnIndex))
If Not uniqueCounts.Exists(currentKey) Then uniqueCounts.Add currentKey, 0 ' Default value
uniqueCounts(currentKey) = uniqueCounts(currentKey) + 1
End If
Next columnIndex
Next rowIndex
ElseIf Application.IsNumber(contiguousArea) Then ' Handle single-cell edge case
currentKey = CStr(contiguousArea) ' We repeat ourselves here. Could create a "default dictionary" class, but only 3 lines repeated.
If Not uniqueCounts.Exists(currentKey) Then uniqueCounts.Add currentKey, 0 ' Default value
uniqueCounts(currentKey) = uniqueCounts(currentKey) + 1
End If
Next contiguousArea
For rowIndex = 0 To (uniqueCounts.Count - 1)
If uniqueCounts.Items(rowIndex) >= minimumOccurrenceCount Then
CountNumericOccurrences = CountNumericOccurrences + 1
End If
Next rowIndex
End Function
If you put it into a new module, you can call it from the worksheet as such:
I tested it with a range consisting of 200k cells, and it took ~4 seconds (quite slow). Maybe using a collection would be a better approach.
You could also just call it as part of a regular procedure e.g.:
Option Explicit
Private Sub SomeProcedure()
Dim someValue As Long
someValue = CountNumericOccurrences(ThisWorkbook.Worksheets("Sheet1").Range("A1:A200000"), 3)
MsgBox someValue
End Sub

Related

Excel How To Count Dynamic values for Duplicates

I seem to be having issues finding a solution,
I want to count duplications in a row, the row has 100 columns. I Just want to count many how duplications across the row.
For example,
1,2,3,1,4,9,2,9,1,4
I just want to see how many times the same set of numbers show up.
1 = 3
2 = 2
3 = 0
4 = 2
9 = 2
For example, 3 + 2 + 0 + 2 + 2 = 9
This row has 9 duplications. ie the same value is being displayed more than once. However the value is dynamic.
The VBA function below is a UDF, meaning it's like a normal Excel worksheet function but doing designed to do precisely what you want. Install it in a standard code module.
Function CountDuplicates(Rng As Range) As Integer
' set a Reference to "Microsoft Scripting Runtime"
Dim Fun As Integer ' function return value
Dim Uniques As Scripting.Dictionary ' list of occurrences
Dim Arr As Variant ' array of all values
Dim C As Long
Set Uniques = CreateObject("Scripting.Dictionary")
Arr = Rng.Value
With Uniques
For C = 1 To UBound(Arr, 2)
If Not IsEmpty(Arr(1, C)) Then
If .Exists(Arr(1, C)) Then
.Item(Arr(1, C)) = .Item(Arr(1, C)) + 1
Else
.Add Arr(1, C), 0
End If
End If
Next C
For C = 0 To .Count - 1
Fun = Fun + .Items(C)
Next C
End With
CountDuplicates = Fun
End Function
A standard code module is one that you must add to your project. Its default name will be like Module1 but you can change it to anything you like (wrong syntax names will be rejected). Call the function from the worksheet by entering its call in any cell, for example.
= CountDuplicates(A2:DD2)
This function will return the number of all duplicates counted in the defined range, excluding unique values. Look at the code. When an item is found for the first time a value of 0 is recorded against it. Thereafter, each time it is found again 1 is added to the number of recurrences already found. In the end all values will be added up to return the total count. This method ensures that all first occurrences will be counted as 0 (meaning not counted). Only repeats are included in the returned total.
As with other Excel functions, the result will appear in the cell containing the formula. You can copy that formula down as you do with any other, meaning the original above must be in row 2. If you paste it elsewhere consider the use of absolute addressing to define the action range.
If you have O365 with the UNIQUE function, you can use:
=COUNT(A1:J1)-COUNT(UNIQUE(A1:J1,TRUE,TRUE))
Another way
=COUNT(A1:J1)-SUMPRODUCT(--(FREQUENCY(A1:J1,A1:J1)=1))
or
=SUMPRODUCT(--(COUNTIF(A1:J1,A1:J1)>1))

VBA code like maxif

I have a VBA challenge, I have spent quite some time trying to solve. I am using the project management template sheet that is to be found on this link:
https://www.vertex42.com/ExcelTemplates/excel-gantt-chart.html
In this I would like to make a function, that finds the minimum and max date of the levels below automatically.
E.g. in row 8 where the WBS is 1, I would like a function in column E that finds the max date of all the rows that start with 1 (e.g. 1.1, 1.2, 1.3)
I have tried this:
Function maxIfs(maxRange As Range, criteriaRange As Range, criterion As \Variant) As Variant
maxIfs = Empty
For i = 1 To maxRange.Cells.count
If Left(criteriaRange.Cells(i).text, findN(criteriaRange.Cells(i).value)) = Criteria.value Then
If maxIfs = Empty Then
maxIfs = maxRange.Cells(i).value
Else
maxIfs = Application.WorksheetFunction.Max(maxIfs, maxRange.Cells(i).value)
End If
End If
Next
End Function
Function findN(text As Variant) As Integer
'Gives the position of the nth delimiter
Dim found As Integer
Dim place As Integer
found = 0
place = 0
For i = 1 To Len(text) + 1 ' Add 1 as we start at 1 not zero
place = found 'this will be 0 the first time round
found = InStr(found + 1, text, ".")
Next i
findN = place
End Function
I think the issue with the code is, that the WBS column is a function, and therefore there is some issues with the if statement. However, I am not sure if this is true, or how to solve this. Do you have any suggestions?

Array Formula into Regular one

Hi everyone, by using an array formulas to calculate (in the above example):
Count unique customers that had purchased only less than 5 units of only product 1 which area code match only with the adjacent D cells
I Use the following array formula to be in E11:
=SUM(IF(FREQUENCY(IF($G$2:$G$7=D11,
IF($I$2:$I$7="Product 1",IF($J$2:$J$7<5,IF($E$2:$E$7<>"",
MATCH($E$2:$E$7,$E$2:$E$7,0))))),ROW($E$2:$E$7)-ROW(G2)+1),1))
this formula doing great, at the same time when using it thru very huge database containing tons of rows and columns, excel takes a bout 3 minutes to calculate only one cell which is terrible to continue like that
is there any way to convert this array formula to regular one ... any help will be appreciated to the maximum ... Thanks in advance
Sorry for the late answer.
I created an UDF which is focused on doing the calculation several times without running the whole range multiple times.
Public Function getCounts(AreaStr As Variant, AreaRng As Range, CustomerRng As Range, ProductRng As Range, SalesRng As Range, Optional ProductName As String = "Product 1", Optional lessThan As Double = 5) As Variant
'make sure AreaStr is an array
If TypeOf AreaStr Is Range Then AreaStr = AreaStr.Value2
If Not IsArray(AreaStr) Then
AreaStr = Array(AreaStr)
ReDim Preserve AreaStr(1 To 1)
End If
'shorten the range (this way you can use whole columns)
If SalesRng(SalesRng.Cells.Count).Formula = "" Then Set SalesRng = SalesRng.Parent.Range(SalesRng.Cells(1), SalesRng(SalesRng.Cells.Count).End(xlUp))
'make sure all ranges have the same size
Set AreaRng = AreaRng.Resize(SalesRng.Rows.Count)
Set CustomerRng = CustomerRng.Resize(SalesRng.Rows.Count)
Set ProductRng = ProductRng.Resize(SalesRng.Rows.Count)
'Load values in variables to increase speed
Dim SalesValues As Variant, UserValues As Variant, ProductValues As Variant
SalesValues = AreaRng
UserValues = CustomerRng
ProductValues = ProductRng
'create temporary arrays to hold the values
Dim buffer() As Variant, expList() As Variant
ReDim buffer(1 To UBound(UserValues))
ReDim expList(1 To UBound(AreaStr), 1 To 1)
Dim i As Long, j As Double, k As Long
For i = 1 To UBound(AreaStr)
expList(i, 1) = buffer
Next
buffer = Array(buffer, buffer)
buffer(0)(1) = 0
For i = 1 To UBound(UserValues)
If ProductValues(i, 1) = ProductName Then 'this customer purchased our product
j = Application.IfError(Application.Match(UserValues(i, 1), buffer(0), 0), 0)
If j = 0 Then 'first time this customer in this calculation
j = i
buffer(0)(j) = UserValues(i, 1) 'remember the customer name (to not calculate him again later)
If Application.SumIfs(SalesRng, CustomerRng, UserValues(i, 1), ProductRng, ProductName) < lessThan Then
buffer(1)(j) = 1 'customer got less than "lessThan" -> remember that
End If
End If
If buffer(1)(j) = 1 Then 'check if we need to count the customer
k = Application.IfError(Application.Match(SalesValues(i, 1), AreaStr, 0), 0) 'check if the area is one of the areas we are looking for
If k Then expList(k, 1)(j) = 1 'it is -> set 1 for this customer/area combo
End If
End If
Next
For i = 1 To UBound(AreaStr) 'sum each area
expList(i, 1) = Application.Sum(expList(i, 1))
Next
getCounts = expList 'output array
End Function
I assume that you will be able to include it as an UDF without my help.
In the sheet you would use (for your example) E11:E16
=getCounts(D11:D15,G2:G7,E2:E7,I2:I7,J2:J7)
simply select the range of E11:E16 and enter the formula, then confirm it with CSE.
you also could use only =getCounts(D11,$G$2:$G$7,$E$2:$E$7,$I$2:$I$7,$J$2:$J$7) at E11 and then copy down... but that would be pretty slow.
The trick is, that we calculate the sum of the set for every customer, which at least bought it one time. Then we store 1 if it is less then your criteria. This goes for the general array. Every area you are looking for, will get its own array too. Here we also store the 1 at the same pos. As every costomer only gets calculated one time, having him multiple times doesn't matter.
the formula simply will be used like this:
getCounts(AreaStr,AreaRng,CustomerRng,ProductRng,SalesRng,[ProductName],[lessThan])
AreaStr: the area code you are looking for. should be an array of multiple cells to make the udf worth using it
AreaRng: the range where the area names are stored
CustomerRng: the range where the customer names are stored
ProductRng: the range where the product names are stored
SalesRng: the range where the sale counts are stored
ProductName (optional): the product you are looking for. Will be "Product 1" if omited
lessThan (optional): the trigger point for the sum of products. Will be 5 if omited
Most parts should be self explaining, but if you still have any questions, just ask ;)
OK, I am not sure of I understood all of the conditions and accumulation, but here is a VBA function that I think should do it.
First, open VBA from the Excel Developer menu. Then in VBA, create a new module from the Insert menu (just let it be Module1). Then paste the following 2 functions into the VBA module.
Public Function AreaUniqueCustomersLessThan(ReportAreaRange, AreaRange, ProductRange, SalesRange, CustomerRange)
On Error GoTo Err1
Dim RptAreas() As Variant
Dim Areas() As Variant, Products() As Variant, Sales() As Variant, Customers As Variant
RptAreas = ArrayFromRange(ReportAreaRange)
Areas = ArrayFromRange(AreaRange)
Products = ArrayFromRange(ProductRange)
Sales = ArrayFromRange(SalesRange)
Customers = ArrayFromRange(CustomerRange)
Dim r As Long, s As Long 'report and source rows indexes
Dim mxr As Long, mxs As Long
mxr = UBound(RptAreas, 1)
mxs = UBound(Areas, 1)
'encode the ReportAreasList into accumulation array indexes
Dim AreaCustomers() As Collection
Dim i As Long, j As Long
Dim colAreas As New Collection
ReDim AreaCustomers(1 To mxr)
For r = 1 To mxr
On Error Resume Next
'Do we have the area already?
j = colAreas(RptAreas(r, 1))
If Err.Number <> 0 Then
'Add a new area to the collection and array
i = i + 1
colAreas.Add i, RptAreas(r, 1)
Set AreaCustomers(i) = New Collection
j = i
End If
Next r
'now scan the source rows, accumulating distinct customers
' for any ReportAreas
For s = 1 To mxs
'is this row's Arera in the report Area list?
i = 0
On Error Resume Next
i = colAreas(Areas(s, 1))
On Error GoTo Err1
If i > 0 Then
'this is a report Area code, so check the conditions
If Products(s, 1) = "Product 1" Then
If Sales(s, 1) < 5 Then
On Error Resume Next 'just ignore any duplicate errors
AreaCustomers(i).Add Customers(s, 1), Customers(s, 1)
On Error GoTo Err1
End If
End If
End If
Next s
'finally, return to the report area codes, returning the distinct count
' of customers
Dim count() As Variant
ReDim count(1 To mxr, 1 To 1)
For r = 1 To mxr
count(r, 1) = AreaCustomers(colAreas(RptAreas(r, 1))).count
Next r
AreaUniqueCustomersLessThan = count ' "foo"
Exit Function
Err1:
AreaUniqueCustomersLessThan = "%ERR(" & Str(Err.Number) & ")%" & Err.Description
Exit Function
Resume
End Function
'handle all of the cases, checking and conversions to convert
' a variant range into an array of Variant(1 to n, 1 to 1)
' (we do this because it makes data access very fast)
Function ArrayFromRange(varRange As Variant)
Dim rng As Range
Dim A() As Variant
Set rng = varRange
'Check for degenerate cases
If rng Is Nothing Then
'do nothing
ElseIf rng.count = 0 Then
'do nothing
ElseIf rng.count = 1 Then
ReDim A(1 To 1, 1 To 1)
A(1, 1) = rng.Value
Else
A = rng.Value
End If
ArrayFromRange = A
End Function
Finally, go to your Array Formula area and paste in the following Array formula for the "Sales < 5" list: {=AreaUniqueCustomersLessThan(D$11:D$16, G$2:G$7, I$2:I$7,J$2:J$7,E$2:E$7)} Note that the first range must be the same length as the Array Formula range itself. And the other four ranges (the source data ranges) should all be the same length (they do not have to be the same length as the first range).

Generating a list of random words in Excel, but no duplicates

I'm trying to generate words in Column B from a list of given words in Column A.
Right now my code in Excel VBA does this:
Function GetText()
Dim GivenWords
GivenWords = Sheets(1).Range(Sheets(1).[a1], Sheets(1).[a20])
GetText = A(Application.RandBetween(1, UBound(A)), 1)
End Function
This generates a word from the list I have provided in A1:A20, but I don't want any duplicates.
GetText() will be run 15 times in Column B from B1:B15.
How can I check for any duplicates in Column B, or more efficiently, remove the words temporarily from the list once it has been used?
For example,
Select Range A1:A20
Select one value randomly (e.g A5)
A5 is in Column B1
Select Range A1:A4 and A6:A20
Select one value randomly (e.g A7)
A7 is in Column B2
Repeat, etc.
This was trickier than I thought. The formula should be used as a vertical array eg. select the cells where you want the output, press f2 type =gettext(A1:A20) and press ctrl+shift+enter
This means that you can select where your input words are in the worksheet, and the output can be upto as long as that list of inputs, at which point you'll start getting #N/A errors.
Function GetText(GivenWords as range)
Dim item As Variant
Dim list As New Collection
Dim Aoutput() As Variant
Dim tempIndex As Integer
Dim x As Integer
ReDim Aoutput(GivenWords.Count - 1) As Variant
For Each item In GivenWords
list.Add (item.Value)
Next
For x = 0 To GivenWords.Count - 1
tempIndex = Int(Rnd() * list.Count + 1)
Aoutput(x) = list(tempIndex)
list.Remove tempIndex
Next
GetText = Application.WorksheetFunction.Transpose(Aoutput())
End Function
Here's how I would do it, using 2 extra columns, and no VBA code...
A B C D
List of words Rand Rank 15 Words
Apple =RAND() =RANK(B2,$B$2:$B$21) =INDEX($A$2:$A$21,MATCH(ROW()-1,$C$2:$C$21,0))
copy B2 and C2 down as far as the list, and drag D down for however many words you want.
Copy the word list somewhere, as every time you change something on the sheet (or recalculate), you will get a new list of words
Using VBA:
Sub GetWords()
Dim Words
Dim Used(20) As Boolean
Dim NumChosen As Integer
Dim RandWord As Integer
Words = [A1:A20]
NumChosen = 0
While NumChosen < 15
RandWord = Int(Rnd * 20) + 1
If Not Used(RandWord) Then
NumChosen = NumChosen + 1
Used(RandWord) = True
Cells(NumChosen, 2) = Words(RandWord, 1)
End If
Wend
End Sub
Here is the code. I am deleting the cell after using it. Please make a backup of your data before using this as it will delete the cell contents (it will not save automatically...but just in case). You need to run the 'main' sub to get the output.
Sub main()
Dim i As Integer
'as you have put 15 in your question, i am using 15 here. Change it as per your need.
For i = 15 To 1 Step -1
'putting the value of the function in column b (upwards)
Sheets(1).Cells(i, 2).Value = GetText(i)
Next
End Sub
Function GetText(noofrows As Integer)
'if noofrows is 1, the rand function wont work
If noofrows > 1 Then
Dim GivenWords
Dim rowused As Integer
GivenWords = Sheets(1).Range(Sheets(1).Range("A1"), Sheets(1).Range("A" & noofrows))
'getting the randbetween value to a variable bcause after taking the value, we can delete the cell.
rowused = (Application.RandBetween(1, UBound(GivenWords)))
GetText = Sheets(1).Range("A" & rowused)
Application.DisplayAlerts = False
'deleting the cell as we have used it and the function should not use it again
Sheets(1).Cells(rowused, 1).Delete (xlUp)
Application.DisplayAlerts = True
Else
'if noofrows is 1, there is only one value left. so we just use it.
GetText = Sheets(1).Range("A1").Value
Sheets(1).Cells(1, 1).Delete (xlUp)
End If
End Function
Hope this helps.

how to iterate through cells in MIN/MAX function in VB excel 2003?

Set min=min1=1000, max=max1=position=0
For i=2 to 10 do
min=**MIN(A(i,j):A(i+5,j));**
if position=0 then min1=min, position=1 else
For j=2 to 10 do {max=**MAX(A(i,j):A(i+5,j));**
if max<min then next j else position=0, next i
I am new at this and trying to do next code above in VB excel(problem is bolded):Thanks
It is impossible to correct all your code because it is not clear what you are trying to do. But the following might give you a start.
It is always best to declare your variables and to specify their type:
Dim i As Integer
Dim j As Integer
Dim max As Integer
Dim max1 As Integer
Dim min As Integer
Dim min1 As Integer
Dim Position As Integer
I prefer to declare them in alphabetic order but that is not a requirement.
You need:
min = 1000
min1 = 1000
max = 0
max1 = 0
Position = 0
There are the following mistakes in Set min=min1=1000, max=max1=position=0:
Set is only used for objects.
You cannot separate statements by commas.
In some languages min=min1=1000 means min1=1000, min=min1 but in VBA it means:
If min1=1000 Then
min=True
Else
min=False
End if
There is no Do at the end of a For statement. So:
For i = 2 To 10
MIN and MAX are worksheet functions. To use then in VBA you have to say they are worksheet functions. There are no semicolons at the end of statements in VBA. You have used j in the MIN function but have not set its value yet.
I cannot tell the location of the values that are parameters to MIN and MAX.
If the location is in the current worksheet you need something like:
With ActiveSheet
min = Application.WorksheetFunction.min(.Range("B16:F16"))
End With
or perhaps like:
With ActiveSheet
min = Application.WorksheetFunction.min(.Range(.Cells(i,j),.Cells(i+5,j)))
End With
It is possible, to use the MIN and MAX functions on an array but I know of no way of selecting a portion of an array.
Hope this gives you a start.

Resources