Excel COUNTIF with strings, separated by comma

I am trying to write a simple formula to count how many times a particular name appears in a column. I am using COUNTIF because it is a pretty straightforward process, but I cannot work out how to make it work for one name in particular. This is the case:
The column named Age contains cells with one or more names, separated by commas when there is more than one value. Taking "Young" as an example, it is easy to make COUNTIF return the number of times this word appears, either as the only value in a cell or as part of a longer string, by giving the formula the "Young" criterion.
The problem comes when I want the formula to count how many times "Mature" appears in my column. I cannot work out how to make it count only "Mature" without also counting every "Early_Mature" or "Semi_Mature".
I know this is easy for anyone who knows the basics of Excel, so I don't think there is a need to give more details.
Thanks

Most of the time I solve such problems by adding the string's own delimiter to the beginning and end of the main string.
Since your data is in column Y, you can create a helper column Z and enter this formula:
="," & Y1 & ","
I did not put any spaces before or after the commas since your data does not seem to have any. Depending on your case, you may have to add spaces.
Now that your string is wrapped in commas, you can change the COUNTIF formula to this:
=COUNTIF(Z:Z,"*,"&B1&",*")
The * characters are wildcards that stand for "anything" in this context. B1 is assumed to hold the name being counted (for example, Mature).
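The delimiter-wrapping idea is not specific to Excel. As a rough illustration only, here is a minimal Python sketch of the same test. The function name count_exact_token and the sample data are hypothetical, and it assumes the names are separated by commas with no surrounding spaces.

```python
def count_exact_token(cells, token, sep=","):
    """Count cells that contain `token` as a whole separated item."""
    wrapped_token = sep + token + sep
    # Wrapping each cell in separators means every item, including the first
    # and the last, has a separator on both sides.
    return sum(1 for cell in cells if wrapped_token in sep + cell + sep)


# Hypothetical sample column, not taken from the question.
cells = ["Young", "Mature,Young", "Early_Mature", "Semi_Mature,Mature", "Mature"]
print(count_exact_token(cells, "Mature"))
```

Like COUNTIF, this counts matching cells rather than individual occurrences, so a cell that lists the same name twice still counts once.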

With a UDF. The code goes in a standard module: open the VBE with Alt + F11, then right-click in the Project Explorer and add a module.
Code
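Option Explicit
' Counts how many comma-separated items in the first column of selectRange
' equal searchTerm exactly (after trimming spaces).
Public Function GetCount(ByRef selectRange As Range, ByVal searchTerm As String) As Long
    Application.Volatile
    Dim arr(), joinedString As String, i As Long, outputCount As Long
    Dim arr2() As String
    With selectRange
        ' Read the range into a 2D array (this needs more than one cell).
        arr = .Value
        ' Join the first column into one comma-separated string.
        joinedString = Join(Application.WorksheetFunction.Transpose(Application.WorksheetFunction.Index(arr, 0, 1)), ",")
    End With
    arr2 = Split(joinedString, ",")
    For i = LBound(arr2) To UBound(arr2)
        ' Compare against the searchTerm argument rather than a hard-coded word.
        If Trim$(arr2(i)) = searchTerm Then
            outputCount = outputCount + 1
        End If
    Next i
    GetCount = outputCount
End Function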
Usage in sheet (for example, assuming the data is in Y1:Y100):
=GetCount(Y1:Y100,"Mature")

To get the number of occurrences of Mature, excluding those that have a prefix, you can use this array formula:
=SUM(((LEN(A2:A7)-LEN(SUBSTITUTE(A2:A7,"Mature",""))) / LEN("Mature"))-((LEN(A2:A7)-LEN(SUBSTITUTE(A2:A7,"_Mature",""))) / LEN("_Mature")))
Please note that this formula must be entered with Ctrl + Shift + Enter.
Since your data is in column Y, just change the range to the one you need.
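The LEN/SUBSTITUTE trick in that formula counts occurrences by measuring how much shorter the text becomes once the word is removed. A minimal Python sketch of the same arithmetic follows; the helper names and the sample data are hypothetical, and str.replace stands in for SUBSTITUTE (both are case-sensitive).

```python
def count_occurrences(text, needle):
    """Mimic (LEN(text) - LEN(SUBSTITUTE(text, needle, ""))) / LEN(needle)."""
    return (len(text) - len(text.replace(needle, ""))) // len(needle)


def count_unprefixed(cells, word, prefix="_"):
    """Occurrences of `word` minus occurrences of `prefix + word`."""
    total = sum(count_occurrences(cell, word) for cell in cells)
    prefixed = sum(count_occurrences(cell, prefix + word) for cell in cells)
    return total - prefixed


# Hypothetical sample column, not taken from the question.
cells = ["Young", "Mature,Young", "Early_Mature", "Semi_Mature,Mature", "Mature"]
print(count_unprefixed(cells, "Mature"))
```

Unlike COUNTIF, this counts every occurrence, so a cell containing the word twice contributes two.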

An alternative would be to change "Mature" to "Fully_Mature". Then you could just use Countif().
You'd have to do this in steps:
1) Change "Early_Mature" to "E_M"
2) Change "Semi_Mature" to "S_M"
3) Change "Mature" to "Fully_Mature"
4) Reverse step 1): change "E_M" back to "Early_Mature".
5) Reverse step 2): change "S_M" back to "Semi_Mature".

Related

Using COUNTIF across a range rather than a single column/row

I'm trying to count the number of times a word appears in a range of cells using COUNTIF.
The formula I have tried is =COUNTIF($A$2:$T$9,C7)
This returns an incorrect result: adolescence appears 4 times across my data set. The strange thing is that I can see the correct result if I use the formula builder/inserter to check the formula.
Everything I've looked at so far has pointed me towards array functions (or Control-Shift-Enter) but this doesn't work either.
What exactly is happening in the 'Insert Function' box that's not happening in the formula bar?
When you first entered that equation, you almost certainly saw a circular reference warning. And, even if you ignored it, you should look at the bottom left where you'll probably see the following helpful indicator:
With circular references, Excel is very careful not to get caught in an infinite loop as it tries to follow all the dependency chains.
In the case where you have circular dependencies between each of a great many cells (as your case does), this escalates very quickly and I'd be surprised if Excel didn't just berate you and exit in protest. Or, more likely, it just sets them to zero since it warned you and you chose to ignore it :-)
The most likely reason it works in the dialog box is that the dialog is not actually a cell that would cause a circular reference. The circular reference only occurs once the formula is placed into a cell.
The solution, of course, is to get rid of the circular dependencies by removing the count columns from the ranges that COUNTIF looks at.
Probably the simplest way to do that (if you want to stick with built-in functions) is to make the cells work on just the theme columns explicitly, with a formula like (in b2):
=countif($a$2:$a$9,a2) + countif($c$2:$c$9,a2) + countif($e$2:$e$9,a2) + countif($g$2:$g$9,a2)
I've only gone up to column g since I used your image as a test case, you'll obviously need to expand that to use all your columns, { a, c, e, g, i, k, m, o, q, s }.
Admittedly, that's a rather painful formula, but you only need to type it in once (in b2), then copy and paste it to cells b3:b9, d2:d9, and so on up to t2:t9.
Alternatively, you can use a combination of indirect, countif, and sum to achieve the same result with a shorter formula (again, expanding out to use all the individual column ranges up to s):
=sum(countif(indirect({"$a$2:$a$9","$c$2:$c$9","$e$2:$e$9","$g$2:$g$9"}),a2))
The next step beyond that is a user-defined function (UDF) that can do the heavy lifting for you. Opening up the VBA editor, you can create a module for your workbook (if one does not already exist), and enter the following UDF:
Function HowManyOf(lookFor, firstCell, lastCell, colSkip, rowSkip)
    ' What we are looking for.
    needVal = lookFor.Value
    ' Get cells.
    startCol = firstCell.Column
    startRow = firstCell.Row
    endCol = lastCell.Column
    endRow = lastCell.Row
    ' Ensure top left to bottom right, and sane skips.
    If startCol > endCol Then
        temp = startCol
        startCol = endCol
        endCol = temp
    End If
    If startRow > endRow Then
        temp = startRow
        startRow = endRow
        endRow = temp
    End If
    If colSkip < 0 Then colSkip = -colSkip
    If colSkip = 0 Then colSkip = 1
    If rowSkip < 0 Then rowSkip = -rowSkip
    If rowSkip = 0 Then rowSkip = 1
    ' Process each column.
    HowManyOf = 0
    For thisCol = startCol To endCol Step colSkip
        ' Process row within column.
        For thisRow = startRow To endRow Step rowSkip
            ' Read from the sheet that holds firstCell, not whichever sheet is active.
            If firstCell.Worksheet.Cells(thisRow, thisCol).Value = needVal Then
                HowManyOf = HowManyOf + 1
            End If
        Next
    Next
End Function
Then you can simply enter the formula (again, start in b2):
' Args are:
' The cell with the thing you want to count.
' One corner of the range.
' The opposite corner of the range.
' Column skip.
' Row skip.
' Corners can be any corner as long as they're opposite.
' Protected against negative and zero skips.
=howmanyof(a2, $a$2, $h$9, 2, 1)
Then, copying that formula into all the other cells will give you what you want.
Alternatively, instead of using many COUNTIF() calls, you can use the FILTER() function with a few other functions to avoid the circular reference.
=SUM(--(FILTER($A$2:$T$9,MOD(SEQUENCE(,COLUMNS($A$2:$T$9)),2))=A2))
To make it a dynamic array that spills results automatically, you can use the BYROW() lambda function, like this:
=BYROW(A2:A9,LAMBDA(x,SUM(--(FILTER($A$2:$T$9,MOD(SEQUENCE(,COLUMNS($A$2:$T$9)),2))=x))))
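The MOD(SEQUENCE(...),2) part keeps only the odd-numbered columns (the theme columns) and drops the count columns in between, which is what removes the circular reference. A minimal Python sketch of that selection follows, with a hypothetical function name and made-up grid. One assumed difference: Python compares text case-sensitively, while Excel's = comparison does not.

```python
def count_in_theme_columns(grid, value):
    """Count `value` in columns 1, 3, 5, ... (1-based), skipping the count columns."""
    return sum(
        1
        for row in grid
        for col_index, cell in enumerate(row, start=1)
        if col_index % 2 == 1 and cell == value
    )


# Hypothetical layout: theme, count, theme, count.
grid = [
    ["adolescence", 0, "family", 0],
    ["work", 0, "adolescence", 0],
    ["family", 0, "work", 0],
]
print(count_in_theme_columns(grid, "adolescence"))
```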
If the COUNTIF() is outside the range, it counts correctly and the formula builder will show the correct count, BUT it will show 0 if some other COUNTIF() causes a circular reference. Only once all circular references are removed (other COUNTIF()s within the range, for example) will it show the count correctly. As an alternative to checking the formula builder, you could switch workbook calculation to Manual and calculate just this one cell using F2 to see the correct result.

Using MIN/MAX in excel array formula

I have a simple array formula in excel that doesn't work in the way I wish. In columns A and B there is data (A1 is paired with B1 and so on) while in column F there is the calculation based on the parameter in column E.
In cell F1 the formula is:
{=SUM(MAX(A$1:A$9, E1)*B$1:B$9)}
What this formula does is:
=MAX(A$1:A$9, E1)*B$1 + MAX(A$1:A$9, E1)*B$2 + ...
Instead, I need a formula that does this:
=MAX(A$1, E1)*B$1 + MAX(A$2, E1)*B$2 + ...
In words, the formula I wrote (the first one) always finds the max between the values from A1 to A9 and E1, multiplies it by the i-th B value and sums the results. What I need is a formula that finds the max between the i-th A value and E1, and not between all the A values.
What I'm looking for is easily done by adding in column C the formula =MAX(A1;E$1)*B1 and then in F1 just =SUM(C1:C9), but I can't use this solution because the same formula is repeated down column F, with the E parameter changing every time.
I can use an IF instruction: in F1 I can write
{=SUM(IF(A$1:A$9>E1, A$1:A$9, E1)*B$1:B$9)}
While this formula does what I need in this case, I think it's a bad solution because I find it difficult to read and to extend. For example, if there is another parameter in column D and the factor is MIN(MAX(A$1:A$9;E1);D1), using IF results in a very long, unreadable and complicated formula.
Are there better solutions to my problem? Thank you all!
NOTE: syntax may vary a little because I am using the Italian version of Excel.
The problem is that MAX takes an array as an argument. Functions like MAX that aggregate an array do not return an array; they were designed to turn an array into one number. No matter how many arrays you throw at MAX, it's always going to return just one number.
I couldn't come up with a good solution, so here's a bad one:
=SUMPRODUCT(((A1:A9*(A1:A9>E1))+(E1*(A1:A9<=E1)))*B1:B9)
I don't think that really increases the maintainability of the IF-based formula that you're trying to avoid. I think you're stuck with IF or a helper column.
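To make the difference concrete, here is a minimal Python sketch contrasting the two behaviours described in the question: one maximum taken over the whole column versus one maximum per row. The function names and numbers are hypothetical.

```python
def sum_global_max(a_values, b_values, floor):
    """What SUM(MAX(A, E1) * B) computes: a single max over all of A and E1."""
    biggest = max(max(a_values), floor)
    return sum(biggest * b for b in b_values)


def sum_pairwise_max(a_values, b_values, floor):
    """What the question asks for: max(A_i, E1) * B_i, summed over the rows."""
    return sum(max(a, floor) * b for a, b in zip(a_values, b_values))


a_values = [1, 5, 3]
b_values = [2, 2, 2]
print(sum_global_max(a_values, b_values, 4))
print(sum_pairwise_max(a_values, b_values, 4))
```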
Another possibility is a VBA function.
Public Function SumMaxMin(rRng1 As Range, rRng2 As Range, ParamArray vaMinMax() As Variant) As Double
    Dim rCell As Range
    Dim dReturn As Double
    Dim aMult() As Double
    Dim lCnt As Long
    Dim i As Long
    ReDim aMult(1 To rRng1.Cells.Count)
    For Each rCell In rRng1.Cells
        lCnt = lCnt + 1
        aMult(lCnt) = rCell.Value
        ' vaMinMax holds (limit, operator) pairs, e.g. E1, ">".
        For i = LBound(vaMinMax) To UBound(vaMinMax) Step 2
            ' If "value operator limit" is false, replace the value with the limit.
            If Not Evaluate(aMult(lCnt) & vaMinMax(i + 1) & vaMinMax(i)) Then
                aMult(lCnt) = vaMinMax(i)
            End If
        Next i
    Next rCell
    ' Multiply each adjusted value by the matching cell of the second range.
    For i = LBound(aMult) To UBound(aMult)
        dReturn = dReturn + aMult(i) * rRng2.Cells(i).Value
    Next i
    SumMaxMin = dReturn
End Function
For your example
=SumMaxMin(A1:A9,B1:B9,E1,">")
Adding another condition
=SumMaxMin(A1:A9,B1:B9,E1,">",D1,"<")
It will error if your ranges aren't the same number of cells or you pass arguments that don't work with Evaluate.
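With the two conditions E1,">" and D1,"<", each value is in effect clamped: first raised to the lower limit, then capped at the upper one, which is MIN(MAX(A;E1);D1) per row. A minimal Python sketch of that per-row clamp follows, with a hypothetical function name and numbers.

```python
def sum_clamped(a_values, b_values, lower, upper):
    """Sum of min(max(a, lower), upper) * b over paired rows."""
    return sum(
        min(max(a, lower), upper) * b
        for a, b in zip(a_values, b_values)
    )


# 1 is raised to 2, 5 is kept, 9 is capped at 6.
print(sum_clamped([1, 5, 9], [1, 2, 3], lower=2, upper=6))
```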
Another possibility for avoiding repetitions of cell references is:
=SUM(B1:B9*ABS(A1:A9-E1*{1,-1}))/2
assuming values are non-negative. More generally to return an array of pairwise max values:
=MMULT((A1:A9-E1*{1,-1})^{2,1}^{0.5,1},{1;1}/2)
which returns:
MAX(A1,E1)
MAX(A2,E1)
...
MAX(A9,E1)
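Both formulas rest on the identity max(a, e) = (|a - e| + (a + e)) / 2. The first formula takes the absolute value of a + e as well, which is why it needs non-negative values. A minimal Python check of the two variants, with hypothetical function names:

```python
def max_via_abs(a, e):
    """Pairwise max without calling max(): (|a - e| + (a + e)) / 2."""
    return (abs(a - e) + (a + e)) / 2


def max_via_two_abs(a, e):
    """Variant matching ABS(A - E*{1,-1}); only valid when a + e >= 0."""
    return (abs(a - e) + abs(a + e)) / 2


for a, e in [(3, 5), (5, 3), (2, 2), (-4, -7)]:
    print(a, e, max_via_abs(a, e), max_via_two_abs(a, e))
```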
I don't remember ever cracking this problem, but for maintainability I'd probably do something like this:
{=SUM((A1:A9<E1)*E1*B$1:B$9) + SUM((A1:A9>=E1)*A1:A9*B$1:B$9)}
If I understand the problem correctly, using IF instead of MAX should do:
=SUM(IF($A$1:$A$9>E1;$A$1:$A$9;E1)*$B$1:$B$9)

Can the MATCH function in an array formula return multiple matches?

I tried to use the MATCH function in an array formula to return multiple matches (by default it only returns the first match). However, this doesn't seem to work. How can I solve this problem without a complex, unreadable formula?
How about this, without VBA? Enter it in cell C9 as an array formula with CTRL + SHIFT + ENTER (where your searched column is A9:A24 and your search terms are in B1:B4), then drag it down to find multiple hits:
=SMALL(IFERROR(MATCH($B$1:$B$4,$A$9:$A$24,0),""),ROW()-ROW($C$8))
This first uses the array formula to show each 'hit' for any of the search terms matched in the searched column, and then using the Small function with reference to the current cell's row, it returns the earliest hit, then the 2nd hit, then the 3rd hit, etc.
Beyond this point, the reference points to the searched array can be used as needed (converted to the row location of an index function, etc.).
EDIT
On further review of the results from this formula, it only returns a single hit for each search term, even if that search term appears multiple times. To resolve this, I first used the formula:
=SMALL(IF($A$9:$A$24=$B$1,ROW($A$9:$A$24),""),ROW()-ROW($E$8))
This shows each hit for a match of the search term found in B1. Here is where I am stuck. I could only figure out how to resolve with the admittedly manual:
=SMALL(IF($A$9:$A$24={"a","b","c"},ROW($A$9:$A$24),""),ROW()-ROW($E$8))
Any suggestions on how to improve to allow multiple hits for multiple terms?
EDIT - Additional option
Okay, I've determined another method of picking up multiple hits. This one relies on considering the location of the previous matches already made. Depending on what you want your result vector to look like (which was never specified by the OP), the results from this are clean but the formula is fairly messy.
The first cell looks like this, in cell H9:
=ADDRESS(MIN(IFERROR(MATCH($B$1:$B$4,$A$9:$A$24,0),""))+ROW($A$8),1)
This shows the address of the first cell which matches any of the search terms, using the formula noted further above.
The cell below that (and every cell after that), has this (also an array formula):
=ADDRESS(MIN(IFERROR(MATCH($B$1:$B$4,INDIRECT(ADDRESS(ROW(INDIRECT(H9))+1,1)):$A$25,0),""))+ROW(INDIRECT(H9)),1)
This picks up the address of the cell found in the row above (adding 1 row to avoid re-hitting the same term), and from that new search column from that point to the end point (adding 1 row so that it properly stops at the last ending hit), it re-searches for any of the terms.
This one is again, not that clean [Yes I know there are some improvements I could make to determining what the search should be - either using the text manipulation functions or even doing a relative name reference that changes as you move down the column], but it is automated and, I would argue, cleaner than a VBA module. Especially as, depending on what you want your result vector to be, this could be much simpler.
Building on the formulas posted by #Grade'Eh'Bacon, I ended up with this formula, which retrieves all the results of a match with several matches for several items.
Assuming the input range is B3:B17 and the items to match are in F3:F5, enter this array formula in H3:
=IFERROR( SMALL( IF( $B$3:$B$17 = TRANSPOSE( $F$3:$F$5 ),
1 + ROW( $B$3:$B$17 ) - ROW( $B$3 ), "" ), ROWS($2:2 ) ), "" )
It is an array formula that returns all matches for several items.
All credit goes to #Grade'Eh'Bacon for his great work on the subject.
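The SMALL(IF(range = terms, position, ""), k) pattern collects the position of every row that equals any search term and then picks the k-th smallest. A minimal Python sketch of that logic follows; the function names and data are hypothetical, and it assumes the search terms are distinct.

```python
def match_positions(values, terms):
    """1-based positions in `values` whose entry equals any of `terms`, ascending."""
    wanted = set(terms)
    return [pos for pos, value in enumerate(values, start=1) if value in wanted]


def kth_match(values, terms, k):
    """k-th smallest matching position, or "" when there are fewer than k hits."""
    positions = match_positions(values, terms)
    return positions[k - 1] if 1 <= k <= len(positions) else ""


values = ["a", "x", "b", "a", "c", "y"]
terms = ["a", "b", "c"]
print(match_positions(values, terms))
print(kth_match(values, terms, 2))
```

Returning "" once the hits run out mirrors the outer IFERROR(..., "") in the formula.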
It is not possible with the built-in MATCH; however, using a VBA function, you can achieve this:
Public Function MATCH_RANGE(values As Variant, ary As Variant, match_type As Integer)
    Dim i As Integer
    Dim j As Integer
    Dim elementCount As Integer
    Dim result()
    Dim value As Variant
    Dim arySize As Integer
    arySize = UBound(ary.Value2, 1)
    Dim valueSize As Integer
    valueSize = UBound(values.Value2, 1)
    ' One row per possible hit; only column 0 is filled.
    ReDim result(0 To arySize, 0 To 1)
    elementCount = 0
    For i = 1 To arySize
        For j = 1 To valueSize
            value = values(j, 1)
            If (match_type = -1 And ary(i, 1) <= value) Or (match_type = 0 And ary(i, 1) = value) Or (match_type = 1 And ary(i, 1) >= value) Then
                result(elementCount, 0) = i
                elementCount = elementCount + 1
            End If
        Next j
    Next i
    ' Mark the unused slots with a sentinel value.
    For i = elementCount To arySize
        result(i, 0) = -100000000
    Next i
    MATCH_RANGE = result
End Function
This function both returns multiple matches and allows you to pass a range of multiple values that you want matched. I've found this useful a number of times. Feedback welcome to help improve this.
NOTE: You must spread this formula across several cells as an array formula (CTRL-SHIFT-ENTER) in order to see the multiple matches.

Calculate SUM without Adding Values to Rows/Columns?

When calculating series in Excel, most tutorials begin by setting sequence values to certain range of cells, say
A1=1, A2=2, A3=3,..., A10=10
and to get the value of 1+2+...+10, execute
A11=SUM(A1:A10)
But I don't want the "generate the sequence in worksheet cells first" part because initially I don't know the 'n' (10 in the above) and I want to define a custom function that takes n as a function argument.
So, is there a way to do something like this?
B1 = SUM([1:10]) // adding array of 'constants', not cell reference
EDIT: If I could 'summon' some (array of) big number(s) without any cell/ROW/COL operation as in calling rand(), that would be great.
Try using an array formula as below:
=SUM(ROW(A1:A10)) and then press CTRL+SHIFT+ENTER
ROW(A1:A10) will become {1;2;3;4;5;6;7;8;9;10}.
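The ROW trick only uses the cell range as a way to generate the integers 1 to n. In Excel versions with dynamic arrays, =SUM(SEQUENCE(10)) should give the same result without a cell reference. As a minimal Python sketch of what a custom function taking n would compute (the function names are hypothetical), both the generated sequence and the closed form n(n+1)/2 are shown:

```python
def sum_to_n(n):
    """Sum 1 + 2 + ... + n by generating the sequence, like SUM(ROW(A1:An))."""
    return sum(range(1, n + 1))


def sum_to_n_closed_form(n):
    """Same result from the closed form n * (n + 1) / 2, no sequence needed."""
    return n * (n + 1) // 2


print(sum_to_n(10), sum_to_n_closed_form(10))
```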
Usage:
If you want to sum cells A20 to A50
=sumJeff("A", 20, 50)
Code
Function sumJeff(letter As String, nFrom As Integer, nTo As Integer) As Double
    Dim strAddress As String
    strAddress = letter & nFrom & ":" & letter & nTo
    sumJeff = Application.WorksheetFunction.Sum(Range(strAddress))
End Function

vector element operations in excel

I am trying to perform the following in one step (one formula):
Strip a letter from each element of a column and add the numbers up.
Example:
Data:
1x
2y
3x
I want to strip letters and add up numbers all in one formula.
I understand that I could have a helper column in which I strip letters x,y,z and then have a formula to add up the numbers, but I don't want to do this.
Thanks for any suggestions.
Assuming one entry per cell:
Is there only one letter at the end? If so, you can use:
=SUMPRODUCT(--LEFT(A1:A100,LEN(A1:A100)-1))
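If there might be multiple letters at the end, a UDF would be simpler:
Option Explicit
Function AddStrings(rg As Range)
    ' Double rather than Long, so decimal values are not rounded away.
    Dim L As Double
    Dim I As Long
    For I = 1 To rg.Count
        ' Val reads the leading number and ignores the trailing letters.
        L = L + Val(rg(I))
    Next I
    AddStrings = L
End Function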
EDIT: If some of the cells might be blank, you can use either the UDF, or, if you prefer the formula, this array-entered formula:
=SUM(IFERROR(--LEFT(A1:A100,LEN(A1:A100)-1),0))
To array-enter a formula, after entering the formula into the cell or formula bar, hold down ctrl-shift while hitting enter. If you did this correctly, Excel will place braces {...} around the formula.
Assuming that the format is consistent, you can do something like
=VALUE(LEFT(A1,1))+VALUE(MID(A1,4,1))+VALUE(MID(A1,7,1))
If the format is not consistent, things get more difficult. Let me know and I will expand the answer.
EDIT:
This function works with variable-length text, assuming that the fields are separated by spaces and each has one letter after the number:
Function AddValues(Text As String)
    Dim Tokens() As String, I As Integer
    Tokens = Split(Text)
    For I = 0 To UBound(Tokens)
        AddValues = AddValues + Val(Left(Tokens(I), Len(Tokens(I)) - 1))
    Next I
End Function

Resources