SUMPRODUCT(1/COUNTIF(range, criteria)): why does this work? (Excel)

I need to count how many different values are in a range. I got the answer by using SUMPRODUCT(1/COUNTIF(A2:A37,A2:A37)), but I don't understand the formula. Can someone please help explain it?
If I do the COUNTIF separately, the result is 0. How does SUMPRODUCT(1/COUNTIF) help? Also, inside the COUNTIF, the range and the criteria are the same. What does this mean? I understand that the range is where we look and the criteria is what we look for, but if the criteria is the entire range, how are we specifying what we're looking for? How does this work?
Here is my sample input (image not included).

This just extends some of the comments on your question for a better understanding. In "Excel dynamic arrays, functions and formulas" you can find a detailed explanation of dynamic arrays and implicit intersection, and of how they have changed as Excel has evolved.
SUMPRODUCT(array1, [array2], ...) can be used with a single argument, in which case it sums all the elements of array1, i.e. it behaves like the SUM function (see Excel SUMPRODUCT Function). For example: SUMPRODUCT({1;2;3}) = 6.
Note: Multiplication is the default operation of SUMPRODUCT, but you can use others (check this link) by replacing the comma with the corresponding operator. For example: SUMPRODUCT({1;2;3}/{1;2;3}) = 3 (it acts as COUNT), i.e. 1/1+2/2+3/3, whereas SUMPRODUCT({1;2;3},{1;2;3}) = 14, i.e. 1*1+2*2+3*3.
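To make the behaviour described above concrete, here is a minimal Python sketch (not Excel itself; the helper name sumproduct is made up for illustration). It multiplies the arrays element by element and sums the products, so with a single array there is nothing to multiply by and it reduces to a plain sum.

```python
from math import prod

def sumproduct(*arrays):
    """Rough stand-in for Excel's SUMPRODUCT: multiply element-wise, then sum."""
    if not arrays:
        raise ValueError("need at least one array")
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        # Excel reports #VALUE! when the arrays differ in size.
        raise ValueError("arrays must have the same length")
    return sum(prod(items) for items in zip(*arrays))

print(sumproduct([1, 2, 3]))             # single array: 1 + 2 + 3 = 6
print(sumproduct([1, 2, 3], [1, 2, 3]))  # 1*1 + 2*2 + 3*3 = 14

# The division happens before SUMPRODUCT sees the data,
# so it receives one array of ratios and simply sums it.
ratios = [a / b for a, b in zip([1, 2, 3], [1, 2, 3])]
print(sumproduct(ratios))                # 1/1 + 2/2 + 3/3 = 3.0
```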
COUNTIF(range, criteria): the first argument has to be a range. The second argument can be a number, expression, cell reference, or text string that determines which cells will be counted, but it can also be a range or an array (see Excel COUNTIF function). In that case, COUNTIF is evaluated for each element of criteria and returns an array of the same size and shape as criteria. For example, if the range A1:A4 has the following values:
|1|
|2|
|1|
|1|
the following expressions will return:
COUNTIF(A1:A4, 1) = 3
COUNTIF(A1:A4, A1:A2) = {3;1} i.e. 2x1 array
COUNTIF(A1:A4, {1;2}) = {3;1}
COUNTIF(A1:A4, {1,2}) = {3,1} i.e. 1x2 array
and for a two dimensional criteria {1,2;0,1} a 2x2 array:
COUNTIF(A1:A4, {1,2;0,1}) = {3,1;0,3} i.e. 2x2 array
Note: Remember that you cannot call the function with an array as the first argument. For example, COUNTIF({1;2;1;1}, 1) returns an error. It has to be a range.
So we can count how many times each element of the first argument is repeated like this:
COUNTIF(A1:A4,A1:A4) = {3;1;3;3} i.e. 4x1 array
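The "one result per criterion" behaviour can be imitated in a few lines of Python. This is only a sketch of the idea, assuming exact-match criteria; the function countif and the list a1_a4 are illustrative names, not anything Excel exposes.

```python
def countif(cells, criteria):
    """Count matches; a list of criteria yields a result of the same shape."""
    if isinstance(criteria, list):
        # Evaluate once per criterion, preserving the (possibly nested) shape.
        return [countif(cells, c) for c in criteria]
    return sum(1 for value in cells if value == criteria)

a1_a4 = [1, 2, 1, 1]                     # the sample values in A1:A4

print(countif(a1_a4, 1))                 # 3
print(countif(a1_a4, [1, 2]))            # [3, 1]
print(countif(a1_a4, [[1, 2], [0, 1]]))  # [[3, 1], [0, 3]]
print(countif(a1_a4, a1_a4))             # [3, 1, 3, 3]
```

Passing the range itself as the criteria is just the last case: every cell becomes a criterion, so every cell gets its own occurrence count.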
Now back to SUMPRODUCT (remember with a single argument it is just the sum of the elements):
SUMPRODUCT(COUNTIF(A1:A4,A1:A4))
= SUMPRODUCT({3;1;3;3})
= 10
i.e. sum of occurrences of each element of A1:A4
and finally:
SUMPRODUCT(1/COUNTIF(A1:A4,A1:A4))
= SUMPRODUCT({1/3;1/1;1/3;1/3})
= 3*(1/3) + 1
= 2
which is the total number of unique elements in A1:A4. As @Harun24hr pointed out, in Microsoft 365 this can be achieved with COUNTA(UNIQUE(A1:A4)), or with COUNT(UNIQUE(A1:A4)) since A1:A4 holds numbers in this answer's example. In the sample from your question (letters) you have to use COUNTA.
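Why the reciprocals add up to the number of different values: a value that occurs k times contributes k terms of 1/k, which sum to exactly 1. A small Python sketch of that argument follows. The name count_distinct is made up, and Fraction is used only to keep the arithmetic exact in the sketch, whereas Excel works with floating-point numbers.

```python
from collections import Counter
from fractions import Fraction

def count_distinct(values):
    """Sum 1/count(v) over every cell; each distinct value contributes exactly 1."""
    counts = Counter(values)             # plays the role of COUNTIF(range, range)
    return sum(Fraction(1, counts[v]) for v in values)

numbers = [1, 2, 1, 1]
print(count_distinct(numbers))           # 1/3 + 1/1 + 1/3 + 1/3 = 2
print(len(set(numbers)))                 # same answer, closer to COUNTA(UNIQUE(...))
```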

Related

How to select a column IF [duplicate]

Is there a formula that returns a value from the first line matching two or more criteria? For example, "return column C from the first line where column A = x AND column B = y". I'd like to do it without concatenating column A and column B.
Thanks.
True = 1, False = 0
D1 returns 0 because 0 * 1 * 8 = 0
D2 returns 9 because 1 * 1 * 9= 9
This should let you change the criteria:
I use INDEX/MATCH for this. Ex:
I have a table of data and want to return the value in column C where the value in column A is "c" and the value in column B is "h".
I would use the following array formula:
=INDEX($C$1:$C$5,MATCH(1,(($A$1:$A$5="c")*($B$1:$B$5="h")),0))
Commit the formula by pressing Ctrl+Shift+Enter
After entering the formula, you can use Excel's formula auditing tools to step through the evaluation to see how it calculates.
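If it helps to see the evaluation spelled out, here is a rough Python equivalent of that INDEX/MATCH array formula. The sample data and the function name lookup_two_criteria are assumptions for illustration, and the string comparison here is case-sensitive, unlike Excel's.

```python
def lookup_two_criteria(col_a, col_b, col_c, a_value, b_value):
    # (A="c")*(B="h"): TRUE*TRUE gives 1, every other combination gives 0.
    flags = [int(a == a_value) * int(b == b_value) for a, b in zip(col_a, col_b)]
    try:
        position = flags.index(1)        # MATCH(1, flags, 0), zero-based here
    except ValueError:
        return None                      # Excel would show #N/A
    return col_c[position]               # INDEX(col_c, position)

# Hypothetical data standing in for A1:C5.
col_a = ["a", "c", "b", "c", "c"]
col_b = ["h", "g", "h", "h", "h"]
col_c = [10, 20, 30, 40, 50]

print(lookup_two_criteria(col_a, col_b, col_c, "c", "h"))  # 40, the first row where both match
```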
SUMPRODUCT definitely has value when the sum over multiple criteria matches is needed. But the way I read your question, you want something like VLOOKUP that returns the first match. Try this:
For your convenience the formula in G2 is as follows -- requires array entry (Ctrl+Shift+Enter)
[edit: I updated the formula here but not in the screen shot]
=INDEX($C$1:$C$6,MATCH(E2&"|"&F2,$A$1:$A$6&"|"&$B$1:$B$6,0))
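The "|" in that formula matters: without a separator, different pairs can produce the same joined key. A short Python sketch of the joined-key lookup shows the difference. The data and the name lookup_by_joined_key are made up for the example.

```python
def lookup_by_joined_key(col_a, col_b, col_c, a_value, b_value, sep="|"):
    """Join the two key columns per row, then find the first matching key."""
    keys = [f"{a}{sep}{b}" for a, b in zip(col_a, col_b)]
    target = f"{a_value}{sep}{b_value}"
    return col_c[keys.index(target)] if target in keys else None

# Hypothetical rows where plain concatenation is ambiguous: "ab"&"c" = "a"&"bc".
col_a = ["ab", "a"]
col_b = ["c", "bc"]
col_c = ["first", "second"]

print(lookup_by_joined_key(col_a, col_b, col_c, "a", "bc"))          # second
print(lookup_by_joined_key(col_a, col_b, col_c, "a", "bc", sep=""))  # first (wrong row)
```

The separator only helps if it cannot occur inside the key values themselves.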
Two things to note:
SUMPRODUCT won't work if the result type is not numeric
SUMPRODUCT will return the SUM of results matching the criteria, not the first match (as VLOOKUP does)
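The second point is easy to see with two matching rows. In this Python sketch (hypothetical data, illustrative function names) the SUMPRODUCT-style calculation adds both matches together, while a first-match lookup stops at the first one.

```python
def sumproduct_match(col_a, col_b, col_c, a_value, b_value):
    """SUMPRODUCT((A=x)*(B=y)*C): every matching row adds its C value."""
    return sum((a == a_value) * (b == b_value) * c
               for a, b, c in zip(col_a, col_b, col_c))

def first_match(col_a, col_b, col_c, a_value, b_value):
    """What a VLOOKUP-like lookup returns: the C value of the first matching row."""
    for a, b, c in zip(col_a, col_b, col_c):
        if a == a_value and b == b_value:
            return c
    return None

col_a = ["a", "c", "b", "c", "c"]
col_b = ["h", "g", "h", "h", "h"]
col_c = [10, 20, 30, 40, 50]

print(sumproduct_match(col_a, col_b, col_c, "c", "h"))  # 40 + 50 = 90
print(first_match(col_a, col_b, col_c, "c", "h"))       # 40
```

The two only agree when exactly one row satisfies the criteria.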
Apparently you can use the SUMPRODUCT function.
Actually, I think what he is asking for is the typical "display multiple results" approach in Excel. It can be done using the SMALL and ROW functions in array formulas.
This displays all the results that match the different criteria.
Here is an answer that shows how to do this using SUMPRODUCT and table header lookups. The main advantage to this: it works with any value, numeric or otherwise.
So let's say we have headers H1, H2 and H3 on some table called MyTable. And let's say we are entering this into row 1, possibly on another sheet. And we want to match H1, H2 to x, y on that sheet, respectively, while returning the matching value in H3. Then the formula would be as follows:
=INDEX(MyTable[H3], ROUND(SUMPRODUCT(MATCH(TRUE, (MyTable[H1] & MyTable[H2]) = ($x1 & $y1),0)),0),1)
What does it do? The SUMPRODUCT ensures everything is treated as arrays. So you can concatenate entire table columns together to make an array of concatenated values, dynamically calculated. You can then compare these to the existing values in x and y: the concatenated array from the table is compared element by element with the single concatenation of x & y, which gives you an array of TRUE/FALSE values. Matching that to TRUE yields the first match of the lookup. Then all we need to do is go back and index that in the original table.
Notes
The rounding is just in there to make sure the Index function gets back an integer. I got #N/A values until I rounded.
It might be more instructive to run this through the evaluator to see what's going on...
This can easily be modified to work without a table: just replace the table references with raw ranges. Tables are clearer though, so use them if possible. I found the original source for this here: http://dailydoseofexcel.com/archives/2009/04/21/vlookup-on-two-columns/. But there was a bug with rounding values to INTs, so I fixed that.

INDEX/MATCH with 4 columns

I have an Excel file with 2 sheets - one sheet contains my items, prices, codes, etc. and the other sheet is for cross-matching with competitors.
I've included an Excel file and image below.
I want to be able to generate my code automatically when manually entering any of my competitor's codes. I was able to do INDEX/MATCH but I was only able to match with one column (I'm assuming they're all in one sheet to make it easier). Here is my formula:
=INDEX(C:C,MATCH(K2,E:E,0))
So this looks only in E:E. When I tried a different column such as C:C or D:D, it returned an error.
I tried to do the MATCH as C:G but it gave an error right away.
MATCH gave you an error because its lookup array must be a single row or a single column, and you gave it multiple columns.
There is definitely a more elegant way to do this but this is the first one that I came up with.
=IFERROR(INDEX(B:B,MATCH(K2,C:C,0)),IFERROR(INDEX(B:B,MATCH(K2,D:D,0)),IFERROR(INDEX(B:B,MATCH(K2,E:E,0)),IFERROR(INDEX(B:B,MATCH(K2,F:F,0)),IFERROR(INDEX(B:B,MATCH(K2,G:G,0)),"")))))
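The nested IFERROR chain is a "try the next column if this one fails" loop written out by hand. A Python sketch of the same logic is below. The function name and the sample codes are invented for illustration.

```python
def first_match_across_columns(code, my_codes, competitor_columns):
    """Try each competitor column in order, like the nested IFERROR calls."""
    for column in competitor_columns:    # C, then D, E, F, G
        if code in column:               # MATCH(K2, column, 0) succeeded
            return my_codes[column.index(code)]  # INDEX(B:B, position)
    return ""                            # the final IFERROR fallback

# Hypothetical sheet: column B plus two competitor columns.
my_codes = ["My code 1", "My code 2", "My code 3"]
competitor_columns = [
    ["C-1", "C-2", "C-3"],
    ["D-1", "D-2", "D-3"],
]

print(first_match_across_columns("D-2", my_codes, competitor_columns))  # My code 2
```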
Index/Match Combination
Please try this formula:
{=INDEX($B$2:$B$5,MATCH(1,(K2=$C$2:$C$5)+(K2=$D$2:$D$5)+(K2=$E$2:$E$5)+(K2=$F$2:$F$5)+(K2=$G$2:$G$5),0))}
Instruction: Paste the formula {without the curly brackets} to the formula bar and hit CTRL+SHIFT+ENTER while the cell is still active. This will create an array formula. Hence, the curly brackets. Please take note though that manually entering the curly brackets will not work.
Description:
The INDEX function returns a value or the reference to a value from within a table or range. [1]
The MATCH function searches for a specified item in a range of cells, and then returns the relative position of that item in the range. [2]
Syntax:
The INDEX function has two forms: Array and Reference. We're going to use the Reference form in this case.
INDEX(reference, row_num, [column_num], [area_num]) [1]
MATCH(lookup_value, lookup_array, [match_type]) [2]
Explanation:
To simplify, we're going to use this form:
INDEX(reference, MATCH(lookup_value, lookup_array, [match_type]))
The INDEX function returns a value from the reference, the My code column (B2:B5), based on the row_num argument, which serves as an index number pointing to the right cell. We supply row_num by substituting the MATCH function for it.
The MATCH function, on the other hand, returns the relative position of the value in a competitor n column that matches the value in the individual cell of the competitor code column.
To make it work with multiple lookup ranges, we create arrays of boolean values (TRUE/FALSE, aka logical values) by comparing the value from the individual cell in the competitor code column with the values in each competitor n column. We then convert these boolean values into numbers by performing a mathematical operation that does not alter their implied value (i.e. TRUE=1, FALSE=0). To keep it simple, we add the arrays together directly. The resulting array has four elements, each with two possible values: 1 or 0. Since each item in MATCH's lookup_array is unique, there can be only one TRUE, or 1; the rest are FALSE, or 0. With that knowledge, we use 1 as our lookup_value.
Let's dissect the formula:
=INDEX(B2:B5,MATCH(1,(K2=C2:C5)+(K2=D2:D5)+(K2=E2:E5)+(K2=F2:F5)+(K2=G2:G5),0))
My code 2 = INDEX({"My code 1";"My code 2";"My code 3";"My code 4"},MATCH(...))
My code 2 = INDEX({"My code 1";"My code 2";"My code 3";"My code 4"},2)
2 = MATCH(1,(K2=C2:C5)+(K2=D2:D5)+(K2=E2:E5)+(K2=F2:F5)+(K2=G2:G5),0)
2 = MATCH(1,
{FALSE;FALSE;FALSE;FALSE}+
{FALSE;FALSE;FALSE;FALSE}+
{FALSE;FALSE;FALSE;FALSE}+
{FALSE;FALSE;FALSE;FALSE}+
{FALSE;TRUE;FALSE;FALSE},0)
OR
2 = MATCH(1,
{0;0;0;0}+
{0;0;0;0}+
{0;0;0;0}+
{0;0;0;0}+
{0;1;0;0},0)
=========
{0;1;0;0}
2 = MATCH(1,{0;1;0;0},0)
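The same dissection can be followed in a short Python sketch, offered as an illustration only (the competitor codes and the function name match_any_column are made up). It adds the per-column comparison results row by row and then looks for the first total equal to 1. As in the formula, a code that appeared in two columns of the same row would give a total of 2 and would not be found.

```python
def match_any_column(code, my_codes, competitor_columns):
    # (K2=C2:C5)+(K2=D2:D5)+...: add the TRUE/FALSE results row by row.
    totals = [
        sum(int(column[row] == code) for column in competitor_columns)
        for row in range(len(my_codes))
    ]
    if 1 not in totals:
        return None                      # MATCH would return #N/A
    return my_codes[totals.index(1)]     # INDEX(B2:B5, MATCH(1, totals, 0))

my_codes = ["My code 1", "My code 2", "My code 3", "My code 4"]
# Hypothetical stand-ins for columns C to G, four rows each.
competitor_columns = [
    [f"{letter}{row}" for row in range(1, 5)] for letter in "CDEFG"
]

print(match_any_column("G2", my_codes, competitor_columns))  # My code 2
```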
I hope this answer is helpful.
References and links:
[1] INDEX function
[2] MATCH function
[3] Create an array formula

What is the principle of sumproduct function in Excel?

This function can also reproduce something like a SUMIFS result when I put an array into the criteria argument: =SUMPRODUCT(SUMIFS(sumrange, criteria range, criteria)). It gives me the sum of the sums for all the criteria in the array. If I just use =SUMIFS(sumrange, criteria range, criteria) and the criteria is an array, the result is zero. I could only check the result by pressing F9, which shows the sum for each criterion separately. Why can the SUMPRODUCT function add up all the separate values here?
When you use this formula
=SUMIFS(sumrange,criteria range, criteria)
with a range in place of the criteria, the result is an array of values.
To get the result you want you can wrap that in the SUM function, but then the formula needs to be "array entered" with CTRL+SHIFT+ENTER, so using SUMPRODUCT is a way to avoid array entry. Because the result of SUMIFS is a single array, SUMPRODUCT has nothing to multiply, so it just sums the array.
In short, when fed a single array SUMPRODUCT just sums the contents, e.g.
=SUMPRODUCT({1,2,3}) = 6
SUMPRODUCT is a rather particular function, and you'll often see it used as a workaround for array formulas, but only as a substitute for single-cell array formulas (array formulas entered into a single cell), because it works with arrays by default.
In your particular case, you seem to get 0 with SUM(SUMIFS) by pure coincidence, because the sum of the elements that match the first element of your criteria array is 0.
If you have, for example, a table as such:
|A|B|C|
|1|t|t|
|2|f|f|
|3|t| |
|4|x| |
|5|x| |
and use the formula:
=SUM(SUMIFS($A$1:$A$5,$B$1:$B$5,$C$1:$C$2))
You will get 4 as a result, because it only evaluates the first element of your criteria array. If you instead enter it as an array formula with Ctrl + Shift + Enter, it evaluates the condition for every element of your criteria array, working like this:
Evaluate =SUMIFS({1,2,3,4,5},{t,f,t,x,x},{t})
Return =SUM({1,2,3,4,5}*{1,0,1,0,0}) => SUM({1,0,3,0,0}) => 4
Store 4 in the first position of the temporary array {4,NULL}
Evaluate =SUMIFS({1,2,3,4,5},{t,f,t,x,x},{f})
Return =SUM({1,2,3,4,5}*{0,1,0,0,0}) => SUM({0,2,0,0,0}) => 2
Store 2 in the second position of the temporary array {4,2}
Evaluate =SUM({4,2})
Return 6
Which is pretty much the exact same thing the SUMPRODUCT formula would do.
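The steps above can be sketched in Python using the same table. This only mimics the evaluation order described here; the function sumifs is a simplified, single-condition stand-in with exact-match criteria.

```python
def sumifs(sum_range, criteria_range, criteria):
    """One criterion gives one sum; a list of criteria gives a list of sums."""
    if isinstance(criteria, list):
        return [sumifs(sum_range, criteria_range, c) for c in criteria]
    return sum(s for s, c in zip(sum_range, criteria_range) if c == criteria)

col_a = [1, 2, 3, 4, 5]
col_b = ["t", "f", "t", "x", "x"]
col_c = ["t", "f"]                       # the criteria array C1:C2

per_criterion = sumifs(col_a, col_b, col_c)
print(per_criterion)                     # [4, 2]
print(sum(per_criterion))                # 6, what SUMPRODUCT(SUMIFS(...)) returns
print(sumifs(col_a, col_b, col_c[0]))    # 4, when only the first criterion is used
```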
Let me know if there's any further clarification needed.

Function to count all rows with multiple criteria

I'm trying to use the COUNTIFS statement to count all rows where values in 4 different columns equal "something", while excluding rows where the values in two columns are equal. This is what I have for counting the rows where the 4 columns equal "something" but I can't figure out how to add the last part:
=COUNTIFS(A2:A100,"something",B2:B100,"something",C2:C100,"something",D2:D100,"something", [...])
Now I need to add another statement within this COUNTIFS at the [...] that says something like "exclude all rows where value in J is equal to value in K", but I can't seem to figure out how to do that WITHIN the COUNTIFS statement.
You likely have to move to a SUMPRODUCT function.
=SUMPRODUCT((A2:A100="something")*(B2:B100="something")*(C2:C100="something")*(D2:D100="something")*(J2:J100<>K2:K100))
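Avoid full-column references (such as A:A) in SUMPRODUCT: it evaluates every cell of every array it is given, so full columns make it process over a million rows per column and slow the workbook down.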
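For anyone unsure how the multiplied conditions turn into a count, here is a hedged Python sketch of the same row-by-row test. The data and the name count_rows are invented, and the text comparison is case-sensitive here, unlike Excel's.

```python
def count_rows(columns_abcd, col_j, col_k, wanted="something"):
    """Count rows where A..D all equal `wanted` and J differs from K."""
    count = 0
    for *abcd, j, k in zip(*columns_abcd, col_j, col_k):
        # Each comparison is TRUE/FALSE; multiplying them in Excel acts as AND.
        if all(value == wanted for value in abcd) and j != k:
            count += 1
    return count

# Hypothetical rows: only the first passes every test.
col_a = ["something", "something", "something"]
col_b = ["something", "something", "other"]
col_c = ["something", "something", "something"]
col_d = ["something", "something", "something"]
col_j = [1, 5, 7]
col_k = [2, 5, 8]

print(count_rows([col_a, col_b, col_c, col_d], col_j, col_k))  # 1
```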
You could use an array formula for this.
{=SUM((A1:A1000="something")*(B1:B1000="something")*(C1:C1000="something")*(D1:D1000="something")*(J1:J1000<>K1:K1000))}
To enter an array formula, type it without the curly brackets and press Ctrl+Shift+Enter.
More info at Excel Array Functions
