Distinct count of values in a column if two other cells match - excel

Struggling a bit with array formulas to count distinct values in one column, in the rows where two other cells match. Sorry if I can't explain it any better. The best approach is to show you the formula I created and provide some sample data:
Sheet 1
Column A      Column B
Period 101    Code X
Period 309    Code Y
Period 101    Code Y
Period 101    Code Z
Period 404    Code Y
Period 101    Code X
Period 101    Code X
Period 404    Code X
Period 404    Code Z
Sheet 2
Column A      Column B (where the formula should be)
Code X        2
Code Y        3
Code Z        2
Basically I want to count the distinct values in Sheet 1 column A, only where the value in Sheet 1 column B matches the value in sheet 2 column A. I have provided the expected outcome for the three code values.
I have tried with the following formula, but I am unable to count distinct values in another column where the two cells match:
{=SUM(--(FREQUENCY(IF(C5:C11=G5,MATCH(B5:B11,B5:B11,0)),ROW(B5:B11)-ROW(B5)+1)>0))}
Please ignore the rows and columns used in the formula. Also, the values in columns A and B on Sheet 1 both occur multiple times, but the values in column A on Sheet 2 only occur once.
I am curious how someone would solve this one. Thank you in advance.

Caveat: This answer is unlikely to be useful to the OP, as these techniques are as yet only available to Excel Insiders.
But once these new features are available to the mainstream, they will be a game changer.
This uses the new Dynamic Array feature coming to Excel soon.
To create the list of unique values from Column B, place this formula in a single cell. Excel will "Spill" into as many rows as needed to return the unique list of values from Column B. For example, I have used cell E2
=UNIQUE(FILTER($B:$B,$B:$B<>""))
Now, place this formula in the next adjacent cell. I've used F2:
=COUNTA(UNIQUE(FILTER($A:$A,$B:$B=$E2)))
Unlike the list in column E, this formula returns a single count per cell and does not spill, so copy it down alongside the spilled list in column E.
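For readers who want to check the expected numbers outside Excel, here is a minimal Python sketch of the same idea (filter rows by code, then count unique periods). It is only an illustration of the logic, not part of the Excel answer; the function name `distinct_periods_by_code` is made up for this example, and the data is the sample from the question.

```python
# Sample rows from Sheet 1 in the question: (Column A period, Column B code)
rows = [
    ("Period 101", "Code X"),
    ("Period 309", "Code Y"),
    ("Period 101", "Code Y"),
    ("Period 101", "Code Z"),
    ("Period 404", "Code Y"),
    ("Period 101", "Code X"),
    ("Period 101", "Code X"),
    ("Period 404", "Code X"),
    ("Period 404", "Code Z"),
]


def distinct_periods_by_code(rows):
    """Map each code to the number of distinct periods it appears with."""
    seen = {}
    for period, code in rows:
        # Collect periods per code in a set, which drops duplicates
        seen.setdefault(code, set()).add(period)
    return {code: len(periods) for code, periods in seen.items()}


counts = distinct_periods_by_code(rows)
print(counts)  # {'Code X': 2, 'Code Y': 3, 'Code Z': 2}
```

The result agrees with the expected values listed for Sheet 2 in the question.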

Your formula doesn't match your sample data but let's assume the below:
Formula in H5:
=SUM(--(FREQUENCY(IF(C$5:C$13=G5,MATCH(B$5:B$13,B$5:B$13,0)),ROW(B$5:B$13)-ROW(B$5)+1)>0))
Enter it as an array formula with Ctrl+Shift+Enter and drag it down.
Notice the mixed (row-absolute) cell references (you used relative ones) and that my ranges are larger than yours (yours only covered C5:C11).
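To see why the FREQUENCY/MATCH combination counts distinct values, the steps of the formula can be mirrored one by one in a short Python sketch. This is an illustration only, assuming the question's sample data; `distinct_count_if` is a hypothetical helper name, and each comment names the Excel function it imitates.

```python
periods = ["Period 101", "Period 309", "Period 101", "Period 101", "Period 404",
           "Period 101", "Period 101", "Period 404", "Period 404"]
codes = ["Code X", "Code Y", "Code Y", "Code Z", "Code Y",
         "Code X", "Code X", "Code X", "Code Z"]


def distinct_count_if(values, criteria, wanted):
    # MATCH(values, values, 0): 1-based position of each value's first occurrence
    first_pos = [values.index(v) + 1 for v in values]
    # IF(criteria = wanted, first_pos): keep positions only on matching rows
    kept = [p for p, c in zip(first_pos, criteria) if c == wanted]
    # FREQUENCY(kept, 1..n): how many kept positions land in each bin
    freq = [kept.count(b) for b in range(1, len(values) + 1)]
    # SUM(--(freq > 0)): every non-empty bin is one distinct value
    return sum(1 for f in freq if f > 0)


for code in ("Code X", "Code Y", "Code Z"):
    print(code, distinct_count_if(periods, codes, code))
```

Because every repeat of a value maps to the position of its first occurrence, repeats fall into the same bin and are counted once.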

Related

Reference named ranges in external workbook with formula criteria

Need Help on Named Ranges in Formulas:
I have a second workbook ('TEST.xlsx') as the destination, referencing worksheet-scoped named ranges (12 columns x 75 rows) in the source workbook ('FLOW.xlsx'). I want to create a formula that will match a look-up value (a date entered into cell C3 in TEST) and return the matching named range IF there are 2 or more blank cells in that matched named range/column, along with the remaining named ranges/columns in that set of 12 columns that have 2+ blank cells. The 12 separate columns in the source workbook ('FLOW') are named by month, year and location (e.g., "jan_2019_class.1", "feb_2019_class.1", etc.), the worksheet columns being C, H, M, R, W, AB, AG, AL, AQ, AV, BA, and BF. The rows are 80-155. I've only been able to make a simple working COUNTBLANK formula in my TEST workbook, e.g.:
=COUNTBLANK('[FLOW.xlsx]Class_1-Chart'!jan_2019_class.1)
But NOT for successive columns (with different named ranges and the columns are non-sequential); and I can't figure out the functioning formula to combine with this to get the count AND data returned by criteria as described above. Please, no VBA/macros.
Thank you in advance for the help!
'TEST.xlsx' Screen Shot-RVSD
FLOW.xlsx- sample screenshot
There are many approaches but I personally prefer the use of helper rows/columns/cells and named ranges.
In my demonstration I used two class attendant schedules from two different years, each running from January to June, as shown below (they sit in columns C to M in my example):
As shown above, I have added two helper rows on top of each schedule. The first helper row checks whether there are 2 or more vacancies in each month and returns TRUE if so. I have named these rows check.2019.class.1 and check.2021.class.5 respectively.
The second helper row simply shows the range name of each month, such as jan_2019_class.1, feb_2019_class.1, etc. I have named these rows NameRng.2019.class.1 and NameRng.2021.class.5 respectively.
On the TEST sheet I have the following set up:
where the look up value in cell C3 is actually returned by a formula so it can be "dynamically" changed by the user. Please note in the following formula I used a name ClassNo which is essentially the value from cell B3.
=B2&"_"&B1&"_class."&ClassNo
I have also named cell C3 as Start_MthYrClass which will be used in my following formula.
The formula for looking up the first available month in 2019 if the start month is jan_2019_class.1 is:
=INDEX(NameRng.2019.class.1,MATCH(1,(TRANSPOSE(ROW($1:$11))>=MATCH(Start_MthYrClass,NameRng.2019.class.1,0))*Check.2019.class.1,0))
Please note that it is an array formula, so you MUST press Ctrl+Shift+Enter after typing it in the formula bar; otherwise it will not work correctly.
The logic is to first "filter" the range NameRng.2019.class.1 using this formula =TRANSPOSE(ROW($1:$11))>=MATCH(Start_MthYrClass,NameRng.2019.class.1,0), in which ROW($1:$11) represents {1;2;3;4;5;6;7;8;9;10;11} and TRANSPOSE will turn it into {1,2,3,4,5,6,7,8,9,10,11}. This range of numbers represents the column index in that specific range which is Column C to M (in your case it would be ROW($1:$56) as your data is in Column C to BF). Then I use MATCH to return the start column index of the look up month jan_2019_class.1, and it should return 1 as this month starts in the 1st place/column in the range NameRng.2019.class.1. So this is what I am actually comparing: {1,2,3,4,5,6,7,8,9,10,11}>=1, and it will return {TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE}.
Then I multiply the above result with range Check.2019.class.1 which is essentially {FALSE,0,FALSE,0,TRUE,0,FALSE,0,TRUE,0,TRUE}. Then I will get {0,0,0,0,1,0,0,0,1,0,1}. FYI in Excel TRUE=1 and FALSE=0, so TRUE x FALSE = 0 while TRUE x TRUE = 1.
Lastly, I use MATCH to find out the position of the first 1 in the above result which is the 5th place/column, and then use INDEX to return the corresponding value from range NameRng.2019.class.1 which is mar_2019_class.1.
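The filter-multiply-match logic described above can be traced in a small Python sketch. It is only an illustration under assumptions: the check row is the array quoted in the answer, while the contents of the name row are assumed (one month name in every other column, January to June, with empty cells in between), and `first_available` is a hypothetical helper name.

```python
# Assumed contents of NameRng.2019.class.1 (11 columns, names in every other cell)
names = ["jan_2019_class.1", "", "feb_2019_class.1", "", "mar_2019_class.1", "",
         "apr_2019_class.1", "", "may_2019_class.1", "", "jun_2019_class.1"]
# Check.2019.class.1 as quoted in the answer
check = [False, 0, False, 0, True, 0, False, 0, True, 0, True]


def first_available(start, names, check):
    # MATCH(start, names, 0): 1-based column index of the start month
    start_pos = names.index(start) + 1
    for pos in range(1, len(names) + 1):
        # (pos >= start_pos) * check: 1 only when both conditions hold
        if (pos >= start_pos) * check[pos - 1] == 1:
            # INDEX(names, pos): the first position where the product is 1
            return names[pos - 1]
    return None  # no month at or after the start has 2+ vacancies


print(first_available("jan_2019_class.1", names, check))  # mar_2019_class.1
print(first_available("apr_2019_class.1", names, check))  # may_2019_class.1
```

Starting from January, the first position whose product is 1 is the 5th, which is the March entry, matching the walkthrough above.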
Here is a more universal formula that you can enter in the first cell, C6, and drag down to apply across the board, provided you have named the relevant cells and ranges in the same way as I have demonstrated.
=IFERROR(INDEX(INDIRECT("NameRng."&B6&".class."&ClassNo),MATCH(1,(TRANSPOSE(ROW($1:$11))>=MATCH(Start_MthYrClass,INDIRECT("NameRng."&B6&".class."&ClassNo),0))*INDIRECT("Check."&B6&".class."&ClassNo),0)),"")
It is also an array formula so you MUST press Ctrl+Shift+Enter upon finishing the formula in the formula bar.
It is essentially the same formula as the first one but I have added IFERROR to return a blank cell if there is no match, and I used INDIRECT to refer to the named ranges dynamically based on the year and class number chosen.
Now, if I change the look up criteria to mar_2021_class.5, here is an updated result:
Let me know if you have any questions. Cheers :)

Can I use SUMPRODUCT to ignore blank cells?

My spreadsheet has a list of names (some are repeated) in column A, a list of numbers stored as a string in column B, and column C uses a formula to get the first number of the string in column B. In column E I have created a list of the unique names from column A, and in column F the number of times each appears in the data list. In column G I then want to fetch the corresponding number from column C each time the name appears in the list, to calculate average numbers.
I have tried this
=SUMPRODUCT(($A$1:INDEX($A:$A,COUNTA($A:$A))=$E4)*($C$2:INDEX($C:$C,COUNTA($C:$C))))/$F4
The problem I have is that some of the cells in column C are blank, so I am getting a #VALUE! error.
Here is a screenshot of what i am trying:
Is there any way to tell SUMPRODUCT to skip the rows where there is no number data?
Obviously this is just an example and my actual spreadsheet is a little more complicated, there are thousands of rows of data and the names are repeated many times over.
Empty cells are not your problem; a formula like yours accepts them. The problem is that, because of the gaps, COUNTA builds a range for column C that is not the same size as the one for column A: COUNTA on column A returns 15, whereas COUNTA on column C returns 11. SUMPRODUCT returns #VALUE! when its ranges are of unequal size.
In this specific case your issue is resolved through:
=SUMPRODUCT(($A$1:INDEX($A:$A,COUNTA($A:$A))=$E4)*($C$1:INDEX($C:$C,COUNTA($A:$A))))/$F4
In G4, copied down :
=SUMIF($A:$A,$E4,$C:$C)/$F4
Edit: SUMIF() accepts whole-column references and only evaluates them over the used range, so it avoids the need for a dynamic range.
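The calculation both answers arrive at (sum the numbers for a name, then divide by the number of times the name occurs) can be sketched in Python as a sanity check. This is an illustration only: the names and numbers are made-up sample data, `None` stands in for a blank cell in column C, and `sum_if` and `average_per_name` are hypothetical helper names.

```python
# Made-up sample data: column A names and column C numbers (None = blank cell)
names = ["Ann", "Bob", "Ann", "Cat", "Bob", "Ann"]
numbers = [4, None, 6, 3, 5, None]


def sum_if(keys, wanted, values):
    """Like SUMIF: add up values on rows whose key matches, skipping blanks."""
    return sum(v for k, v in zip(keys, values) if k == wanted and v is not None)


def average_per_name(keys, values):
    result = {}
    for name in dict.fromkeys(keys):      # unique names, in first-seen order (column E)
        occurrences = keys.count(name)    # how often the name appears (column F)
        result[name] = sum_if(keys, name, values) / occurrences  # column G
    return result


averages = average_per_name(names, numbers)
print(averages["Bob"], averages["Cat"])  # 2.5 3.0
```

Note that, as in the formulas above, the divisor is the number of times the name appears, so rows with a blank number still count toward the average as zero.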

Excel get value from another sheet based on multiple conditions

I have two sheets in Excel, Sheet1 and Sheet2.
They both contain 3 columns A, B and C.
My goal is to get values from C in Sheet2 to C in Sheet1, based on conditions comparing the values in both A and B at the same time.
A in Sheet2 contains numbers grouped together, for example 11,11,13,13,12,12. A in Sheet1 contains some of those numbers, but not necessarily in the same order or the same number of rows, for example 11,11,12,13,13.
B in Sheet2 also contains numbers like 2,1,1,2,1,2. B in Sheet1 again contains part of those numbers. For example, 1,2,1,1,2.
There are only unique combinations of pairs in A and B (in that specific order) for Sheet1 and Sheet2 respectively.
C in Sheet2 consists of numbers connected to the specific combination of numbers in A and B.
Now, I want to fill C in Sheet1 based on the values from C in Sheet2. For example for C1: Get the value (row x) in 'Sheet2'!Cx, so that 'Sheet1'!A1='Sheet2'!Ax, AND 'Sheet1'!B1='Sheet2'!Bx (which would be the 2nd row in this example).
I was thinking about something like
C1=INDEX('Sheet2'!C:C;...)
where
...=IF(AND(MATCH(A1;'Sheet2'!A:A;0);MATCH(B1;'Sheet2'!B:B;0));?;?)
?= I don't know what I would write here, but I would want the return value of IF be the row number where both conditions are true.
The problem is that MATCH only returns the first number in A and B respectively for which the condition is true, while I have several non-unique numbers in A. I would want to look through the whole 'Sheet2'!A:A and get all the matching values, and then look through the corresponding 'Sheet2'!B:B to check the second condition.
Or there might be a completely different take on this problem. Does anyone have a suggestion on how to solve this?
Here is a way to look at multiple values in a MATCH() function, example:
Sheet1:
Sheet2:
Formula in C2 sheet1:
{=IFERROR(INDEX(Sheet2!$C$2:$C$6,MATCH(Sheet1!A2&Sheet1!B2,Sheet2!$A$2:$A$6&Sheet2!$B$2:$B$6,0)),"")}
Note: It's an array formula, so enter it with Ctrl+Shift+Enter.
Result:
C1 Formula =INDEX(Sheet2!C:C;MATCH(A1;Sheet2!A:A;0);MATCH(B1;Sheet2!B:B;0))
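The goal of matching on both columns at once can be sketched in Python with a dictionary keyed by the (A, B) pair. This is an illustration only: the A and B values are the examples from the question, the column C values are made up, and `lookup_by_pair` is a hypothetical helper name.

```python
# Sheet2: columns A and B from the question, column C values are made up
sheet2_a = [11, 11, 13, 13, 12, 12]
sheet2_b = [2, 1, 1, 2, 1, 2]
sheet2_c = [100, 200, 300, 400, 500, 600]

# Sheet1: columns A and B from the question
sheet1_a = [11, 11, 12, 13, 13]
sheet1_b = [1, 2, 1, 1, 2]


def lookup_by_pair(keys_a, keys_b, values, want_a, want_b):
    """Return, for each wanted (a, b) pair, the value on the row where both match."""
    table = {(a, b): v for a, b, v in zip(keys_a, keys_b, values)}
    # "" mirrors the IFERROR fallback when a pair is not present
    return [table.get((a, b), "") for a, b in zip(want_a, want_b)]


sheet1_c = lookup_by_pair(sheet2_a, sheet2_b, sheet2_c, sheet1_a, sheet1_b)
print(sheet1_c)  # [200, 100, 500, 300, 400]
```

The first result comes from the 2nd row of Sheet2, as described in the question. Keying on a tuple keeps the two values separate, whereas plain concatenation cannot tell 1 and 12 apart from 11 and 2, since both give "112".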

Sorting and finding duplicates in excel columns

I have five columns in my spreadsheet, three of which are filled with assorted names (the first, fourth and fifth columns).
I need a way to cross-reference each cell in the A column with the D and E columns, then have an output that answers the question in the B and C column (which you can see as the Xs), as to whether it was found. I've tried a combination of VLOOKUP and MATCH, but this is proving to be out of my realm. I haven't used excel much lately.
EDIT: Added a picture instead of a diagram
In cell B3 use =IF(COUNTIFS(D:D,A:A),"","X")
and in C3 use =IF(COUNTIFS(E:E,A:A),"","X")
Copy down as far as required.
The formula says: "If the count of names in D:D (or E:E for column C) equal to the name in the current row of A:A is > 0, then return blank, else return "X"."
Test is case-insensitive.
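The same found/not-found test, including the case-insensitive comparison, can be sketched in Python. This is only an illustration with made-up names; `mark_missing` is a hypothetical helper that returns "" when a name is found and "X" when it is not, as the formulas above do.

```python
def mark_missing(names, lookup):
    """Return "" for names found in lookup (ignoring case), "X" otherwise."""
    lookup_folded = {n.casefold() for n in lookup}
    return ["" if n.casefold() in lookup_folded else "X" for n in names]


# Made-up sample data for columns A, D and E
col_a = ["Alice", "Bob", "Carol"]
col_d = ["alice", "Dave"]
col_e = ["CAROL", "Bob"]

col_b = mark_missing(col_a, col_d)  # cross-reference A against D
col_c = mark_missing(col_a, col_e)  # cross-reference A against E
print(col_b)  # ['', 'X', 'X']
print(col_c)  # ['X', '', '']
```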

Find the difference between data in 2 columns in Excel

I have 2 sets of data. I put it in Excel e.g. column A and column B. Now I want to know which data from B is part of column A. I run this formula =IF(COUNTIF($A$1:$A$327238,B1)>0,"Exist", "Nope")
Then I filter it and look only at 'Exist'. Based on that, I know that all data in B labelled 'Exist' is part of column A.
Now I want to know opposite i.e. which data from A are part of B. For that reason I use the same formula but I replace the data in columns i.e. data from B now in A and vice versa.
Then I randomly verify results.
For case 1 it looks it works fine but for second case it looks it's not accurate.
My question: should it work in case 2 as well (maybe I was just not accurate in some way), and should I expect it to work?
Thanks
In cell C1 (assuming your data starts in the 1st row) type =IF(A1=B1,"equal","no"), and then fill the same formula down to the last row that still has data, so that for row N the formula in column C is =IF(AN=BN,"equal","no"). After that you just need to count the cells with the value "no" to know the differences. Sorry if I didn't get the question correctly.
OK, assuming that the two sets of data are in columns A and B (they might be of different sizes), and that the last rows of data are L and M respectively, click on D1 and type the following: =IFNA(INDEX(B$1:B$5,MATCH(A1,B$1:B$5,0),1),"Unique"). Here column B is assumed to hold 5 rows; in general the range is B$1:B$M. Drag down to apply this formula to D1 through DL. That's it, you have the duplicate elements. Since the duplicate elements are the same in both columns A and B, you don't need to repeat this for column B. Note that for all the unique elements, the corresponding rows of column D contain the word "Unique", so if you want the unique elements, you can just get the elements from A in those rows:
Just select any column's first row cell and type the following formula: =IF(D1="Unique",INDEX(A$1:A$L,ROW(D1)),"Duplicate").
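The two-way check from the question (which items of B are in A, and which items of A are in B) can be sketched in Python to show that the same test works in both directions when each list is compared against the whole of the other. This is an illustration only, with made-up data; `label_membership` is a hypothetical helper that mirrors the COUNTIF(...)>0 test.

```python
def label_membership(source, other):
    """Label each item of source "Exist" if it occurs anywhere in other, else "Nope"."""
    other_set = set(other)  # must cover the whole other list
    return ["Exist" if item in other_set else "Nope" for item in source]


# Made-up sample data for columns A and B (different lengths on purpose)
col_a = ["a1", "a2", "shared1", "shared2", "a3"]
col_b = ["shared2", "b1", "shared1"]

b_in_a = label_membership(col_b, col_a)  # case 1: which items of B are in A
a_in_b = label_membership(col_a, col_b)  # case 2: which items of A are in B
print(b_in_a)  # ['Exist', 'Nope', 'Exist']
print(a_in_b)  # ['Nope', 'Nope', 'Exist', 'Exist', 'Nope']
```

The set of items labelled "Exist" is the same in both directions; only the leftovers differ.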

Resources