Logic evaluation problem in SUMPRODUCT Formula - excel

I am using a formula based on SUMPRODUCT, SUBTOTAL, and OFFSET to count only the visible rows that meet a criterion. I am trying it on simple sample data, as follows. The data is in the range B4:B12, with the header in B3:
B Column
HD
2
2
4
6
2
1
8
9
2
Formula is :
=SUMPRODUCT((B4:B12=B4)*(SUBTOTAL(103,OFFSET(B4,ROW(B4:B12)-MIN(ROW(B4:B12)),0))))
It gives the correct result: a count of 4 for the value 2.
I evaluated the formula step by step to fully understand its logic. I could follow most of it, but certain steps are not clear to me. I reproduce the evaluation steps below with my comments.
Step -1
=SUMPRODUCT(({2;2;4;6;2;1;8;9;2}=2)*(SUBTOTAL(103,OFFSET(B4,ROW(B4:B12)-MIN(ROW(B4:B12)),0))))
OK
Step -2
=SUMPRODUCT(({TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE})*(SUBTOTAL(103,OFFSET(B4,ROW(B4:B12)-MIN(ROW(B4:B12)),0))))
OK
STEP-3
=SUMPRODUCT(({TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE})*(SUBTOTAL(103,OFFSET(B4,ROW(B4:B12)-MIN(ROW(B4:B12)),0))))
OK
STEP-4
=SUMPRODUCT(({TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE})*(SUBTOTAL(103,OFFSET($B$4,{4;5;6;7;8;9;10;11;12}-MIN({4;5;6;7;8;9;10;11;12}),0))))
OK
STEP-5
=SUMPRODUCT(({TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE})*(SUBTOTAL(103,OFFSET($B$4,{4;5;6;7;8;9;10;11;12}-4,0))))
OK
STEP-6
=SUMPRODUCT({TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE}*(SUBTOTAL(103,OFFSET($B$4,{0;1;2;3;4;5;6;7;8},0))))
Why {0;1;2;3;4;5;6;7;8} ??
STEP-7
=SUMPRODUCT({TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE}*(SUBTOTAL(103,{#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;})))
Why {#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;} ??
STEP-8
=SUMPRODUCT({TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE}*({1;1;1;1;1;1;1;1;1}))
How 1 instead of #VALUE!
STEP-9
=SUMPRODUCT({1;1;0;0;1;0;0;0;1})
OK
Step -10
4
OK
I am not fully clear on the following points:
STEP-6 : Why {0;1;2;3;4;5;6;7;8}
STEP-7: Why {#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;}
STEP-8: How 1 instead of #VALUE!
I hope someone can help clarify the logic behind these steps. Please forgive me for asking about such a trivial matter.

STEP-6 : Why {0;1;2;3;4;5;6;7;8}
Because the {4;5;6;7;8;9;10;11;12}-4 evaluates to {4-4;5-4;6-4;7-4;8-4;9-4;10-4;11-4;12-4} which is {0;1;2;3;4;5;6;7;8}
STEP-7: Why {#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!}
The formula evaluator fails to get the values out of the 9 cell references returned by OFFSET($B$4,{0;1;2;3;4;5;6;7;8},0) = {$B$4;$B$5;$B$6;$B$7;$B$8;$B$9;$B$10;$B$11;$B$12} in array context. But that does not matter because:
STEP-8: How 1 instead of #VALUE!
SUBTOTAL(103,...) is a COUNTA subtotal which, for each of the 9 single-cell references obtained in step 7, counts 1 if the cell is not hidden (and not empty), else 0. So it does not matter whether the cell values were evaluated or not.
Btw.: The same can be achieved using
=SUMPRODUCT((B4:B12=B4)*(SUBTOTAL(103,INDIRECT("B"&ROW(B4:B12)))))
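Steps 6 to 10 can be modelled outside Excel. The following is a minimal Python sketch, not Excel itself: `subtotal_103`, `count_visible_matches`, and the `hidden` flags are hypothetical names introduced for illustration, and the sketch assumes SUBTOTAL(103,...) on a single cell simply yields 1 for a visible non-empty cell and 0 otherwise.

```python
# A Python model of the formula's logic; it does not call Excel.
values = [2, 2, 4, 6, 2, 1, 8, 9, 2]      # B4:B12 from the question
first_row = 4                              # row number of B4

def subtotal_103(value, row_hidden):
    """Model of SUBTOTAL(103, single_cell): COUNTA that skips hidden rows."""
    return 0 if row_hidden or value is None else 1

def count_visible_matches(values, hidden, criterion):
    rows = [first_row + i for i in range(len(values))]    # ROW(B4:B12) -> 4..12
    offsets = [r - min(rows) for r in rows]               # step 6 -> 0..8
    visible = [subtotal_103(values[o], hidden[o]) for o in offsets]  # step 8
    matches = [v == criterion for v in values]            # step 2
    return sum(int(m) * s for m, s in zip(matches, visible))  # steps 9 and 10

nothing_hidden = [False] * len(values)
print(count_visible_matches(values, nothing_hidden, 2))   # 4

row5_hidden = [False, True] + [False] * 7   # hypothetical: row 5 is filtered out
print(count_visible_matches(values, row5_hidden, 2))      # 3
```

The model never needs the cell values to build the `visible` list beyond an empty check, which is why the #VALUE! display in step 7 does not affect the count.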
Annotation:
Such formulas are the result of trial and error. I doubt the Excel programmers could have predicted every usage of the functions they implemented. Some usages of Excel functions in the wild are so far outside the box that their authors could not have foreseen them.
Bonus:
=SUMPRODUCT(OFFSET(B4,ROW(B4:B12)-MIN(ROW(B4:B12)),0))
results in 0 using your values in B4:B12.
Here too the formula evaluator fails to get the values out of the 9 cell references returned by OFFSET($B$4,{0;1;2;3;4;5;6;7;8},0) = {$B$4;$B$5;$B$6;$B$7;$B$8;$B$9;$B$10;$B$11;$B$12} in array context, and the result is {#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!}. But now it matters because we need the values.
In that case we can use the N function to force getting the values:
=SUMPRODUCT(N(OFFSET(B4,ROW(B4:B12)-MIN(ROW(B4:B12)),0)))
This results in 36, the sum of your values in B4:B12.

Related

How certain arrays and array functions are handled under the hood in Excel; specifically the dependence of array handling on the calling function

In trying to systematically enumerate the possibilities when rolling four identical but loaded four-sided dice, I came across some unusual excel behavior. Hoping someone can shed some light on what's going on under the hood.
The following table illustrates the possible rolls of a die:
1000 A
0100 B
0010 C
0001 D
each row is a possibility with a distinct probability. In excel, this information can be made to occupy a 4x4 cell area--that is, the letter labels above are merely for convenience.
In trying to display all possible combinations of four rolls of such a die, where the first combination might be A + A + A + A or 4000, the second might be B + A + A + A or 3100, and so on for each of the 4^4=256 possibilities, I decided that I wanted to systematically offset A by 0, 1, 2, or 3 rows for each of four rolls and then sum the results. In other words, each possible group of 4 rolls can be thought of as 4 copies of row A, each of which is offset by some number of rows between 0 and 3, for example {0;0;0;0} or {1;0;0;0} in the first and second cases enumerated directly above.
Oddly, though, I get the following. (all formulas are array formulas keyed in with shift+ctrl+enter).
=TRANSPOSE( SUM( OFFSET( A, 4x1ArrayOfRowOffsets, 0)))
displays the correct sum when entered into a 1x4 range. Likewise if =TRANSPOSE(...) is replaced by =INDEX(...,1,1). I take it because both functions natively support array arguments. However,
=SUM( OFFSET( A, 4x1ArrayOfRowOffsets, 0))
does not work--it seems that here the summation is conducted along the 4 rows returned by offset, each of which has value 1--it incorrectly displays only the value 1, even when evaluated in a multicell range as an array formula. Oddly,
=SUM( TRANSPOSE( OFFSET( A, 4x1ArrayOfRowOffsets, 0)))
does not work either--the transpose makes it so the summation is properly conducted along the columns returned by offset, but seems to throw out all but the first column.
Please note that, although the problem statement does not involve VBA, the lack of transparent array formula auditing in Excel proper (intermediate steps return #VALUE errors even when the final answer computes) likely means that, in order to investigate this problem, someone will have to write a bit of VBA that calls worksheet functions and manually outputs the intermediate calculations. This is why I posed a version of this question, here.
Interweaving INDEX calls anywhere but the outside/first function call does not fix the problem.
To try and see what is going on, I investigated further.
=INDEX( OFFSET( A, {w;x;y;z}, 0), 1, {1,2,3,4})
correctly displays the four rolls when entered into a 4x4 range. As before w, x, y, and z are integers between 0 and 3 indicating row offsets from "A" in the table, above. Furthermore,
=COLUMNS( OFFSET( A, {w;x;y;z}, 0))
returns the following when entered into a 5x5 range:
4 4 4 4 4
4 4 4 4 4
4 4 4 4 4
4 4 4 4 4
n/a n/a n/a n/a
All that is to say, calling SUM(OFFSET(---)) with array arguments seems to produce varied output depending on what is doing the calling--specifically, whether or not the caller is a function which natively accepts proper array arguments. Why is this? What is actually going on, here?

Excel sum based on matrix condition and multiple criteria

Following from the example here I'm trying to add additional conditions to a sum formula. I've represented an example below:
The output that I'm looking for for example for Jan 2017 is
2017
1
UP A 1
UP B 6
UP C 6
DOWN A 1
DOWN B 8
DOWN C 7
I tried with the following formula:
=MMULT(--($B$17:$C$17="X"),MATCH(1,($A23=$C$2:$C$14)*(C$21=$A$2:$A$14)*(C$22=$B$2:$B$14)*($E$2:$E$14=$D$2:$D$14),0))
but I get an #N/A value.
Does anyone know if it is possible to do this?
In your first example the number of rows in array1 and number of columns in array2 were equal, five. Here you have two columns and 13 rows. That they are unequal here is part (all) of the reason why you are having an issue.
Also, your MATCH function is returning a single value (a position), not an array.
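The dimension rule behind this can be sketched in Python. This is only an illustration of the rule that a matrix product needs the column count of the first array to equal the row count of the second; `mmult` and the sample arrays are hypothetical and are not the asker's data.

```python
def mmult(a, b):
    """Matrix product of two lists of rows, with an explicit shape check."""
    if len(a[0]) != len(b):
        raise ValueError(
            f"columns of array1 ({len(a[0])}) != rows of array2 ({len(b)})"
        )
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

column_13x1 = [[n] for n in range(1, 14)]     # 13 rows, 1 column

flags_1x2 = [[1, 0]]                          # 1 row, 2 columns: shapes do not fit
try:
    mmult(flags_1x2, column_13x1)
except ValueError as err:
    print("shape error:", err)

flags_1x13 = [[1 if n % 2 else 0 for n in range(1, 14)]]   # 1 row, 13 columns
print(mmult(flags_1x13, column_13x1))         # [[49]], the sum of the odd numbers 1..13
```

A 1x2 row of flags cannot be multiplied with a 13-row column; a 1x13 row of flags can, and then acts as a per-row filter on the column.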
I have a way to do this using matrix condition and multiple criteria but had to change problem up a bit, see photo for example:
{=MMULT(--(D18:P18="x"),E$2:E$14*(--(A$2:A$14=$C$21)*--(B$2:B$14=$C$22)*--(C$2:C$14=A24)))}
https://i.stack.imgur.com/FEvgR.png
You can create a formula to fill the second matrix with X's see below
=IF(OR(INDIRECT("D"&VALUE(D20))=$A$18,INDIRECT("D"&VALUE(D20))=$B$18),"X","")
https://i.stack.imgur.com/4rS4L.png
That being said, I don't think this is particularly efficient: you are treating one of the matrices as all 1's, so you are basically just adding an extra criterion (a Boolean) with added complexity. Still, you asked for this specifically, and I believe this delivers it.
Just add two SUMIFS together.
=SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 1)="x", $B$16))+
SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 2)="x", $C$16))

combining IF and AND statement not working in excel

I am trying to calculate the percentage, here the rules are as follows:
01. Employees working in IT department for more than 10 years will get 7% and rest of the IT guys will get 6.5%
02. And for rest of the departments, we have different percentages
Here H column represents various departments and F is working experience and in column I we're getting the main value from which we have to calculate the percentages.
Here's what I tried
=IF(AND(H5="IT",F5<10),I5*6.5%,I5*7%,IF(H5="PRODUCTION",I5*9%,IF(H5="MARKETING",I5*6%,IF(H5="LAW",I5*6%,IF(H5="HR",I5*9.36%)))))
This is showing You've entered too many arguments
Your first IF statement really had one argument too many.
=IF(AND(H5="IT",F5<10),I5*6.5%,IF(H5="PRODUCTION",I5*9%,IF(H5="MARKETING",I5*6%,IF(H5="LAW",I5*6%,IF(H5="HR",I5*9.36%,I5*7%)))))
Each new IF is nested as the FALSE branch of the previous one. Look at the end for that 7%.
Edit:
My bad (reading your comment made me realize): there are two errors in your formula.
One has been discussed; the second is how you nested the IT percentages.
=IF(AND(H5="IT",F5>=10),I5*7%,IF(AND(H5="IT",F5<10),I5*6.5%,IF(H5="PRODUCTION",I5*9%,IF(H5="MARKETING",I5*6%,IF(H5="LAW",I5*6%,IF(H5="HR",I5*9.36%,""))))))
In this version I added another IF statement. We could have avoided it by making one of the IT rates the ELSE result, as in my initial answer. Now every expected case has its own TRUE match, and an unexpected value returns an empty string. Maybe you want it to be 0 instead...
(1) IF(AND(...))= conditions
(2) I5 * 6.5% = value if true
(3) I5 * 7% = value if false
Therefore, you have already supplied the value if false, and another IF statement is not allowed as a fourth argument. I would remove (3) and put it at the very end of your formula. That way, when none of the previous conditions are met, the formula falls through to the 7% value.
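The branching of the corrected formula can be written out as ordinary conditionals. This is a Python sketch of the logic only; `bonus` and `OTHER_RATES` are hypothetical names, and it follows the formula variant that tests F5>=10 for the 7% rate and returns an empty string for an unknown department.

```python
# Rates for the non-IT departments, taken from the formula in the question.
OTHER_RATES = {"PRODUCTION": 0.09, "MARKETING": 0.06, "LAW": 0.06, "HR": 0.0936}

def bonus(department, years, base):
    """Mirror of the nested IFs: each test is reached only if the previous ones failed."""
    if department == "IT":
        rate = 0.07 if years >= 10 else 0.065
    elif department in OTHER_RATES:
        rate = OTHER_RATES[department]
    else:
        return ""            # same fallback as the "" at the end of the formula
    return base * rate

print(bonus("IT", 12, 1000))   # about 70
print(bonus("IT", 3, 1000))    # about 65
print(bonus("HR", 5, 1000))    # about 93.6
print(repr(bonus("SALES", 5, 1000)))
```

Each branch takes exactly one condition, one result, and one "otherwise", which is the three-argument shape that IF enforces.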

Ranking when there are duplicates

How can I return the ranking of each value in a row, even in the case of duplicates? Please see my example below.
While many questions have been answered regarding the handling of duplicate values in a ranking, I have come short in achieving a method that works for all of my cases.
EDIT: The previous picture above was a bad example that did not address my problem. Here is a new picture of the behavior.
In certain cases it skips to 7 when the rank should only be 1:6. In other cases it seems to work, and then not work in similar cases. Data is:
2.61879723030607 2.3428 2.61879723030607 2.4571 2.7324 2.1790
2.97203355745108 2.5355 2.97203355745108 2.6721 3.0561 2.4136
2.4895 2.2781 2.6218 2.4369 2.6898 2.1361
2.32650000000000 2.2124 2.3453 2.32650000000000 2.3938 2.0283
2.34132608128450 2.1331 2.34132608128450 2.2800 2.5758 2.0446
2.58668483692925 2.1476 2.58668483692925 2.3019 2.5124 2.0135
2.2555 2.0884 2.3368 2.0980 2.3928 1.9787
2.32878217762168 2.1080 2.32878217762168 2.1250 2.5360 1.9807
2.50891263421977 2.2480 2.50891263421977 2.4239 2.9070 2.2638
2.97755287506272 2.4457 2.97755287506272 2.6830 3.0566 2.3987
3.0850 2.5380 5.3880 2.8304 3.1579 2.5030
3.0120 2.3815 3.0639 2.6762 3.0831 2.4253
2.49235468138485 2.1436 2.49235468138485 2.3159 2.5542 1.9991
2.13109025589563 2.1060 2.13109025589563 2.1555 2.3225 1.9787
2.24900295032614 2.0332 2.24900295032614 2.1780 2.5084 2.0043
2.4010 2.0438 2.5857 2.2126 2.4511 2.0329
EDIT2: Implementing RANK instead of RANK.EQ showing no difference:
I think you've got an error in your setup. My understanding is each row is meant to be a separate independent case, however your formula for calculating rank has fixed row and column references, when it should have only fixed column references. Right now, the rank for every value is being found based on the first row in your data. Instead of:
=RANK.EQ(B4,$B$4:$G$4,1)
It should be:
=RANK.EQ(B4,$B4:$G4,1)
This then alters your results in the 2nd and 3rd blocks and you should get the desired result in the 3rd block.
With the formula below in cells B2:B4 you can extract the unique numbers in column A.
Please note that this is an array formula, so once you have entered it you have to select it and press CTRL + SHIFT + ENTER. Hope this solves your problem. More details regarding this formula can be found here: https://exceljet.net/formula/extract-unique-items-from-a-list
Column A Column B
1
1 1 = {=INDEX($A$1:$A$5000,MATCH(0,COUNTIF($B$1:B1,$A$1:$A$5000),0))}
1 2 = {=INDEX($A$1:$A$5000,MATCH(0,COUNTIF($B$1:B2,$A$1:$A$5000),0))}
1 6 = {=INDEX($A$1:$A$5000,MATCH(0,COUNTIF($B$1:B3,$A$1:$A$5000),0))}
1
1
1
1
1
1
1
2
1
6
6
6
6
6
6
6
6
6
6
6
6
6
Try RANK instead of RANK.EQ as below, though I am not sure whether this will work, as I am testing on Excel 2007.
Enter the following formula in Cell H1
=RANK(A1,$A1:$F1,1)+COUNTIF($A1:A1,A1)-1
Copy/Drag the formula down and across (to right) as required. See image for reference.
As per Microsoft Documentation on RANK.EQ function here
RANK.EQ gives duplicate numbers the same rank. However, the presence of duplicate numbers affects the ranks of subsequent numbers. For example, in a list of integers sorted in ascending order, if the number 10 appears twice and has a rank of 5, then 11 would have a rank of 7 (no number would have a rank of 6)
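The tie behaviour quoted above, and the RANK + COUNTIF - 1 adjustment from the earlier answer, can be checked with a small Python model. This is a sketch under the assumption that ascending RANK.EQ equals one plus the number of strictly smaller values; `rank_eq` and `rank_unique` are hypothetical helper names, and the sample row is the first data row from the question.

```python
def rank_eq(value, values):
    """Ascending rank where ties share the lowest rank, like RANK.EQ(value, range, 1)."""
    return 1 + sum(1 for v in values if v < value)

def rank_unique(values):
    """RANK + COUNTIF($A1:A1, A1) - 1: later duplicates are pushed down by one each."""
    return [rank_eq(v, values) + values[:i + 1].count(v) - 1
            for i, v in enumerate(values)]

row = [2.61879723030607, 2.3428, 2.61879723030607, 2.4571, 2.7324, 2.1790]

print([rank_eq(v, row) for v in row])   # [4, 2, 4, 3, 6, 1]  (rank 5 is skipped)
print(rank_unique(row))                 # [4, 2, 5, 3, 6, 1]
```

With the plain rank the two equal values share rank 4 and rank 5 never appears; the COUNTIF term breaks the tie by position so that every rank from 1 to 6 is used once.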

MIN array function non zeros only

I have been trying to get this array function to output (non-zero) minimum values in the 'FINAL DATA' AE column. Can you see a structural error in this formula?
=IF($C$4="All EMEA",
MIN(IF('FINAL DATA'!$2:$AE$250000<>0,
('FINAL DATA'!$J$2:$J$250000=$C$4)*('FINAL DATA'!$E$2:$E$250000=$E$4)*( 'FINAL DATA'!$AE$2:$AE$250000))),
MIN(IF('FINAL DATA'!$AE$2:$AE$250000<>0,
('FINAL DATA'!$K$2:$K$250000=$C$4)*('FINAL DATA'!$E$2:$E$250000=$E$4)*( 'FINAL DATA'!$AE$2:$AE$250000)))
)
By using <>0 that will eliminate zeroes and blanks, so that isn't the problem.....[although if you only want to eliminate blanks and have zero as a valid return value you should use <>""]
You can't multiply the conditions with the number range because by multiplying you get zeroes for any rows where the conditions are not satisfied, use multiple IFs instead, like this:
=MIN(IF('FINAL DATA'!$AE$2:$AE$250000<>0,IF('FINAL DATA'!$J$2:$J$250000=$C$4,IF('FINAL DATA'!$E$2:$E$250000=$E$4,'FINAL DATA'!$AE$2:$AE$250000))))
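Why multiplying fails while nesting works can be seen in a small Python model. This is a sketch with hypothetical sample rows (region, category, amount), not the asker's data; the list comprehension filters stand in for the array IFs, and rows that fail an IF are simply left out, the way MIN ignores the FALSE values an array IF returns.

```python
# Hypothetical rows: (region, category, amount)
rows = [
    ("EMEA", "X", 5),
    ("APAC", "X", 2),   # wrong region
    ("EMEA", "X", 0),   # zero amount, should be ignored
    ("EMEA", "Y", 1),   # wrong category
    ("EMEA", "X", 3),
]

# Multiplying conditions by the amounts: failing rows become 0 and win the MIN.
multiplied = [(r == "EMEA") * (c == "X") * a for r, c, a in rows if a != 0]
print(multiplied, min(multiplied))   # [5, 0, 0, 3] 0

# Nested IFs: failing rows drop out entirely, so only qualifying amounts remain.
nested = [a for r, c, a in rows if a != 0 if r == "EMEA" if c == "X"]
print(nested, min(nested))           # [5, 3] 3
```

The multiplied version returns 0 even though every zero amount was excluded, because the zeros now come from the failed conditions rather than from the data.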
Second line, you have !$2, no column specified.
MIN(IF('FINAL DATA'!$2:$AE$250000<>0,
Also, it looks like you are trying to run a single If comparison against a range, which I don't think will work the way you are trying to use it.
Barry has identified the core problem (tests returning 0 to the MIN function).
Here's a refactor of your formula (still an array formula) that solves this, and is quite a bit shorter
=MIN(IF(($S:$S<>0)*($E:$E=$E$4)*(IF($C$4="All EMEA",$J:$J,$K:$K)=$C$4),
($S:$S)))
Note that this (as would your original formula, when fixed) will return 0 if there are no qualifying values >0 in the ranges.
You can eliminate the zeros by using an IF() function in an array formula. Consider the following:
A
Row -----
1 0
2 7
3 5
4 6
5
6 3
The array formula =MIN(IF($A$1:$A$6>0,$A$1:$A$6)) will return 3 because the 0 and blank cell are eliminated with the >0 portion of the if statement.

Resources