Excel sum based on matrix condition and multiple criteria - excel

Following from the example here I'm trying to add additional conditions to a sum formula. I've represented an example below:
The output that I'm looking for, taking Jan 2017 as an example, is:
2017
1
UP A 1
UP B 6
UP C 6
DOWN A 1
DOWN B 8
DOWN C 7
I tried with the following formula:
=MMULT(--($B$17:$C$17="X"),MATCH(1,($A23=$C$2:$C$14)*(C$21=$A$2:$A$14)*(C$22=$B$2:$B$14)*($E$2:$E$14=$D$2:$D$14),0))
but I get a N/A value.
Does anyone know if it is possible to do this?

In your first example the number of columns in array1 and the number of rows in array2 were equal (five). Here array1 has two columns and array2 has 13 rows. MMULT requires those two numbers to match, so their being unequal here is part (or all) of the reason why you are having an issue.
Also, your MATCH function returns a single value, not an array.
I have a way to do this using matrix condition and multiple criteria but had to change problem up a bit, see photo for example:
{=MMULT(--(D18:P18="x"),E$2:E$14*(--(A$2:A$14=$C$21)*--(B$2:B$14=$C$22)*--(C$2:C$14=A24)))}
https://i.stack.imgur.com/FEvgR.png
You can create a formula to fill the second matrix with X's; see below:
=IF(OR(INDIRECT("D"&VALUE(D20))=$A$18,INDIRECT("D"&VALUE(D20))=$B$18),"X","")
https://i.stack.imgur.com/4rS4L.png
That being said, I don't think this is particularly efficient: you are treating one of the matrices as all 1's, so you are basically just adding an extra criterion (a Boolean) with added complexity. Still, you asked for this specifically, and I believe this delivers it.
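To make the dimension rule concrete, here is a minimal Python sketch of what the MMULT call above computes: a 1 x n row of 0/1 flags multiplied by an n x 1 column of values that have already been zeroed where the criteria fail. The sample data and the helper name `mmult` are hypothetical, not taken from the workbook in the question.

```python
# Hypothetical sample data: one flag per data row ("x" marks rows to include)
flags = ["x", "", "x", "x"]
values = [5, 7, 1, 2]
# Stand-in for the product of the (A=...)*(B=...)*(C=...) comparisons
criteria_met = [True, True, False, True]


def mmult(a, b):
    """Plain matrix product; like MMULT, needs columns of a == rows of b."""
    if len(a[0]) != len(b):
        raise ValueError("columns of a must equal rows of b")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


row = [[1 if f == "x" else 0 for f in flags]]               # 1 x 4
col = [[v * int(c)] for v, c in zip(values, criteria_met)]  # 4 x 1
result = mmult(row, col)[0][0]                              # 1*5 + 0*7 + 1*0 + 1*2
print(result)
```

With a 1 x 2 row and a 13 x 1 column, as in the original attempt, the shape check fails, which is the situation in which Excel reports an error.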

Just add two SUMIFS together.
=SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 1)="x", $B$16))+
SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 2)="x", $C$16))

Related

How certain arrays and array functions are handled under the hood in Excel; specifically the dependence of array handling on the calling function

In trying to systematically enumerate the possibilities when rolling four identical but loaded four-sided dice, I came across some unusual Excel behavior. I'm hoping someone can shed some light on what's going on under the hood.
The following table illustrates the possible rolls of a die:
1000 A
0100 B
0010 C
0001 D
Each row is a possibility with a distinct probability. In Excel, this information can be made to occupy a 4x4 cell area; that is, the letter labels above are merely for convenience.
In trying to display all possible combinations of four rolls of such a die (where the first combination might be A + A + A + A or 4000, the second might be B + A + A + A or 3100, and so on for each of the 4^4=256 possibilities), I decided that I wanted to systematically offset A by 0, 1, 2, or 3 rows for each of four rolls and then sum the results. In other words, each possible group of 4 rolls can be thought of as 4 copies of row A, each of which is offset by some number of rows between 0 and 3, for example {0;0;0;0} or {1;0;0;0} in the first and second cases enumerated directly above.
Oddly, though, I get the following. (all formulas are array formulas keyed in with shift+ctrl+enter).
=TRANSPOSE( SUM( OFFSET( A, 4x1ArrayOfRowOffsets, 0)))
displays the correct sum when entered into a 1x4 range. Likewise if =TRANSPOSE(...) is replaced by =INDEX(...,1,1). I take it because both functions natively support array arguments. However,
=SUM( OFFSET( A, 4x1ArrayOfRowOffsets, 0))
does not work--it seems that here the summation is conducted along the 4 rows returned by offset, each of which has value 1--it incorrectly displays only the value 1, even when evaluated in a multicell range as an array formula. Oddly,
=SUM( TRANSPOSE( OFFSET( A, 4x1ArrayOfRowOffsets, 0)))
does not work either--the transpose makes it so the summation is properly conducted along the columns returned by offset, but seems to throw out all but the first column.
Please note that, although the problem statement does not involve VBA, the lack of transparent array formula auditing in Excel proper (intermediate steps return #VALUE errors even when the final answer computes) likely means that, in order to investigate this problem, someone will have to write a bit of VBA that calls worksheet functions and manually outputs the intermediate calculations. This is why I posed a version of this question, here.
Interweaving INDEX calls anywhere but the outside/first function call does not fix the problem.
To try and see what is going on, I investigated further.
=INDEX( OFFSET( A, {w;x;y;z}, 0), 1, {1,2,3,4})
correctly displays the four rolls when entered into a 4x4 range. As before w, x, y, and z are integers between 0 and 3 indicating row offsets from "A" in the table, above. Furthermore,
=COLUMNS( OFFSET( A, {w;x;y;z}, 0))
returns the following when entered into a 5x5 range:
4 4 4 4 4
4 4 4 4 4
4 4 4 4 4
4 4 4 4 4
n/a n/a n/a n/a
All that is to say, calling SUM(OFFSET(---)) with array arguments seems to produce varied output depending on what is doing the calling--specifically, whether or not the caller is a function which natively accepts proper array arguments. Why is this? What is actually going on, here?

Find values occurring in multiple columns in excel

I have sets of gene probes that are upregulated when put under different chemical stresses. Each column contains all of the upregulated gene probes. I have 12 columns, how do I get a list of gene probes that appear in all 12 columns?
I've been able to find similarities between two columns using the formula
=IF(ISERROR(MATCH(A2,$C$2:$C$21473,0)),"",A2)
but can't work out how to adapt it to include 12 columns.
G.Ac G.As G.At G.Ac.At G.As.Ac G.As.At G.Cd G.Cu G.Ni
G.Cd.Cu G.Cd.Ni G.Ni.Cu
GENE:JGI_V11_3346220103 GENE:JGI_V11_2653050203 GENE:JGI_V11_3299790103
GENE:JGI_V11_359040103 GENE:JGI_V11_2228010103 GENE:JGI_V11_2662750203
GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303 GENE:JGI_V11_3119540303
GENE:JGI_V11_3134270203 GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303
GENE:JGI_V11_3164760203 GENE:JGI_V11_565470303 GENE:JGI_V11_2296170203
GENE:JGI_V11_2045300203 GENE:JGI_V11_2421620203 GENE:JGI_V11_2228010303
GENE:JGI_V11_2196580303 GENE:JGI_V11_3134270203 GENE:JGI_V11_3119540203
GENE:JGI_V11_1926920103 GENE:JGI_V11_1926920103 GENE:JGI_V11_1014720202
GENE:JGI_V11_478830203 GENE:JGI_V11_3168730303 GENE:JGI_V11_3311070202
GENE:JGI_V11_3216620102 GENE:JGI_V11_2653050303 GENE:JGI_V11_3300140202
GENE:JGI_V11_2653050303 GENE:JGI_V11_1159220202 GENE:JGI_V11_2024180303
GENE:JGI_V11_1926920303 GENE:JGI_V11_2196580303 GENE:JGI_V11_1159220202
GENE:JGI_V11_3164760303 GENE:JGI_V11_2228010203 GENE:JGI_V11_2341670203
GENE:JGI_V11_1938910303 GENE:JGI_V11_3026230203 GENE:JGI_V11_2449230203
GENE:JGI_V11_3134270303 GENE:JGI_V11_2235750203 GENE:JGI_V11_1981410203
GENE:JGI_V11_3251310202 GENE:JGI_V11_977750103 GENE:JGI_V11_954070203
GENE:JGI_V11_2267320203 GENE:JGI_V11_2268000303 GENE:JGI_V11_2226270101
GENE:JGI_V11_3003640303 GENE:JGI_V11_223520203 GENE:JGI_V11_2662750103
GENE:JGI_V11_2228010103 GENE:JGI_V11_3251310202 GENE:JGI_V11_3198630203
GENE:JGI_V11_3134270303 GENE:JGI_V11_1926920203 GENE:JGI_V11_287750103
GENE:JGI_V11_465160203 GENE:JGI_V11_2268000203 GENE:JGI_V11_2473230303
GENE:JGI_V11_3192220102 GENE:JGI_V11_3026230303 GENE:JGI_V11_3039310303
GENE:JGI_V11_1926920103 GENE:JGI_V11_1159220102 GENE:JGI_V11_3052790202
GENE:JGI_V11_3075830303 GENE:JGI_V11_2196580203 GENE:JGI_V11_3134280203
GENE:JGI_V11_3142970303 GENE:JGI_V11_503720303 GENE:JGI_V11_2236410103
GENE:JGI_V11_3042230103 GENE:JGI_V11_2228010203 GENE:JGI_V11_3028210101
GENE:JGI_V11_2105710303 GENE:JGI_V11_1926920303 GENE:JGI_V11_2131620103
GENE:JGI_V11_1002840203 GENE:JGI_V11_2088480203 GENE:JGI_V11_3196120102
Here's the first 8 rows of the 12 columns. There are 21473 rows in total.
Thanks
You could use an array formula like this to count how many columns a particular gene probe occurs in
=SUM(--(MMULT(TRANSPOSE(ROW(A$2:L$10000)^0),N(A$2:L$10000=A2))>0))
This is a standard way of getting column totals for a 2D array - in this case an array of true/false values corresponding to instances of an array element being equal/unequal to A2.
It is rather a brute force approach - it needs ~120K multiplications for each row. If you copy the formula down for ~10K rows, there is a delay of ~100 seconds on my computer while Excel works out the results.
It must be entered as an array formula using Ctrl+Shift+Enter.
In this dummy data C is the only value that occurs in all 12 columns.
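For comparison, the same idea (count the columns that contain a value, then keep the values found in every column) can be sketched outside Excel. This is a minimal Python illustration with hypothetical probe IDs and a hypothetical helper `column_count`; it is not part of the answer's formula.

```python
# Hypothetical data: each inner list is one column of gene probe IDs
columns = [
    ["g1", "g2", "g3"],
    ["g3", "g1", "g4"],
    ["g5", "g3", "g1"],
]


def column_count(value, cols):
    """Number of columns containing value; repeats within a column count once."""
    return sum(1 for col in cols if value in col)


# Probes that appear in every column: intersect the columns as sets
common = set(columns[0]).intersection(*columns[1:])
print(sorted(common))
print(column_count("g4", columns))
```

A probe belongs in the final list exactly when its column count equals the number of columns, which is the check the array formula performs per row.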

Ranking when there are duplicates

How can I return the ranking of each value in a row, even in the case of duplicates? Please see my example below.
While many questions have been answered regarding the handling of duplicate values in a ranking, I have come up short in finding a method that works for all of my cases.
EDIT: The previous picture above was a bad example that did not address my problem. Here is a new picture of the behavior.
In certain cases it skips to 7 when the rank should only be 1:6. In other cases it seems to work, and then not work in similar cases. Data is:
2.61879723030607 2.3428 2.61879723030607 2.4571 2.7324 2.1790
2.97203355745108 2.5355 2.97203355745108 2.6721 3.0561 2.4136
2.4895 2.2781 2.6218 2.4369 2.6898 2.1361
2.32650000000000 2.2124 2.3453 2.32650000000000 2.3938 2.0283
2.34132608128450 2.1331 2.34132608128450 2.2800 2.5758 2.0446
2.58668483692925 2.1476 2.58668483692925 2.3019 2.5124 2.0135
2.2555 2.0884 2.3368 2.0980 2.3928 1.9787
2.32878217762168 2.1080 2.32878217762168 2.1250 2.5360 1.9807
2.50891263421977 2.2480 2.50891263421977 2.4239 2.9070 2.2638
2.97755287506272 2.4457 2.97755287506272 2.6830 3.0566 2.3987
3.0850 2.5380 5.3880 2.8304 3.1579 2.5030
3.0120 2.3815 3.0639 2.6762 3.0831 2.4253
2.49235468138485 2.1436 2.49235468138485 2.3159 2.5542 1.9991
2.13109025589563 2.1060 2.13109025589563 2.1555 2.3225 1.9787
2.24900295032614 2.0332 2.24900295032614 2.1780 2.5084 2.0043
2.4010 2.0438 2.5857 2.2126 2.4511 2.0329
EDIT2: Implementing RANK instead of RANK.EQ showing no difference:
I think you've got an error in your setup. My understanding is each row is meant to be a separate independent case, however your formula for calculating rank has fixed row and column references, when it should have only fixed column references. Right now, the rank for every value is being found based on the first row in your data. Instead of:
=RANK.EQ(B4,$B$4:$G$4,1)
It should be:
=RANK.EQ(B4,$B4:$G4,1)
This then alters your results in the 2nd and 3rd blocks and you should get the desired result in the 3rd block.
With the formula below in cells B2:B4 you can filter the unique numbers in Column A.
Please note that this is an array formula, so once you enter it you have to select it and press CTRL + SHIFT + ENTER. Hope this solves your problem. More details regarding this formula can also be found here: https://exceljet.net/formula/extract-unique-items-from-a-list
Column A Column B
1
1 1 = {=INDEX($A$1:$A$5000,MATCH(0,COUNTIF($B$1:B1,$A$1:$A$5000),0))}
1 2 = {=INDEX($A$1:$A$5000,MATCH(0,COUNTIF($B$1:B2,$A$1:$A$5000),0))}
1 6 = {=INDEX($A$1:$A$5000,MATCH(0,COUNTIF($B$1:B3,$A$1:$A$5000),0))}
1
1
1
1
1
1
1
2
1
6
6
6
6
6
6
6
6
6
6
6
6
6
Try RANK instead of RANK.EQ as below. Though I am not sure whether this will work as I am testing on Excel 07.
Enter the following formula in Cell H1
=RANK(A1,$A1:$F1,1)+COUNTIF($A1:A1,A1)-1
Copy/Drag the formula down and across (to right) as required. See image for reference.
As per Microsoft Documentation on RANK.EQ function here
RANK.EQ gives duplicate numbers the same rank. However, the presence of duplicate numbers affects the ranks of subsequent numbers. For example, in a list of integers sorted in ascending order, if the number 10 appears twice and has a rank of 5, then 11 would have a rank of 7 (no number would have a rank of 6)
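The RANK + COUNTIF - 1 formula above can be read as "ascending rank, plus the number of earlier occurrences of the same value in the row". A minimal Python sketch of that logic follows, using the first data row from the question; the function name is hypothetical.

```python
def rank_with_ties_broken(row):
    """Ascending rank per value; duplicates get consecutive ranks, left to right."""
    ranks = []
    for i, v in enumerate(row):
        # Equivalent of RANK(v, row, 1): one plus the count of smaller values
        base = 1 + sum(1 for other in row if other < v)
        # Equivalent of COUNTIF($A1:A1, A1) - 1: earlier occurrences of v
        earlier_dupes = row[: i + 1].count(v) - 1
        ranks.append(base + earlier_dupes)
    return ranks


row = [2.61879723030607, 2.3428, 2.61879723030607, 2.4571, 2.7324, 2.1790]
print(rank_with_ties_broken(row))
```

Because each duplicate is pushed up by the number of copies to its left, the ranks in a row always cover 1 to n with no gaps.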

MIN array function non zeros only

I have been trying to get this array function to output (non-zero) minimum values in the 'FINAL DATA' AE column. Can you see a structural error in this formula?
=IF($C$4="All EMEA",
MIN(IF('FINAL DATA'!$2:$AE$250000<>0,
('FINAL DATA'!$J$2:$J$250000=$C$4)*('FINAL DATA'!$E$2:$E$250000=$E$4)*( 'FINAL DATA'!$AE$2:$AE$250000))),
MIN(IF('FINAL DATA'!$AE$2:$AE$250000<>0,
('FINAL DATA'!$K$2:$K$250000=$C$4)*('FINAL DATA'!$E$2:$E$250000=$E$4)*( 'FINAL DATA'!$AE$2:$AE$250000)))
)
Using <>0 will eliminate zeroes and blanks, so that isn't the problem (although if you only want to eliminate blanks and keep zero as a valid return value, you should use <>"").
You can't multiply the conditions with the number range, because multiplying gives zeroes for any rows where the conditions are not satisfied. Use multiple IFs instead, like this:
=MIN(IF('FINAL DATA'!$AE$2:$AE$250000<>0,IF('FINAL DATA'!$J$2:$J$250000=$C$4,IF('FINAL DATA'!$E$2:$E$250000=$E$4,'FINAL DATA'!$AE$2:$AE$250000))))
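A minimal Python sketch of why multiplication breaks MIN while nested IFs do not. The rows and column meanings are hypothetical stand-ins for the 'FINAL DATA' sheet.

```python
# Hypothetical rows: (region, category, amount)
rows = [
    ("EMEA", "a", 0),
    ("EMEA", "a", 7),
    ("APAC", "a", 3),
    ("EMEA", "b", 2),
    ("EMEA", "a", 5),
]

# Multiplying the conditions into the values turns each non-matching row into 0,
# so the minimum is 0 whenever at least one row fails a condition
multiplied = [(r == "EMEA") * (c == "a") * v for r, c, v in rows]
print(min(multiplied))

# Nested-IF equivalent: non-matching rows are dropped rather than zeroed
filtered = [v for r, c, v in rows if v != 0 and r == "EMEA" and c == "a"]
print(min(filtered))
```

In the nested-IF formula the rows that fail a test become FALSE, which MIN ignores; the list filter plays the same role here.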
Second line, you have !$2, no column specified.
MIN(IF('FINAL DATA'!$2:$AE$250000<>0,
Also, it looks like you are trying to run a single If comparison against a range, which I don't think will work the way you are trying to use it.
Barry has identified the core problem (tests returning 0 to the MIN function).
Here's a refactor of your formula (still an array formula) that solves this, and is quite a bit shorter
=MIN(IF(($S:$S<>0)*($E:$E=$E$4)*(IF($C$4="All EMEA",$J:$J,$K:$K)=$C$4),
($S:$S)))
Note that this (as would your original formula, when fixed) will return 0 if there are no qualifying values >0 in the ranges.
You can eliminate the zeros by using an IF() function in an array formula. Consider the following:
A
Row -----
1 0
2 7
3 5
4 6
5
6 3
The array formula =MIN(IF($A$1:$A$6>0,$A$1:$A$6)) will return 3 because the 0 and blank cell are eliminated with the >0 portion of the if statement.

Return a row number that matches multiple criteria in vbs excel

I need to be able to search my whole table for a row that matches multiple criteria. We use a program that outputs data in the form of a .csv file. It has header rows that separate sets of data. None of these header rows has a column that is unique in and of itself, but if I searched the table for multiple values I should be able to pinpoint each header row. I know I can use Application.WorksheetFunction.Match to return a row on a single criterion, but I need to search on two, three, or four criteria.
In pseudo-code it would be something like this:
Return row number where column A = bill & column B = Woods & column C = some other data
We need to work with arrays:
There are 2 kinds of arrays:
numeric {1,0,1,1,1,0,0,1}
boolean {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
to convert between them we can use:
MATCH function
MATCH(1,{1,0,1,1,1,0,0,1},0) -> will result {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
simple multiplication
{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}*{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE} -> will result {1,0,1,1,1,0,0,1}
You can check an array in the MATCH function by entering it as in the picture below. Be warned that the MATCH function WILL TREAT AN ARRAY AS AN "OR" FUNCTION (one match will result in true), i.e.:
MATCH(1,{1,0,1,1,1,0,0,1},0)=TRUE
YOU MUST PRESS CTRL+SHIFT+ENTER FOR IT TO GIVE AN ARRAY BACK!
in the example below i show that i want to sum the hours of all the employees except the admin per case
we have 2 options, the long simple way, the complicated fast way:
long simple way
D2=SUMPRODUCT(C2:C9,(A2=A2:A9)*("admin"<>B2:B9)) <<- SUMPRODUCT makes a multiplication
basically A1={2,3,11,3,2,4,5,6}*{0,1,1,0,0,0,0,0} (IT MUST BE A NUMERIC ARRAY TO THE RIGHT IN SUMPRODUCT!!!)
ie: A1=2*0+3*1+11*1+3*0+2*0+4*0+5*0+6*0
This causes a problem, because if you drag the cell to autocomplete the rest of the cells, it will shift the lower and upper bounds of the ranges,
i.e.: D9=SUMPRODUCT(C9:C16,(A9=A9:A16)*("admin"<>B9:B16)), which is out of bounds.
The same happens if you have a table and want to view the results in a different order.
the fast complicated way
D3=SUMPRODUCT(INDIRECT("c2:c9"),(A3=INDIRECT("a2:a9"))*("admin"<>INDIRECT("b2:b9")))
It's the same, except that INDIRECT is used on the ranges that we do not want to be modified when autocompleting or reordering the table.
Be warned that INDIRECT sometimes gives a VOLATILE ERROR; I recommend not using it on a single cell, or using it only once in an array.
f* c* i cant post pictures :(
table is:
case employee hours totalHoursPerCaseWithoutAdmin
1 admin 2 14
1 him 3 14
1 her 11 14
2 him 3 5
2 her 2 5
3 you 4 10
3 admin 5 10
3 her 6 10
To check the arrays, click the Insert Function button (it looks like an fx), then double-click MATCH. If you then enter a value inside Lookup_array such as
A2=A2:A9 for our example, it will give {TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE}, because only the first 3 lines are from case=1.
Something like this?
Assuming that your data is in A1:C20
I am looking for "Bill" in A, "Woods" in B and "some other data" in C
Change as applicable
=IF(INDEX(A1:A20,MATCH("Bill",A1:A20,0),1)="Bill",IF(INDEX(B1:B20,MATCH("Woods",B1:B20,0),1)="Woods",IF(INDEX(C1:C20,MATCH("some other data",C1:C20,0),1)="some other data",MATCH("Bill",A1:A20,0),"Not Found")))
SNAPSHOT
I would use this array* formula (for three criteria):
=MATCH(1,((Range1=Criterion1)*(Range2=Criterion2)*(Range3=Criterion3)),0)
*commit with Ctrl+Shift+Enter
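The array formula above multiplies one TRUE/FALSE test per column and returns the position of the first row where every test passes. A minimal Python sketch of the same logic follows; the function name and sample rows are hypothetical. Note that Python's == is case-sensitive, whereas Excel's = comparison on text is not.

```python
def match_row(rows, criteria):
    """Return the 1-based index of the first row matching every criterion.

    criteria is a list of (column_index, wanted_value) pairs.
    Returns None when no row matches (Excel's MATCH would give #N/A).
    """
    for i, row in enumerate(rows, start=1):
        if all(row[col] == wanted for col, wanted in criteria):
            return i
    return None


# Hypothetical table: columns A, B, C
table = [
    ("bill", "Smith", "x"),
    ("bill", "Woods", "y"),
    ("bill", "Woods", "some other data"),
]
found = match_row(table, [(0, "bill"), (1, "Woods"), (2, "some other data")])
print(found)
```

Adding a fourth criterion is just one more (column, value) pair, which mirrors adding one more (RangeN=CriterionN) factor to the formula.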

Resources