Excel, Sumproduct, multiple conditions search in {} - excel

I have run into several posts on the internet (incl. Stack Overflow) with code like this:
=SUMPRODUCT((A1:A10="Marketing")*(B1:B10={"North","South"})*(C1:C10))
The search conditions are neatly put into {}. I have 28 such conditions to search for, so I'm looking for a way to make the formula easier to read. If I try it, I get #N/A.
Is there a trick I'm missing?
I'm aware that it can be written
(B1:B10="North") + (B1:B10="South")
but with 28 items it is going to be long.
Thank you in advance
EDIT1: (Disregard)
Tried Axel's suggestion
Simple example
     A    B    C    D
1         1    2    3
2    1    2    2    3
3    2    4    4    6
4    3    6    6    9
=SUMPRODUCT((A2:A4={2,3})*(B2:D4))
This evaluates to SUMPRODUCT(({1,2,3}={2,3})*(B2:D4)), and I still get #N/A for the last column when I continue stepping through the evaluation.
Same for
=SUMPRODUCT((A2:A4=A6:B6)*(B2:D4))
where A6:B6 is list of conditions
or
=SUMPRODUCT((A2:A4=testrange)*(B2:D4))
I'm trying to put all conditions within the formula as {"case1","case2",...} and so on, but I can't make it work.
Edit 2:
Ok, I see the difference now.
Initial formula is column by column by column
What I'm trying to solve
Column A- list of accounts, I need to find 28 of them
Row 1 - months (conditions varies)
Range B2:AA462 - values
I can write it all with (A2:A462="account1")+(A2:A462="account2")... up to 28 cases, but I'm asking whether there is a simpler way to write it.
Something like initial A2:A462={"North","South"}
Something like
=Sumproduct((A2:A462={"account1","account2",...})*(B1:AA1="June")*(B2:AA462))
Is there a way to write this?
EDIT 4:
Few weeks later inspired by Axel's inputs
=SUMPRODUCT(MMULT(--(A2:A7=G1:J1),ROW(1:4)/ROW(1:4))*(B1:E1=G4)*B2:E7)
Can be grown into
{=SUMPRODUCT(MMULT(--(A2:A7=TRANSPOSE(namedrange)),ROW(OFFSET(A1,0,0,COUNTA(namedrange)))/ROW(OFFSET(A1,0,0,COUNTA(namedrange))))*(B1:E1=G4)*(B2:E7))}
OK, the named range holds the conditions in a column, which is a more natural way to keep a list of conditions you want to filter for. The MMULT part is now flexible too: COUNTA counts the conditions, and the number of rows to multiply by adjusts accordingly.
The whole formula must be entered as an array formula.

{"North","South"} is the array literal for a row vector. That means it is as if "North" and "South" were placed in adjacent cells in one row. So if "North" is in E1 and "South" is in F1, then the formula could also be:
=SUMPRODUCT((A1:A13="Marketing")*(B1:B13=E1:F1)*C1:C13)
With more criteria it could be:
=SUMPRODUCT((A1:A13="Marketing")*(B1:B13=E1:H1)*C1:C13)
It is important that the criteria are in a row vector (one row, multiple columns), since B1:B13 is a column vector.
Answer to your Edit 2:
The approach:
=SUMPRODUCT(((A2:A462="account1")+(A2:A462="account2")+...+(A2:A462="account28"))*(B1:AA1="June")*B2:AA462)
, which will work, is different from (A2:A462={"account1","account2",...,"account28"}). The latter cannot work since it creates a matrix of 461 rows and 28 columns while the working one ((A2:A462="account1")+(A2:A462="account2")+...+(A2:A462="account28")) is only a vector of 461 rows in one column.
The equivalent could be:
=SUMPRODUCT(MMULT(--(A2:A462={"account1","account2",...,"account28"}),ROW(1:28)/ROW(1:28))*(B1:AA1="June")*B2:AA462)
and if "account1","account2",...,"account28" are in AC1:BD1 then also:
=SUMPRODUCT(MMULT(--(A2:A462=AC1:BD1),ROW(1:28)/ROW(1:28))*(B1:AA1="June")*B2:AA462)
What is this doing? It uses MMULT to transform the matrix of 461 rows and 28 columns into a vector of 461 rows by multiplying the matrix by a column vector of 28 rows filled with 1s.
So if there is a 1 in one of the 28 columns in each row of the matrix, then there will also be a 1 as the row value of the resulting vector of 461 rows.
Example:
Formula in H3:
=SUMPRODUCT(((A2:A7=G1)+(A2:A7=H1)+(A2:A7=I1)+(A2:A7=J1))*(B1:E1=G3)*B2:E7)
Formula in H4:
=SUMPRODUCT(MMULT(--(A2:A7=G1:J1),ROW(1:4)/ROW(1:4))*(B1:E1=G4)*B2:E7)
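To make the MMULT step easier to follow, here is a minimal Python sketch of the same logic (it is not Excel code). The sample accounts, months and values are made up for illustration, and `match_matrix`, `mmult_by_ones` and `conditional_sum` are hypothetical helper names.

```python
# Hypothetical sample data standing in for A2:A7, B1:E1 and B2:E7.
accounts = ["a1", "a2", "a3", "a4", "a5", "a6"]
months = ["May", "June", "July", "August"]
values = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
    [17, 18, 19, 20],
    [21, 22, 23, 24],
]
wanted = ["a2", "a3", "a5", "zz"]  # criteria row, like G1:J1


def match_matrix(column, criteria):
    """Like --(A2:A7=G1:J1): one row per account, one column per criterion."""
    return [[1 if cell == c else 0 for c in criteria] for cell in column]


def mmult_by_ones(matrix):
    """Like MMULT(matrix, column of ones): collapses each row to its sum."""
    return [sum(row) for row in matrix]


def conditional_sum(column, criteria, header, month, table):
    """Multiply row flags, column flags and values, then add everything up."""
    row_flags = mmult_by_ones(match_matrix(column, criteria))
    col_flags = [1 if h == month else 0 for h in header]
    return sum(
        row_flags[r] * col_flags[c] * table[r][c]
        for r in range(len(table))
        for c in range(len(header))
    )


row_flags = mmult_by_ones(match_matrix(accounts, wanted))
total = conditional_sum(accounts, wanted, months, "June", values)
print(row_flags, total)
```

Note that if the same account appeared twice in the criteria list, its row flag would be 2 and the row would be counted twice, so the criteria should be unique.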
To be complete, there also would be an approach using SUMIF inside SUMPRODUCT which would be the better approach in my opinion.
So the Formula in H4 would be:
=SUMPRODUCT(SUMIF(A2:A7,G1:J1,INDEX(B2:E7,0,MATCH(G5,B1:E1,0))))
Your formula would be:
=SUMPRODUCT(SUMIF(A2:A462,AC1:BD1,INDEX(B2:AA462,0,MATCH("June",B1:AA1,0))))
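The SUMIF variant can be read as "pick the one month column first, then add up one SUMIF per account". A rough Python sketch of that reading, with made-up data and hypothetical helper names (`sumif`, `sum_for_accounts`):

```python
# Hypothetical sample data: accounts in one column, months in the header row.
accounts = ["a1", "a2", "a3", "a4", "a5", "a6"]
months = ["May", "June", "July", "August"]
values = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
    [17, 18, 19, 20],
    [21, 22, 23, 24],
]
wanted = ["a2", "a3", "a5", "zz"]


def sumif(criteria_range, criterion, sum_range):
    """Sum the entries of sum_range whose key equals the criterion."""
    return sum(v for k, v in zip(criteria_range, sum_range) if k == criterion)


def sum_for_accounts(column, criteria, header, month, table):
    col = header.index(month)             # plays the role of MATCH(month, header, 0)
    picked = [row[col] for row in table]  # plays the role of INDEX(table, 0, col)
    # One SUMIF per criterion; the outer sum plays the role of SUMPRODUCT.
    return sum(sumif(column, c, picked) for c in criteria)


june_total = sum_for_accounts(accounts, wanted, months, "June", values)
print(june_total)
```

Because only one column is ever compared against the criteria, no 461 x 28 intermediate matrix is needed, which is presumably why it is the lighter approach.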

Related

Create a formula to find the count in Excel column

I am not very used to complex Excel formulas.
Here is the request:
A column has a few numbers (negative and positive). I have to count them based on intervals. These intervals are not decided in advance.
See the screenshot.
Values in the Label column can change at run time. A is less than -15, B is between -15 and 6, and so on. I have to create a formula that puts a count in the Count column.
Please guide me.
Thank you
Regards
You will save yourself a lot of maintenance headaches by re-formatting the "Labels" Table:
      D          E                           F                        G
6                Greater Than or Equal To    Less Than or Equal To    Count
7     Label A    -99                         -16                      (formula below goes here)
8     Label B    -15                         -6
9     Label C    -5                          5
10    Label D    6                           15
The formula to place in the first count cell (G7 in my example) is:
=COUNTIFS(A$1:A$19,">="&E7,A$1:A$19,"<="&F7)
Then fill it down the length of the table (4 rows in this example). Be mindful of the $ signs that lock the rows of your values column.
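For readers who want to check the counts outside Excel, the COUNTIFS above amounts to an inclusive range test per label. A minimal Python sketch, where the sample values are invented and `count_between` is a hypothetical helper:

```python
# Hypothetical values standing in for the numbers in column A.
values = [-20, -15, -10, -6, -5, 0, 5, 6, 15, 30]

# Lower and upper bounds per label, as in columns E and F of the table.
intervals = {
    "Label A": (-99, -16),
    "Label B": (-15, -6),
    "Label C": (-5, 5),
    "Label D": (6, 15),
}


def count_between(data, low, high):
    """Same test as COUNTIFS(range, ">="&low, range, "<="&high)."""
    return sum(1 for v in data if low <= v <= high)


counts = {label: count_between(values, low, high)
          for label, (low, high) in intervals.items()}
print(counts)
```

Both bounds are inclusive, so adjacent intervals must not share an endpoint, otherwise a value sitting on the boundary is counted in two labels.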
I would strongly advise following the approach outlined by @Max, as it is easier to maintain and less error-prone. However, as you stated that you are looking for a solution that takes your format into account, I came up with this.
Note:
I split up the complex formula in different pieces to make it easier to explain. You can paste the different pieces together in a single cell per row if you like.
Step 1: Getting rid of unnecessary descriptions
I use a combination of 'SUBSTITUTE', 'MID' and 'FIND' to extract our operators and values from your labels: F3: =SUBSTITUTE(MID(D3,FIND("(",D3)+1,FIND(")",D3)-FIND("(",D3)-1),"interval ","").
The cell D3 contains your label, e.g. "A (interval <-15)".
Step 2: Getting the lower threshold
I check whether the label contains two thresholds or not by looking for the "/" character. Next, I handle both situations.
G3: =IF(ISERR(FIND("/",F3)),MID(F3,1,LEN(F3)),">="&MID(F3,1,FIND("/",F3)-1))
The cell F3 contains the result of Step 1.
Step 3: Getting the upper threshold
Similar to Step 2, except for the operators.
H3: =IF(ISERR(FIND("/",F3)),MID(F3,1,LEN(F3)),"<="&MID(F3,FIND("/",F3)+1,LEN(F3)-FIND("/",F3)))
Step 4: Counting
I use MID to include the single pieces of Step 2 and Step 3 as text into the formula. If you paste everything together, the use of it will not be necessary.
I3: =COUNTIFS($A$1:$A$19,MID(G3,1,LEN(G3)),$A$1:$A$19,MID(H3,1,LEN(H3)))
Update: Here's the screenshot.
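The parsing steps can also be sketched in Python to check the logic. This assumes the labels look like "A (interval <-15)" or "B (interval -15/-6)"; the second form is a guess, since the actual labels are only visible in the screenshot. `parse_label`, `matches` and `count_for_label` are hypothetical names.

```python
import operator

# Comparison operators that may prefix a criterion string such as "<-15".
OPS = {">=": operator.ge, "<=": operator.le, "<": operator.lt, ">": operator.gt}


def parse_label(label):
    """Return (lower criterion, upper criterion) as COUNTIFS-style strings."""
    inner = label[label.index("(") + 1:label.index(")")].replace("interval ", "")
    if "/" in inner:
        low, high = inner.split("/")
        return ">=" + low, "<=" + high
    # Single threshold: the same criterion is used twice.
    return inner, inner


def matches(value, criterion):
    """Evaluate one criterion string against a number."""
    for symbol in (">=", "<=", "<", ">"):  # two-character operators first
        if criterion.startswith(symbol):
            return OPS[symbol](value, float(criterion[len(symbol):]))
    return value == float(criterion)


def count_for_label(data, label):
    lower, upper = parse_label(label)
    return sum(1 for v in data if matches(v, lower) and matches(v, upper))


sample = [-20, -15, -10, -6, -5, 0]  # invented values
print(parse_label("B (interval -15/-6)"), count_for_label(sample, "B (interval -15/-6)"))
```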

Sum row based on criteria across multiple columns

I have googled for hours, not being able to find a solution to what I need/want. I have an Excel sheet where I want to sum the values in one column based on the criteria that either one of two columns should have a specific value in it. For instance
     A     B     C
1    4     20    7
2    5     100   3
3    100   21    4
4    15    21    4
5    21    24    8
I want to sum the values in C given that at least one of A and B contains a value of less than or equal to 20. Let us assume that A1:A5 is named A, B1:B5 is named B, and C1:C5 is named C (for simplicity). I have tried:
{=SUMPRODUCT(C,((A<=20)+(B<=20)))}
which gives me the rows where both columns match summed twice, and
{=SUMPRODUCT(C,((A<=20)*(B<=20)))}
which gives me only the rows where both columns match
So far, I have settled for the solution of adding a column D with the lowest value of A and B, but it bugs me so much that I can't do it with formulas.
Any help would be highly appreciated, so thanks in advance. All I have found when googling is the "multiple criteria for same column" problem.
Thanks. That works. I found another one that works, after I figured out that Excel does not treat 1 + 1 as 1, as I learnt in discrete mathematics, but, as you say, counts both of the TRUEs. I tried instead with:
{=SUM(IF((A<=20)+(B<=20);C;0))}
But I like yours better.
The reason it is "summing twice" in this formula
{=SUMPRODUCT(C,((A<=20)+(B<=20)))}
is that addition turns the first TRUE plus the second TRUE into 2. It is not actually summing everything twice, because for any row where only one condition is met, that row is counted only once.
The solution is to transform either the 1 or the 2 into a 1, using an IF:
{=SUMPRODUCT(C,IF((A<=20)+(B<=20)>0,1,0))}
That way, each value in column C is counted at most once.
Following this site you could build up your SUMPRODUCT() formula like this:
=SUMPRODUCT(C,SIGN((A<=20)+(B<=20)))
So, instead of a nested IF(), you control your OR condition with the SIGN() function.
hth
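For anyone who wants to see the double counting in numbers, here is a small Python sketch using the sample data from the question. It only mirrors the arithmetic; `sign` is a hypothetical stand-in for Excel's SIGN().

```python
# Sample data from the question (columns A, B and C).
A = [4, 5, 100, 15, 21]
B = [20, 100, 21, 21, 24]
C = [7, 3, 4, 4, 8]


def sign(n):
    """Stand-in for Excel's SIGN(): -1, 0 or 1."""
    return (n > 0) - (n < 0)


# Plain addition: a row matching both conditions gets weight 2.
added = sum(c * ((a <= 20) + (b <= 20)) for a, b, c in zip(A, B, C))

# Multiplication: only rows matching both conditions survive (AND).
both = sum(c * ((a <= 20) * (b <= 20)) for a, b, c in zip(A, B, C))

# SIGN clamps the weight back to 0 or 1, which gives a true OR.
either = sum(c * sign((a <= 20) + (b <= 20)) for a, b, c in zip(A, B, C))

print(added, both, either)
```

Row 1 matches on both A and B, which is why the plain addition overshoots by exactly its C value.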
If you plan to use a large set of data then it is best to use the array formula:
{=SUM(IF((A1:A5<=20)+(B1:B5<=20),C1:C5,0))}
Obviously adjust the range to suit the data set, however if the whole of each column is to form part of the formula then you can simply adjust to:
{=SUM(IF((A:A<=20)+(B:B<=20),C:C,0))}
This will perform the calculation on all rows of data within the A, B and C columns. With either example remember to press Ctrl + Shift + Enter in order to trigger the array formula (as opposed to typing the { and }).

Median/average does not return the right values

Image for reference
I'm trying to achieve the following:
if(cell A1 is found in list 1), for each row in which it's found and if(C4:C10 > B4:B10), then median(the subtraction between C and B values, for every row that has text1).
I've tried two different formulas:
1 - {=MEDIAN(IF(AND((C4:C10>B4:B10);(B4:B10=A1));(C4:C10-B4:B10)))}
2 - {=MEDIAN((C4:C10>B4:B10)*(B4:B10=A1)*(C4:C10-B4:B10))}
For median it always returns 0 and for the average really small values that aren't accurate. I'm sure the median and the averages aren't correct.
What would the problem be?
Also, how would I use something like:
{=MEDIAN((C4:C10>B4:B10)*(B4:B10=A1)*(C4:C10-B4:B10))}
if one of the columns had text in some rows? (That isn't the case for the former problem, but it has arisen before.)
text1
list 1 list 2 list 3
text2 1 5
text4 2 4
text1 4 6
text4 1 6
text1 4 5
text4 2 4
text1 3 3
You can't use the AND function in this type of formula because AND returns a single result (TRUE or FALSE), not an array as required.
Your second formula is closer, but by multiplying all the conditions you get zeroes for every row where the conditions are not met, which skews the results.
You can use either one of these similar versions:
=MEDIAN(IF((C4:C10>B4:B10)*(A4:A10=A1);C4:C10-B4:B10))
=MEDIAN(IF(C4:C10>B4:B10;IF(A4:A10=A1;C4:C10-B4:B10)))
Both need to be confirmed with CTRL+SHIFT+ENTER.
To handle text in columns B or C (and to make the formula ignore those rows but work otherwise) you can add an extra IF function like this
=MEDIAN(IF(C4:C10>B4:B10;IF(A4:A10=A1;IF(ISNUMBER(C4:C10-B4:B10);C4:C10-B4:B10))))
All formulas will work equally well with AVERAGE function in place of MEDIAN
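The difference between multiplying the conditions and filtering with IF can be checked with a short Python sketch on the sample data from the question (column A holds the text, B and C the numbers, and the lookup value is "text1"):

```python
from statistics import median

# Sample rows from the question: (list 1, list 2, list 3).
rows = [
    ("text2", 1, 5),
    ("text4", 2, 4),
    ("text1", 4, 6),
    ("text4", 1, 6),
    ("text1", 4, 5),
    ("text4", 2, 4),
    ("text1", 3, 3),
]
key = "text1"

# Multiplying the conditions: non-matching rows become 0 and stay in the set.
multiplied = [(c > b) * (a == key) * (c - b) for a, b, c in rows]

# IF-style filtering: non-matching rows are dropped before taking the median.
filtered = [c - b for a, b, c in rows if a == key and c > b]

print(median(multiplied), median(filtered))
```

With the zeroes left in, most of the list is 0, so the median collapses to 0; with filtering only the two qualifying differences remain.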
Another way to get the MEDIAN while ignoring text is to use AGGREGATE function like this:
=AGGREGATE(17;6;(C4:C10-B4:B10)/(C4:C10>B4:B10)/(A4:A10=A1);2)
That doesn't need "array entry" but will only work in Excel 2010 or later versions. There's no simple equivalent for AVERAGE
17 denotes QUARTILE function - second quartile is the equivalent of median
See attached screenshot demonstrating the last two formulas with your sample data....and some added text
Supposing that the values in column C (list 3) are bigger than those in column B (list 2), you can use the following formula:
=MEDIAN(IF((A4:A10=A1)*(C4:C10>B4:B10);C4:C10-B4:B10))
This is an array formula, so press Ctrl+Shift+Enter to calculate it.
Tell me if it doesn't work.

Looking for formula to extract specific values from a row containing numbers and blanks

I have a sheet with rows of data, with many columns. I am looking for help on a formula that will extract the sum of the smallest 3 numbers in a row based on the last 5 values entered. Note that not all the rows will have values for each column, so the first value found on each row may be in a different column.
To determine the sum of the smallest 3 I am using the formula =SUM(SMALL(B3:R3,{1,2,3})). Unfortunately, that formula looks at the entire range. Again, I am looking for help with a formula that will select only the last 5 values posted.
Here is a simple example. The results for each line show the totals that should be determined. Again, it needs to look for the sum of the smallest 3 based on the last 5 posted (in the example below the range would be col. 1 through 10, with col. 10 having the latest postings).
Ex.
1     2     3     4     5     6     7     8     9     10    Result
31          44    51    36          44    34    36    38    106 (34+36+36)
35    31    44    40    38    52          42    37          115 (37+38+40)
Hope this is understandable. I am looking for a formula solution vs a VBA macro solution because of my users. Thanks for any help!!
Now that you clarified the question, I have an answer for you. This is fairly ugly but it gets the job done. You might want to hide the columns with the intermediate results - or you could get adventurous and "nest" the expressions. This makes it really hard to understand / debug though. If there's a smarter way to do this I am always open to learning.
Assuming you have the data in columns A through J, starting in row 2, put the following in cells L2:P2:
=MATCH(9999, A2:J2,1)
=MATCH(9999,OFFSET($A2,0,0, 1, L2-1)) ... copy this by dragging right to the next 2 columns
=MATCH(9999,OFFSET($A2,0,0, 1, M2-1))
=MATCH(9999,OFFSET($A2,0,0, 1, N2-1))
=MATCH(9999,OFFSET($A2,0,0, 1, O2-1))
The first line finds the last cell with data in it; the next ones find the last cell "not including the last cell", and so they work backwards. The result is a number corresponding to the columns with data. For your example, this gives
10 9 8 7 5
9 8 6 5 4
Now we want to find the sum of the smallest 3 of these: put the following equation in cell Q2:
=SUM(SMALL(INDIRECT("RC["&P2-17&"]:RC["&L2-17&"]",FALSE),{1,2,3}))
Working from the inside out:
RC["&P2-17&"] results in RC[-12], which is "the cell 12 to the left of this one". That is the first of the "last five cells with data", cell E2.
RC["&L2-17&"] results in RC[-7], the last cell with data in this row.
FALSE: use "RC" rather than "A1" indexing.
INDIRECT: turn the string into an address (in this case a range).
SMALL: find the 3 smallest values in this range.
SUM: add them together.
This formula did indeed give me 106, 115 for the example you provided.
I would hide columns L through P so you only see the result (and not the intermediate stuff).
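As a cross-check of the expected results, the rule "take the last five non-blank values, then sum the three smallest" can be written as a short Python sketch. The two rows are the ones from the question, with `None` marking a blank cell; `sum_smallest_of_last` is a hypothetical helper name.

```python
# The two example rows from the question; None marks a blank cell.
row1 = [31, None, 44, 51, 36, None, 44, 34, 36, 38]
row2 = [35, 31, 44, 40, 38, 52, None, 42, 37, None]


def sum_smallest_of_last(row, last=5, smallest=3):
    """Sum the `smallest` lowest values among the `last` non-blank entries."""
    filled = [v for v in row if v is not None]
    recent = filled[-last:]
    return sum(sorted(recent)[:smallest])


print(sum_smallest_of_last(row1), sum_smallest_of_last(row2))
```

The Excel version has to work out the column positions first because it cannot simply drop the blanks, which is what the MATCH/OFFSET helper columns are for.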

In Excel 2007, how can I SUMIFS indices of multiple columns from a named range?

I am analysing library statistics relating to loans made by particular user categories. The loan data forms the named range LoansToApril2013. Excel 2007 is quite happy for me to use an index range as the sum range in a SUMIF:
=SUMIF(INDEX(LoansToApril2013,0,3),10,INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6))
Here 10 indicates a specific user category, and this sums loans made to that group from three columns. By "index range" I'm referring to the
INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6)
sum_range value.
However, if I switch to using a SUMIFS to add further criteria, Excel returns a #VALUE error if an index range is used. It will only accept a single index.
=SUMIFS(INDEX(LoansToApril2013,0,4),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
works fine
=SUMIFS(INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
returns #value, and I'm not sure why.
Interestingly,
=SUMIFS(INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,4),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
is also accepted and returns the same as the first one with a single index.
I haven't been able to find any documentation or comments relating to this. Does anyone know if there is an alternative structure that would allow SUMIFS to conditionally sum index values from three columns? I'd rather not use three separate formulae and add them together, though it's possible.
The SUMIFS formula is modelled after an array formula, and the ranges compared in SUMIFS need to be the same size. The last one works because it mimics a single column of the LoansToApril2013 array: column 4:4 is just column 4.
The second to bottom one is 3 columns wide while the comparison columns are 1 column wide, which causes the error.
SUMIFS can't do that, but SUMPRODUCT can.
Example:
X 1 1 1
Y 2 2 2
Z 3 3 3
starting in A1
the formula =SUMPRODUCT((A1:A3="X")*B1:D3) gives the answer 3, and altering the value X in the formula to Y or Z changes the returned value to the appropriate sum of the lines.
Note that this will not work if you have text in the area - it will return #VALUE!
If you can't avoid the text, then you need an array formula. Using the same example, the formula would be =SUM(IF(A1:A3="X",B1:D3)), and to enter it as an array formula, you need to use CTRL+SHIFT+ENTER to enter the formula. You should notice that Excel puts { } around the formula. It treats any text as zero, so it will successfully add up the numbers it finds even if you have text in one of the boxes (e.g. change one of the 1's in the example to be blah and the total will be 2, because the formula adds the two remaining 1s in the line).
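The behaviour of the two formulas on the X/Y/Z example can be mimicked in Python. This is only a sketch of the idea (sum a multi-column block for the rows whose key matches, skipping text); `block_sum` is a hypothetical helper.

```python
# The example from the answer: a key in column A and three value columns B:D.
rows = [
    ("X", [1, 1, 1]),
    ("Y", [2, 2, 2]),
    ("Z", [3, 3, 3]),
]


def block_sum(data, key):
    """Sum every numeric cell in the value columns of rows whose key matches."""
    return sum(
        v
        for k, cells in data
        if k == key
        for v in cells
        if isinstance(v, (int, float))  # text cells are skipped, like SUM does
    )


# Same data with one cell replaced by text, as suggested in the answer.
rows_with_text = [("X", [1, "blah", 1]), ("Y", [2, 2, 2]), ("Z", [3, 3, 3])]

print(block_sum(rows, "X"), block_sum(rows, "Z"), block_sum(rows_with_text, "X"))
```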
The two answers above and a bit of searching allowed me to find a formula that worked. I'll put it here for posterity, because questions with no final outcome are a pain for future readers.
=SUMPRODUCT( (INDEX(LoansToApril2013,0,3)=C4) * (INDEX(LoansToApril2013,0,1)="PTFBL") * INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6))
This totals up values in columns 4-6 of the LoansToApril2013 range, where the value in column 3 equals the value in C4 (a.k.a. "the cell to the left of this one with the formula") AND the value in column 1 is "PTFBL".
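Despite appearances, the asterisks are not there to multiply the data together. I found an explanation on this page, but basically each bracketed criterion evaluates to TRUE/FALSE (1/0), so multiplying by it keeps a value only when every criterion is met; in effect the asterisks add criteria to the function. Note that criteria are enclosed in their own brackets, while the range isn't.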
If you want to use named ranges whose names are held in cells, you need to wrap them in INDIRECT inside the INDEX calls.
I used that formula to check for conditions in two columns, and then SUM the results in a table which has 12 columns for the months (the column is chosen by a helper cell which is 1 to 12 [L4]).
So you can do if:
Dept (1 column name range [C6]) = Sales [D6];
Region (1 column name range [C3]) = USA [D3];
SUM figures in the 12 column monthly named range table [E7] for that 1 single month [L4] for those people/products/line item
Just copy the formula across your report page which has columns 1-12 for the months and you get a monthly summary report with 2 conditions.
=SUMPRODUCT( (INDEX(INDIRECT($C$6),0,1)=$D$6) * (INDEX(INDIRECT($C$3),0,1)=$D$3) * INDEX(INDIRECT($E7),0,L$4))

Resources