I would like the average of column B based on two criteria: that the row is dated last year, and a text criterion from another column. In the example I have a Year column for test purposes, but I don't want to add it to all the data sheets.
=AVERAGEIFS(Table1[Unit], Table1[Date], "="&YEAR(TODAY())-1, Table1[Text], "Up")
throws a DIV/0 error.
I believe I need to define the Date range by year, like YEAR(Table1[Date]), but it doesn't work.
=AVERAGEIFS(Table1[Unit], (YEAR(Table1[Date]), "="&YEAR(TODAY())-1, Table1[Text], "Up")
I can get an IF statement to work on a single cell, but is there a way to get this to work on a column?
Scott
You can't use a formula when defining the criteria range, so you either have to use a helper column or something like this:
=AVERAGEIFS(Table1[Unit],Table1[Date],">="&(DATE(YEAR(TODAY())-1,1,1)),Table1[Date],"<="&(DATE(YEAR(TODAY())-1,12,31)),Table1[Text],"Up")
It checks that the date is on or before 2022/12/31 (DATE(YEAR(TODAY())-1,12,31)) and on or after 2022/01/01 (DATE(YEAR(TODAY())-1,1,1)).
Result ((3+7)/2=5):
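To check the date-bounds logic outside Excel, here is a minimal Python sketch of the same filter. The rows are hypothetical (chosen so that the two matching units are 3 and 7), and the function name and the `today` parameter are my own, not anything from Excel.

```python
from datetime import date

def average_last_year(rows, today):
    """Average the unit of rows dated in the previous calendar year with text "Up"."""
    start = date(today.year - 1, 1, 1)    # DATE(YEAR(TODAY())-1,1,1)
    end = date(today.year - 1, 12, 31)    # DATE(YEAR(TODAY())-1,12,31)
    matches = [unit for d, text, unit in rows
               if start <= d <= end and text == "Up"]
    if not matches:
        return None  # AVERAGEIFS would show #DIV/0! in this case
    return sum(matches) / len(matches)

# Hypothetical rows: (date, text, unit)
rows = [
    (date(2022, 3, 1), "Up", 3),
    (date(2022, 9, 15), "Up", 7),
    (date(2022, 6, 1), "Down", 9),
    (date(2023, 1, 5), "Up", 4),
]

print(average_last_year(rows, today=date(2023, 5, 1)))  # 5.0
```

Both bounds are inclusive, which is why the formula uses >= and <= rather than > and <.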
The SUMPRODUCT() function provides some really useful approaches to problems like this.
In your sample data, the formula =YEAR(Table1[Date])=2023 should return an array {FALSE,FALSE,TRUE,TRUE,FALSE,TRUE}.
In your sample data, the formula =Table1[Text]="Up" should return an array {TRUE,FALSE,FALSE,TRUE,FALSE,FALSE}.
SUMPRODUCT() allows us to do some interesting things with those:
If I apply a math operation to those arrays Excel automatically converts them to binary; {0,0,1,1,0,1} and {1,0,0,1,0,0} respectively. That math function can be doing something like multiplying. Or if I want to use them as-is in a function I can just use "--" to force a math operation, that makes them negative and back to positive. In our example we'll be multiplying the arrays so Excel will take care of it for us.
I can do a binary AND operation on the two arrays by using multiplication. Thus: = (YEAR(Table1[Date])=2023) * (Table1[Text]="Up") is actually {0,0,1,1,0,1} * {1,0,0,1,0,0} which in turn equals {0,0,0,1,0,0}. And the 1's in this result array represent the rows that meet both criteria.
=SUMPRODUCT((YEAR(Table1[Date])=2023) * (Table1[Text]="Up")) will equal the count of rows that met both criteria. Which is only 1 in your example.
=SUMPRODUCT((YEAR(Table1[Date])=2023) * (Table1[Text]="Up") * Table1[Unit]) is going to sum the result of array multiplication of {0,0,0,1,0,0} * {4,4,5,5,4,4}. In your sample data that results in 5.
So the formula =SUMPRODUCT((YEAR(Table1[Date])=2023) * (Table1[Text]="Up") * Table1[Unit]) / SUMPRODUCT((YEAR(Table1[Date])=2023) * (Table1[Text]="Up")) actually is "sum of rows that matched" divided by the "count of rows that matched".
Notice that a conditional array like Table1[Text]="Up" MUST be wrapped in its own parentheses before it can be added (OR function) or multiplied (AND function) with another array. Without them, Excel evaluates the arithmetic before the comparison.
You may want to wrap that entire formula in an IFERROR() function so you can display a friendlier message when the count is zero. For instance:
=IFERROR(SUMPRODUCT((YEAR(Table1[Date])=2023) * (Table1[Text]="Up") * Table1[Unit]) / SUMPRODUCT((YEAR(Table1[Date])=2023) * (Table1[Text]="Up")),"None")
You will want to fully debug the formula before nesting it in IFERROR() because the IFERROR() function will conceal other errors than just an occasional divide by zero.
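If it helps to see the array arithmetic spelled out, here is a minimal Python sketch of the same steps. The three lists are the arrays quoted above for your sample data; the variable names are my own.

```python
# Arrays quoted in the answer for the sample data
year_is_2023 = [False, False, True, True, False, True]
text_is_up = [True, False, False, True, False, False]
unit = [4, 4, 5, 5, 4, 4]

# Multiplying booleans acts as AND: 1 only where both criteria hold
both = [int(y) * int(t) for y, t in zip(year_is_2023, text_is_up)]

count = sum(both)                                # rows meeting both criteria
total = sum(b * u for b, u in zip(both, unit))   # sum of their units

# Guard the division explicitly rather than hiding every error
average = total / count if count else None

print(both)     # [0, 0, 0, 1, 0, 0]
print(count)    # 1
print(total)    # 5
print(average)  # 5.0
```

The explicit zero check plays the role of IFERROR here, but it only catches the empty case, so other mistakes still surface.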
This will all seem very cumbersome the first few times you use this approach but if you encounter these kinds of criteria problems often in excel, I promise that taking the time to understand SUMPRODUCT() on logical arrays will pay long-term dividends. Once understood it gives you a robust capability to use SUM, COUNT, and AVERAGE given multiple criteria, that can be any mix of AND and OR criteria.
Context: I'm mapping some excel sheets into web backend code.
Circular reference works well and super fast within excel, we currently do 1000 iterations in excel for each place of circular reference, and each recalc is practically instant. But when converted to backend code, it's not as fast, so I'm trying to collapse the circularity into formulae.
I was able to collapse some of the circular references, but here's a tricky one. It essentially boils down to this:
subtotal1 = parameter1 * total
subtotal2 = parameter2 * total
subtotal3 = parameter3 * total
total = subtotal1 + subtotal2 + subtotal3
Each subtotal depends on the total and vice versa.
If you do algebraic transformations you'll realize you can never extract the formula for any one argument, because there are over 2 layers of interconnectedness that cannot be unfurled.
Ideally I'd like to get it down to a formula like this:
sub1 = <a formula that calculates sub1 directly and does not include sub2, sub3 or total>
How can we collapse this kind of circular references into formulae and avoid doing 1000 iterations in the code?
If your values are in A1:C1 you could use:
=SUM(REDUCE(A1:C1,SEQUENCE(8),LAMBDA(x,y,SUM(x)*x)))
The number mentioned in the sequence is the number of iterations, but this will quickly result in a number too large for Excel.
I used 1, 2, 3 for this and above SEQUENCE(8) results in #NUM! (at least using the mobile app version of Excel).
What this formula does is start with the values in A1:C1, sum these values, and multiply the sum by each individual value, creating an array of 3 numbers (subtotal1-3). Then the last calculated array (x) is the new starting point for the same calculation, SUM(x) * x. This repeats until the sequence ends.
To make visible what it does you can use:
=REDUCE(A1:C1,SEQUENCE(8),LAMBDA(x,y,VSTACK(x,SUM(TAKE(x,-1))*TAKE(x,-1))))
Which will spill the arrays (starting at the start value, then the iterations, without showing the summed array value).
First mentioned formula does the same without stacking and it's wrapped in sum to get the total.
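As a rough cross-check for the backend side, the same iteration can be sketched in Python. This is only my reading of what the REDUCE formula does, with a hypothetical function name; Python integers do not overflow, so it keeps going where Excel would stop.

```python
def iterate(values, iterations):
    """Repeat x -> SUM(x) * x, keeping every intermediate array (like the VSTACK version)."""
    x = list(values)
    history = [x]
    for _ in range(iterations):
        s = sum(x)
        x = [s * v for v in x]  # each value multiplied by the current total
        history.append(x)
    return history

history = iterate([1, 2, 3], 3)
for step in history:
    print(step)
# [1, 2, 3]
# [6, 12, 18]
# [216, 432, 648]
# [279936, 559872, 839808]

print(sum(history[-1]))  # 1679616
```

The total is squared on every pass, which is why the numbers grow so quickly.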
I want to return a list of values from column B in sheet 2 (in this case NBA players) whose entry in column A of sheet 2 contains the value "PG" from cell A3 in sheet 1. Not only do I want it to match "PG", but I also want the salary (column C) to be between $7100 (cell B2 in Sheet 1) and $8000 (cell C2 in Sheet 1). Any help would be appreciated.
You are either going to need to use an array formula or a function that performs array-like calculations. I suggest using the AGGREGATE function. Avoid using full column/row references within an array formula or a function performing array-like calculations, or you may wind up bogging down your system with excessive calculations.
The AGGREGATE function is made up of several individual functions. Depending on which one you choose, it will perform array operations. I am going to suggest function 14 (LARGE). What the following example will do is generate a list of results sorted from largest to smallest that ignores error values, then return the first value from the list. (Use function 15, SMALL, if you want the list sorted from smallest to largest instead.) The thing we will list is the row number for a row that matches ALL your criteria. So the basics of AGGREGATE look like this:
AGGREGATE(Formula #, Error/hidden handling #, Formula, parameter)
The hardest part of this is coming up with the right formula. In the numerator you put the thing you are looking for. In the denominator you place your TRUE/FALSE condition checks. Separate each condition check with *, which will act as an AND function. The thing that makes this work is that TRUE/FALSE convert to 1/0 when they are sent through a math operation. So anything you do not want is FALSE, and anything divided by FALSE becomes a divide by 0, which in turn generates an error. Since AGGREGATE is set to ignore errors, only things that meet your conditions will exist in the list, and since they are being divided by TRUE, which is 1, your thing remains unchanged. So the AGGREGATE function is going to start to look like:
AGGREGATE(14,6,ROW(some range)/((Condition 1)*(Condition 2)*...*(Condition N)),1)
So as alluded to before, 14 sets AGGREGATE to sort the list in descending order (it is the LARGE function; 15, SMALL, sorts in ascending order). 6 tells AGGREGATE to ignore errors, and the 1 tells AGGREGATE to return the first item in its sorted list. If it was 2 instead of 1 it would return the 2nd position. If you ask for a position that is greater than the number of items in the list, AGGREGATE produces an error, which does not get ignored.
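To make the divide-by-condition trick concrete, here is a small Python sketch of it. It is an illustration only: the function name and the sample rows are hypothetical. In Excel's function list, 14 is LARGE and 15 is SMALL, which the `largest` flag below mimics.

```python
def kth_matching_row(row_numbers, conditions, k, largest=True):
    """Mimic AGGREGATE(14 or 15, 6, ROW(range)/(conditions), k)."""
    candidates = []
    for row, cond in zip(row_numbers, conditions):
        try:
            candidates.append(row / int(cond))  # TRUE -> 1, FALSE -> 0
        except ZeroDivisionError:
            continue  # option 6: error values are ignored
    candidates.sort(reverse=largest)  # LARGE sorts descending, SMALL ascending
    if k > len(candidates):
        raise ValueError("k is beyond the number of matches")  # Excel shows an error
    return int(candidates[k - 1])

rows = [2, 3, 4, 5, 6, 7]
conds = [True, False, True, False, False, True]  # rows 2, 4 and 7 match

print(kth_matching_row(rows, conds, 1))                 # 7
print(kth_matching_row(rows, conds, 2))                 # 4
print(kth_matching_row(rows, conds, 1, largest=False))  # 2
```

Rows that fail a condition never reach the sorted list, so the k-th item is always a genuine match.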
So now that there is some understanding of what AGGREGATE does, let's see how we can apply this to your data. For starters, let's assume your data is in rows 2:100 and row 1 is a header row. You will have to adjust the references to suit your data.
CONDITION 1
LEFT($A$2:$A$100,2)="PG"
Checks to see if the first two characters are PG. Based on the data in your screenshot, PG was either to the left of the / or was the only entry. There was also an observation that there was only one / in the cells of column A. If you also need to check whether it is after the /, and with the assumption that it can only be on one side and not both at the same time, you could use this alternative for your condition check:
(LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")
In this case the + is performing the task of an OR function. The caveat mentioned earlier is important because if both sides are TRUE (for example, a cell that contains only PG) then you wind up with TRUE+TRUE which becomes 1+1 which is 2, and we only want to divide by 1 or 0. Though to counter that you could test whether the sum is greater than 0, which gives TRUE/FALSE for each row:
((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)
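The double-counting concern is easy to see with a few values. Below is a short Python sketch using hypothetical position strings; comparing the sum with 0 is one way to bring every result back to 1 or 0 row by row.

```python
# Hypothetical entries from column A
positions = ["PG", "PG/SG", "SG/PG", "PG/PG", "C"]

left = [p[:2] == "PG" for p in positions]    # LEFT(cell, 2) = "PG"
right = [p[-2:] == "PG" for p in positions]  # RIGHT(cell, 2) = "PG"

# Adding booleans acts as OR, but can produce 2
raw = [int(l) + int(r) for l, r in zip(left, right)]

# Testing > 0 per row clamps the result back to 1 or 0
clamped = [int(v > 0) for v in raw]

print(raw)      # [2, 1, 1, 2, 0]
print(clamped)  # [1, 1, 1, 1, 0]
```

Note that a cell holding just "PG" counts on both sides, so the value 2 can appear even when there is no slash.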
CONDITION 2
Check that the salary in C is less than or equal to 80000.
($C$2:$C$100<=80000)
CONDITION 3
Check that the salary in C is greater than or equal to 71000.
($C$2:$C$100>=71000)
Now let's put this all together to get a list of row numbers that meet your conditions. Note that the whole product of conditions sits inside one pair of parentheses so that it forms the denominator:
AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))
Now provided I did not screw up the bracketing in that formula, you can place that formula in a cell and copy it down until it produces errors. As you copy it down, the only thing that will change is the A1 in ROW(A1). It acts like a counter. 1,2,3 etc. so you will get a list of row numbers that meet your criteria. Now we need to convert those row numbers to names.
To find the names, the INDEX function is your friend here. Because it is not part of an array formula or inside a function performing array like calculations, full column reference can be used. So we take our formula that is generating row numbers and place it inside the INDEX function to give:
INDEX(B:B,Row Number)
INDEX(B:B,AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1)))
Now if you hate seeing error codes when you have copied down further than the results, you can place the whole thing inside an IFERROR function to give:
IFERROR(formula,What to display in case of an error)
So for blank entries:
IFERROR(INDEX(B:B,AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))),"")
and custom message:
IFERROR(INDEX(B:B,AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))),"NOT FOUND")
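For comparison, the end result the copied-down formula is aiming for can be sketched in a few lines of Python. Everything here is hypothetical: the player rows, the salary bounds passed in, and the function name. It also swaps the LEFT/RIGHT checks for a split on "/", which is my simplification and not what the formula does.

```python
def matching_names(players, position, low, high):
    """Names whose position list includes `position` and whose salary is in [low, high]."""
    return [name for pos, name, salary in players
            if position in pos.split("/") and low <= salary <= high]

# Hypothetical rows: (position, name, salary)
players = [
    ("PG", "Player A", 7500),
    ("SG/PG", "Player B", 8000),
    ("PG/SG", "Player C", 9000),   # salary too high
    ("C", "Player D", 7600),       # wrong position
    ("PG", "Player E", 7100),
]

print(matching_names(players, "PG", 7100, 8000))
# ['Player A', 'Player B', 'Player E']
```

The list comes back in top-to-bottom order, whereas the copied-down formula returns one match per cell in whatever order the chosen AGGREGATE function sorts the row numbers.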
So now you just need to adjust the references to suit your data. If your data is located on another sheet remember to include the sheet name. A reference to B3:C4 would become:
Sheet1!B3:C4
and if the sheet name has a space in it:
'Space Name'!B3:C4
A colleague has an array of values in "X4:X38". Since these are in a table which may be filtered, she wants to use the subtotal function to sum them - but wants all of the values to be rounded up first.
={SUM(ROUNDUP(X4:X38,0))}
works perfectly well. However,
{SUBTOTAL(9,ROUNDUP(X4:X38,0))}
Generates a generic "The formula you typed contains an error" message. I have tried various obvious things, like putting additional brackets around the "roundup" section, etc.
Any help would be appreciated.
You can do this without a helper column by using this formula:
=SUMPRODUCT(SUBTOTAL(2,OFFSET(X4:X38,ROW(X4:X38)-MIN(ROW(X4:X38)),0,1)),ROUNDUP(X4:X38,0))
OFFSET effectively breaks the range down into individual cells, which are passed to the SUBTOTAL function. SUBTOTAL returns an array of 1 or 0 values based on whether each cell is visible after filtering or not. This array is multiplied by the rounded values to give the overall sum of the rounded visible values.
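The mask-times-values idea can be sketched in Python as follows. The values and the visibility flags are hypothetical, and `roundup` is my own helper that imitates ROUNDUP(value, 0), which rounds away from zero.

```python
import math

def roundup(value):
    """Imitate ROUNDUP(value, 0): round away from zero."""
    return math.ceil(value) if value >= 0 else math.floor(value)

def visible_rounded_sum(values, visible):
    """Sum the rounded-up values of visible rows only (mask of 1/0 times values)."""
    return sum(int(v) * roundup(x) for x, v in zip(values, visible))

values = [1.2, 2.5, 3.0, -0.4]
visible = [True, False, True, True]  # second row is filtered out

print(visible_rounded_sum(values, visible))  # 4
```

Here the visible rows round to 2, 3 and -1, and the hidden 2.5 contributes nothing.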
Another way is to use AGGREGATE function like this
=SUMPRODUCT(ROUNDUP(AGGREGATE(15,7,X4:X38,ROW(INDIRECT("1:"&SUBTOTAL(2,X4:X38)))),0))
Given the complexity a helper column might be the preferable approach
After investigation, looks like this is not possible without helper column.
Add a helper column which rounds the individual values in column X, e.g. type the following formula into cell Y4 and drag down to Y38:
= ROUNDUP(X4,0)
And then instead of
= SUBTOTAL(9,ROUNDUP(X4:X38,0))
use:
= SUBTOTAL(9,Y4:Y38)
Then if necessary you can just hide the helper column. Of course the helper column doesn't have to be column Y, it could be any column, e.g. a column far to the right of where the data ends.
I was wondering how to represent the criteria argument in the function =SUMIF(range, criteria) as instead of ">0" which represent greater than zero, which would add all numbers in the range that are greater than zero. I was wondering how to make it within the range of zero to 8, so "8>0", or something, but I have been googling for hours and cannot find a solution that doesn't involve doing whacky things with SUMIFS which involves other arrays which I do NOT want to get into because I feel there's a simple solution to this that I'm missing...
There's ">=NUM", "<=NUM", ">NUM" and "<NUM".
how do you make it require two of these?
Is there any documentation on this anywhere?
I agree with #user3240704 that SUMIFS is the way to go.
If you insist on using SUMIF only then you can use the following logic:
take the sum of the entire range
deduct the sum of values <-10
further deduct the sum of values >0
Which is the inverse of saying
only sum the values >=-10 and <=0
The formula is:
=SUM(A1:A11)-SUMIF(A1:A11,">0",A1:A11)-SUMIF(A1:A11,"<-10",A1:A11)
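A quick way to convince yourself that the subtraction gives the same answer as a direct range test is to try it on some numbers. The eleven values below are hypothetical stand-ins for A1:A11.

```python
# Hypothetical contents of A1:A11
values = [-15, -10, -5, 0, 3, 8, -12, 1, -1, 20, -10]

# =SUM(A1:A11)-SUMIF(A1:A11,">0")-SUMIF(A1:A11,"<-10")
by_subtraction = (sum(values)
                  - sum(v for v in values if v > 0)
                  - sum(v for v in values if v < -10))

# Direct test: keep only values with -10 <= v <= 0
direct = sum(v for v in values if -10 <= v <= 0)

print(by_subtraction)  # -26
print(direct)          # -26
```

The two excluded groups never overlap, so removing both from the grand total leaves exactly the values inside the range.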
=SUMIFS(A1:A11,A1:A11,">=-10",A1:A11,"<=0")
SUMIFS has many (in this case two) conditions, broken down as follows:
=SUMIFS(A1:A11 - SUM() whatevers in A1:A11
, - That match the following conditions
A1:A11,">=-10" - ALL numbers in A1:A11, that are greater than OR equal to -10
, - AND
A1:A11,"<=0" - ALL numbers in A1:A11, that are less than OR equal to 0
)
I am trying to use index match functions to determine the appropriate rate for the below table.
So for example a consumer loan that is for a person that owns property, the car is 2 years or less in age and the total loan to value ratio is less than 140% should return a value of 5.15%
I believe this is what you wanted...
I would use a series of nested if functions to evaluate which column of LTV I would want the value to come from.
"That is what is done in the AND( ) part. If the value is greater than the 110% and smaller than 140% let's do the Index Match on the 110% Column, Otherwise do it on the 140% Column."
You could extend this for more columns with more IFs in the false condition.
Then it is a simple INDEX match with concatenation. It searches for the three parameters all concatenated in a single range of concatenations.
Hope it helped.
Proof of Concept
In order to achieve the above I had to make a minor edit to your header to be able to distinguish between the two 140% columns.
The functions used in this answer are:
AGGREGATE function
MATCH function
INDEX function
ROW function
IFERROR function
I placed the main part of the formula inside the IFERROR function as a way of dealing with things that may be out of range or when not all the input have been provided. I then assumed that what you were basing your search on would be provided in a series of cells. In my example I assumed the questions would be asked in the range H3 to K3 and I place the results in L3.
The main concept is centered around the INDEX function. I specified the index range as being the height of your table and the width of the percentage rates. Or for this example D2:F9.
=IFERROR(INDEX($D$2:$F$9,row number, column number),"Not Found")
That is the easy part. The more challenging part is determining the row and column number to look in. Let's start with the column number as it is the slightly easier of the two. I assumed the ratio to look for, or rather the header of the column to look in, would be supplied. I basically used this equation to determine the column number:
=MATCH(K3,$D$1:$F$1,0)
which in layman's terms is which column between D and F, counting column D as 1, has the value equal to the contents of K3. So now that there is a formula to determine the column, we can drop that into our original formula and wind up with:
=IFERROR(INDEX($D$2:$F$9,row number,MATCH(K3,$D$1:$F$1,0)),"Not Found")
Now we just need to determine the row number. This is the most complex operation. We are going to basically make a bunch of logical checks and take the first row that matches all the logical checks. The premise here is that a logical check is either TRUE or FALSE. In Excel, 0 is FALSE and every other number is TRUE, and in a math operation TRUE becomes 1 and FALSE becomes 0. So if we multiply a series of logical checks together, only a row that is true in all cases will be equal to 1. The first logical check is the loan type. It will be followed by the living status and then the vehicle age.
=(H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=C2:C9)
Now if you put that into an array formula you will get a series of 1/0 values. We are going to use it inside an AGGREGATE function with a special feature. The AGGREGATE function will perform array-like calculations for some of its functions. We are going to use function 15 (SMALL), which will do this. We are also going to tell the AGGREGATE function to ignore all errors, which is what the 6 does. So in the end what we wind up doing is dividing each row number by the logical check. If the logical check is false, or 0, it will generate a #DIV/0! error, which AGGREGATE will choose to ignore. In the end we wind up with a list of rows which match our logical checks. We then tell AGGREGATE that we want the first result with the ,1. So we wind up with a formula that looks like:
=AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=C2:C9)),1)
While this does provide us with the row number we want, we need to adjust it to make it an index number. In order to do this you need to subtract the number of header rows. In this case 1. So the index row number is given by this formula:
=AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=C2:C9)),1)-1
And when we substitute that back into the earlier equation for the row number, we wind up with the final equation of:
=IFERROR(INDEX($D$2:$F$9,AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=C2:C9)),1)-1,MATCH(K3,$D$1:$F$1,0)),"Not Found")
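The same row-and-column lookup can be sketched in Python to see how the pieces fit. The table, the header labels and most of the rates below are hypothetical; only the 5.15 figure comes from the question, and which column it sits in is my assumption.

```python
# Hypothetical column headers (D1:F1) and table rows (A2:F...)
HEADERS = ["110%", "140%", "Over 140%"]
TABLE = [
    # loan type, living status, vehicle age, rates per header
    ("Consumer", "Owns property", "2 years or less", [4.90, 5.15, 5.65]),
    ("Consumer", "Rents", "2 years or less", [5.40, 5.65, 6.15]),
    ("Business", "Owns property", "2 years or less", [5.10, 5.35, 5.85]),
]

def lookup_rate(loan_type, living_status, vehicle_age, ratio_header):
    """First row matching all three keys, then the column matching the header."""
    try:
        col = HEADERS.index(ratio_header)  # MATCH(K3, $D$1:$F$1, 0)
    except ValueError:
        return "Not Found"                 # IFERROR fallback
    for a, b, c, rates in TABLE:           # first row where all checks are TRUE
        if (a, b, c) == (loan_type, living_status, vehicle_age):
            return rates[col]              # INDEX(range, row, column)
    return "Not Found"

print(lookup_rate("Consumer", "Owns property", "2 years or less", "140%"))  # 5.15
print(lookup_rate("Consumer", "Owns property", "2 years or less", "90%"))   # Not Found
```

Scanning from the top and stopping at the first hit corresponds to asking AGGREGATE function 15 for its first result.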