Excel: AVERAGE IF - excel

Lets say I have an Excel sheet such that:
Column 1 contains salaries
Column 2 contains gender (M/F)
How can I calculate the average salary for females?

=AVERAGE(IF(B1:B10="F",A1:A10))
entered as an array function (ie using Shift-CTRL-Enter rather than just Enter)

Allthough the answer is already answered/accepted I can't resist to add my 2 cents:
Sums and averages normally are displayed at the bottom of a list. You can use the SUBTOTAL() function to calculate sum and average and specify to include or exclude "hidden" values, i.e. values suppressed by a filter. So the solution could be:
create a formula =SUBTOTAL(101,A2:A6) for the average
create a formula =SUBTOTAL(109,A2:A6) for the sum
create an autofilter on the Gender column
Now, when you filter for "F", "M" or all, the correct sum and average will always be computed.
Hope that helps - Good luck

To do it without an array formula just use this:
=SUMIF(B:B,"F",A:A)/COUNTIF(B:B,"F")

Answered question, solid formulas - BUT - beware of conditional averages based on numbers:
In a very similar situation I tried:
"=AVERAGE(IF(F696:F824<0;E696:E824))" (shift control enter) - In English this asked Excel to calculate an average on all numbers in column E if a result in column F was negative (a loss) - e.g. "calculate an average for all items where a loss occured". (no circular reference)
When asked to calculate an average for all items where a gain (x>0) occured, Excel got it right. However, when the conditional average was based on a loss - Excel produced a huge error (7.53 instead of 28.27).
I then opened exactly the same document in Open Office, where Calc got the (correct) answer 28,27 from the same Array formula.
Recalculating the whole thing in steps in Excel (first new column of only losses, new for only gains, new column for only E-values where a loss/gain occured, then a "clean" (unconditional) average calculation, produced the correct values.
Thus, it should be noted that Excel and Open Office produce different answers (and Excel 2007, Swedish language version, gets them wrong) in some cases of conditional averages.
(sorry for my long cautionary tale - but be a bit careful when the condition is a number would be my advice)

You can put filter in top line of your sheet. Then Filter only Fs from Column2, then calculate average of values.

Or you can add additional column with female salaries only.
The formula would be like: =IF(B1 = "Female";A1)
Then you simply calculate average of newly created column.

Salaries gender
2500 M 0 2500*0
2400 F 1 2400×1
2300 F 1 2300×1
sum =2 sum=4700 average=4700/2
Maybe it be complex.

Related

Excel Calculate a running average on filtered data with criteria

I am trying to calculate a running average on a filtered data table. All other posts online use Sumproduct(Subtotal) on the entire range but do not calculate a row by row running average
I am stuck on how to calculate columns C and D.
If column B (Score) > 0, I want to sum and average it under column C (Average Win)
If column B (Score) < 0, I want to sum and average it under column D (Average Loss)
The table is filterable by column A (Type) and the results should look as follows
Progress so far:
I have figured out how to calculate a Cumulative score based on filtered data. However this does not fully solve my problem. I appreciate any help!
=SUBTOTAL(3,B3)*SUBTOTAL(9,B$3:B3)
SUBTOTAL(3,B3) checks if the current row is visible, SUBTOTAL(9,B$3:b3) sums the values.
Final update needed
Jos - Thank you for your detailed explanation on how subtotal() works. I learned a ton through your explanation and will continue to study it. This is my first time being exposed to structured referencing so some of the syntax is a bit confusing to me still
The last formula I need is a running win % column where a Win is defined by score > 0. Please see the picture below
My assumptions believe that the same formula would work, except that we average a 1 or 0 in each row instead of the [Score] column.
Using the prior solution, why can't we modify the output of your prior solution to calculate a running win %?
[...] IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Win])))),0)
Where [Win] is a helper column with the outputs 1 for win, 0 for loss.
This could be done by saying
if([#score]>0,1,0)
Instead of averaging out the actual #Score, this would average out a column of 1's and 0's with the desired output 0%, 50%, 66%, etc.
I am aware that the solution I provided does not work but I am trying to embrace the correct logic. I still struggle to understand how these structured column references are calculated on a row by row basis.
For example: Average(If([Score]>0,[Score])
How is this calculated on a row by row basis? When A3 does If([Score] > 0,), does this equal If({-10}>0)? When on A4, does If([Score]>0) equal If({-10,20} >0)? Thank you for your patience and help thus far.
I disagree with your result for Average Loss for the last row of your unfiltered table (surely -9.33...?), but try this for Average Win:
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
Same formula for Average Loss, changing [Score]>0 to [Score]<0.
Explanation:
Using the data you provided and assuming:
The table's top-left cell is in A1
The table is filtered on the Type column for "A"
In order to determine which rows are filtered, we must pass an array of range references - i.e. for each cell within a chosen column of the table - to the SUBTOTAL function. It's a touch unfortunate that such an array of range references can only be generated via a volatile function (INDIRECT or OFFSET), but here, unless we resort to helper columns, we are left with no choice.
INDEX([Score],1)
simply returns a range reference to the first cell within the Score column. When using Excel tables, it's preferable not to write formulas which include a mixture of structured and non-structured referencing, even if that results in slightly longer expressions. So here, for example, we would not reference A2 within the formula.
ROW([Score])-MIN(ROW([Score]))
generates an array of integers from 0 up to one fewer than the number of rows in the table, i.e.
{0;1;2;3;4}
and so
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
becomes
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(A2,{0;1;2;3;4},)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
OFFSET then generates an array of range references (though note that you will not be able to 'see' this step within the Evaluate Formula window - rather, an array of #VALUE! errors is displayed):
=IFERROR(AVERAGE(IF(SUBTOTAL(3,{A2;A3;A4;A5;A6}),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
SUBTOTAL then determines which of these range references is filtered (note that care must be given here to the choice of first parameter), returning the relevant Boolean, so that:
SUBTOTAL(3,{A2;A3;A4;A5;A6})
resolves to:
{1;1;1;0;1}
And so we now have:
=IFERROR(AVERAGE(IF({1;1;1;0;1},IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
and the rest is straightforward.
So, I would use averageifs().
=averageifs(B:B,B:B,">=1",A:A,"A")
is one example, note I have added the control of Type A in the example.
See:

How can you in Excel total predefined values depending on cell values

I have a spreadsheet with two tabs. The first one contains Vehicle Types and a numeric score value.
Second Tab has like a variety of these vehicle types and what should be the total score. Depending on the vehicle types present in the respective neighbour cell.
See images below for illustration.
Is there a way via formula to get the total, in Column B in sheet 2, of the corresponding numeric values of column a from sheet 1?
For example, as per the illustration B2 in sheet would total 3; whereby in sheet 1 bus has a score of 1 and car 2.
Update:
As per the answer below, I have used the formula;
=SUMPRODUCT(ISNUMBER(FIND(" "&sheet1!A$2:A$4&" "," "&SUBSTITUTE(A4,CHAR(10)," ")&" "))*sheet1!B$2:B$4)
However, I am unfortunately getting zero as the value. Changing the line breaks in column A in sheet2 I am duly able to get the total. Is there a way to do it so irrespective of how the list is presented in the column the total will work?
I think you are after something like this:
Formula in E2:
=SUMPRODUCT(VLOOKUP(FILTERXML("<t><s>"&SUBSTITUTE(D2,CHAR(10),"</s><s>")&"</s></t>","//s"),A$2:B$4,2,FALSE))
If one has O365 you could just use SUM instead since it would auto-CSE the formula.
If you don't have Excel 2013 or later, you could try the following as another option (shorter but not my favourite):
=SUMPRODUCT(ISNUMBER(FIND(" "&A$2:A$4&" "," "&SUBSTITUTE(D2,CHAR(10)," ")&" "))*B$2:B$4)

How to calculate an average grade from mix of number grades and letter grades whose values are determined via lookup table

Data Table
Lookup Table
I want to calculate the average grade score of each pupil using an array formula and without the use of helper columns. If all grade are numbers its simple enough to average those, but when letter grades are introduced into the mix I need to convert these to their number values using the lookup table and then average the whole row for each pupil.
So far I was able to take one row and convert all grades that are letters to their corresponding values using the formula below.
=TRANSPOSE(INDIRECT("N"&(MATCH(TRANSPOSE(B2:E2),Table2[Grade],0)+1)))
which returns:
={#N/A,#N/A,7,3}
I then thought, great I got numbers in place of the letter grades, lets just "average" this result like so:
=AVERAGE(TRANSPOSE(INDIRECT("N"&(MATCH(TRANSPOSE(B2:E2),Table2[Grade],0)+1))))
which gives #N/A, which I do not understand when the following is completely allowed:
=AVERAGE({1,2,3})
One way to solve the case is to add all numerical grades to the 2 column Grade table you have in column M and N (technically it is not using helper columns?), then sort the Grade column in ascending order as shown below:
Then you can use the following array formula to find the average grade in cell F2:
=AVERAGE(LOOKUP(B2:E2,Tbl_Grade[Grade],Tbl_Grade[Attainment]))
Being an array formula, you MUST press Ctrl+Shift+Enter upon finishing the formula in the formula bar otherwise it will not function correctly. Then you can simply drag the formula down to apply across.
The logic is to use LOOKUP function to return the corresponding attainment for each given grade regardless if it is an actual number or a letter.
Let me know if you have any questions. Cheers :)
Using VLOOKUPs seems like it would be much simpler:
=AVERAGE(B2,C2,VLOOKUP(D2,$M$2:$N$7,2,FALSE),VLOOKUP(E2,$M$2:$N$7,2,FALSE))
If you can tolerate changing "A*" to say "A+" (or something else that doesn't get interpreted as a wildcard) this may get you going:
=(SUM(B2:E2)+SUMPRODUCT(SUMIF(Table2[Grade],"="&B2:E2,Table2[Attainment])))/COUNTA(B2:E2)
The first SUM sums the numeric grades; the SUMPRODUCT(SUMIF(...)) sums the letter grades. Then divide the total by the number of grades (COUNTA).
Hope that helps

Getting the PERCENTRANK.INC in PowerPIvot/DAX

I have been tasked with converting a a worksheet from standard Excel into PowerPivot. However, I have hit a roadblock with the PERCENTRANK.INC function which is not available in DAX. I have come close to replicating it using a formula, but there are definite differences in the calculation.
Does anyone know how Excel calculates PERCENTRANK.INC()?
Formula in cell D2: =(COUNT($A$2:$A$10)-B2)/(COUNT($A$2:$A$10)-1)
Formula in cell B2: =RANK.EQ(A2,$A$2:$A$10,0)
Formula in cell C2: =PERCENTRANK.INC($A$2:$A$10,A2)
Edit:
It's seems strange to me that there are so many "standard" ways to calculate PERCENTRANK that has have slightly different results.
Using your example of your 9-number set of 1,2,3,4,4,6,7,8,9, depending on which "authority" I used, the third value (3) had a Percent Rank of 25.0%, 27.0%, 27.8% or 30.0%.
Obviously we'll go with the one that gives your desired result, matching PERCENTRANK.INC.
   PERCENTRANK.INC is calculated as:
          [count of values lower than the given value]
                             ÷
    [count of all values in the set excluding the given value]
so, if our range of 1,2,3,4,4,6,7,8,9is in A1:A9, we could use this formula in B1:
=COUNTIF($A$1:$A$9,"<"&A1)/(COUNT($A$1:$A$9)-1)
...and copy or "fill" it down for results:
0.0%, 12.5%, 25.0%, 37.5%, 37.5%, 62.5%, 75.0%, 87.5%, 100.0%
Original Answer
I think you just want to calculate the PERCENTRANK for current row
value. Based on the logic for PERCENTRANK, we can add a RANK column in
table and achieve same logic based on this column.
Create a Rank column.
Rank = RANKX(Table6,Table6[Value])
Then create the PctRank based on the Rank column.
PctRank = (COUNTA(Table6[Name])-Table6[Rank])/(COUNTA(Table6[Name])-1)
(Source)
For reference here is the DAX formula based off ashleedawg's answer which includes ignoring cells with no values in them.
=ROUNDDOWN(COUNTROWS(FILTER('Lookup_Query','Lookup_Query'[Entrances]<EARLIER('Lookup_Query'[Entrances]) && ISBLANK('Lookup_Query'[Entrances])=FALSE()))/(COUNTROWS(FILTER('Lookup_Query',ISBLANK('Lookup_Query'[Entrances])=FALSE()))-1),3)

In Excel 2007, how can I SUMIFS indices of multiple columns from a named range?

I am analysing library statistics relating to loans made by particular user categories. The loan data forms the named range LoansToApril2013. Excel 2007 is quite happy for me to use an index range as the sum range in a SUMIF:
=SUMIF(INDEX(LoansToApril2013,0,3),10,INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6))
Here 10 indicates a specific user category, and this sums loans made to that group from three columns. By "index range" I'm referring to the
INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6)
sum_range value.
However, if I switch to using a SUMIFS to add further criteria, Excel returns a #VALUE error if an index range is used. It will only accept a single index.
=SUMIFS(INDEX(LoansToApril2013,0,4),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
works fine
=SUMIFS(INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
returns #value, and I'm not sure why.
Interestingly,
=SUMIFS(INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,4),INDEX(LoansToApril2013,0,3),1,INDEX(LoansToApril2013,0,1),"PTFBL")
is also accepted and returns the same as the first one with a single index.
I haven't been able to find any documentation or comments relating to this. Does anyone know if there is an alternative structure that would allow SUMIFS to conditionally sum index values from three columns? I'd rather not use three separate formulae and add them together, though it's possible.
The sumifs formula is modelled after an array formula and comparisons in the sumifs need to be the same size, the last one mimics a single column in the LoansToApril2013 array column 4:4 is column 4.
The second to bottom one is 3 columns wide and the comparison columns are 1 column wide causing the error.
sumifs can't do that, but sumproduct can
Example:
X 1 1 1
Y 2 2 2
Z 3 3 3
starting in A1
the formula =SUMPRODUCT((A1:A3="X")*B1:D3) gives the answer 3, and altering the value X in the formula to Y or Z changes the returned value to the appropriate sum of the lines.
Note that this will not work if you have text in the area - it will return #VALUE!
If you can't avoid the text, then you need an array formula. Using the same example, the formula would be =SUM(IF(A1:A3="X",B1:D3)), and to enter it as an array formula, you need to use CTRL+SHIFT+ENTER to enter the formula - you should notice that excel puts { } around the formula. It treats any text as zero, so it will successfully add up the numbers it finds even if you have text in one of the boxes (e.g. change one of the 1's in the example to be blah and the total will be 2 - the formula will add the two remaining 1s in the line)
The two answers above and a bit of searching allowed me to find a formula that worked. I'll put it here for posterity, because questions with no final outcome are a pain for future readers.
=SUMPRODUCT( (INDEX(LoansToApril2013,0,3)=C4) * (INDEX(LoansToApril2013,0,1)="PTFBL") * INDEX(LoansToApril2013,0,4):INDEX(LoansToApril2013,0,6))
This totals up values in columns 4-6 of the LoansToApril2013 range, where the value in column 3 equals the value in C4 (a.k.a. "the cell to the left of this one with the formula") AND the value in column 1 is "PTFBL".
Despite appearances, it isn't multiplying anything by anything else. I found an explanation on this page, but basically the asterisks are adding criteria to the function. Note that criteria are enclosed in their own brackets, while the range isn't.
If you want to use names ranges you need to use INDIRECT for the Index commands.
I used that formula to check for conditions in two columns, and then SUM the results in a table which has 12 columns for the months (the column is chosen by a helper cell which is 1 to 12 [L4]).
So you can do if:
Dept (1 column name range [C6]) = Sales [D6];
Region (1 column name range [C3]) = USA [D3];
SUM figures in the 12 column monthly named range table [E7] for that 1 single month [L4] for those people/products/line item
Just copy the formula across your report page which has columns 1-12 for the months and you get a monthly summary report with 2 conditions.
=SUMPRODUCT( (INDEX(INDIRECT($C$6),0,1)=$D$6) * (INDEX(INDIRECT($C$3),0,1)=$D$3) * INDEX(INDIRECT($E7),0,L$4))

Resources