I have been tasked with converting a a worksheet from standard Excel into PowerPivot. However, I have hit a roadblock with the PERCENTRANK.INC function which is not available in DAX. I have come close to replicating it using a formula, but there are definite differences in the calculation.
Does anyone know how Excel calculates PERCENTRANK.INC()?
Formula in cell D2: =(COUNT($A$2:$A$10)-B2)/(COUNT($A$2:$A$10)-1)
Formula in cell B2: =RANK.EQ(A2,$A$2:$A$10,0)
Formula in cell C2: =PERCENTRANK.INC($A$2:$A$10,A2)
Edit:
It's seems strange to me that there are so many "standard" ways to calculate PERCENTRANK that has have slightly different results.
Using your example of your 9-number set of 1,2,3,4,4,6,7,8,9, depending on which "authority" I used, the third value (3) had a Percent Rank of 25.0%, 27.0%, 27.8% or 30.0%.
Obviously we'll go with the one that gives your desired result, matching PERCENTRANK.INC.
PERCENTRANK.INC is calculated as:
[count of values lower than the given value]
÷
[count of all values in the set excluding the given value]
so, if our range of 1,2,3,4,4,6,7,8,9is in A1:A9, we could use this formula in B1:
=COUNTIF($A$1:$A$9,"<"&A1)/(COUNT($A$1:$A$9)-1)
...and copy or "fill" it down for results:
0.0%, 12.5%, 25.0%, 37.5%, 37.5%, 62.5%, 75.0%, 87.5%, 100.0%
Original Answer
I think you just want to calculate the PERCENTRANK for current row
value. Based on the logic for PERCENTRANK, we can add a RANK column in
table and achieve same logic based on this column.
Create a Rank column.
Rank = RANKX(Table6,Table6[Value])
Then create the PctRank based on the Rank column.
PctRank = (COUNTA(Table6[Name])-Table6[Rank])/(COUNTA(Table6[Name])-1)
(Source)
For reference here is the DAX formula based off ashleedawg's answer which includes ignoring cells with no values in them.
=ROUNDDOWN(COUNTROWS(FILTER('Lookup_Query','Lookup_Query'[Entrances]<EARLIER('Lookup_Query'[Entrances]) && ISBLANK('Lookup_Query'[Entrances])=FALSE()))/(COUNTROWS(FILTER('Lookup_Query',ISBLANK('Lookup_Query'[Entrances])=FALSE()))-1),3)
Related
I am trying to calculate a running average on a filtered data table. All other posts online use Sumproduct(Subtotal) on the entire range but do not calculate a row by row running average
I am stuck on how to calculate columns C and D.
If column B (Score) > 0, I want to sum and average it under column C (Average Win)
If column B (Score) < 0, I want to sum and average it under column D (Average Loss)
The table is filterable by column A (Type) and the results should look as follows
Progress so far:
I have figured out how to calculate a Cumulative score based on filtered data. However this does not fully solve my problem. I appreciate any help!
=SUBTOTAL(3,B3)*SUBTOTAL(9,B$3:B3)
SUBTOTAL(3,B3) checks if the current row is visible, SUBTOTAL(9,B$3:b3) sums the values.
Final update needed
Jos - Thank you for your detailed explanation on how subtotal() works. I learned a ton through your explanation and will continue to study it. This is my first time being exposed to structured referencing so some of the syntax is a bit confusing to me still
The last formula I need is a running win % column where a Win is defined by score > 0. Please see the picture below
My assumptions believe that the same formula would work, except that we average a 1 or 0 in each row instead of the [Score] column.
Using the prior solution, why can't we modify the output of your prior solution to calculate a running win %?
[...] IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Win])))),0)
Where [Win] is a helper column with the outputs 1 for win, 0 for loss.
This could be done by saying
if([#score]>0,1,0)
Instead of averaging out the actual #Score, this would average out a column of 1's and 0's with the desired output 0%, 50%, 66%, etc.
I am aware that the solution I provided does not work but I am trying to embrace the correct logic. I still struggle to understand how these structured column references are calculated on a row by row basis.
For example: Average(If([Score]>0,[Score])
How is this calculated on a row by row basis? When A3 does If([Score] > 0,), does this equal If({-10}>0)? When on A4, does If([Score]>0) equal If({-10,20} >0)? Thank you for your patience and help thus far.
I disagree with your result for Average Loss for the last row of your unfiltered table (surely -9.33...?), but try this for Average Win:
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
Same formula for Average Loss, changing [Score]>0 to [Score]<0.
Explanation:
Using the data you provided and assuming:
The table's top-left cell is in A1
The table is filtered on the Type column for "A"
In order to determine which rows are filtered, we must pass an array of range references - i.e. for each cell within a chosen column of the table - to the SUBTOTAL function. It's a touch unfortunate that such an array of range references can only be generated via a volatile function (INDIRECT or OFFSET), but here, unless we resort to helper columns, we are left with no choice.
INDEX([Score],1)
simply returns a range reference to the first cell within the Score column. When using Excel tables, it's preferable not to write formulas which include a mixture of structured and non-structured referencing, even if that results in slightly longer expressions. So here, for example, we would not reference A2 within the formula.
ROW([Score])-MIN(ROW([Score]))
generates an array of integers from 0 up to one fewer than the number of rows in the table, i.e.
{0;1;2;3;4}
and so
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
becomes
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(A2,{0;1;2;3;4},)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
OFFSET then generates an array of range references (though note that you will not be able to 'see' this step within the Evaluate Formula window - rather, an array of #VALUE! errors is displayed):
=IFERROR(AVERAGE(IF(SUBTOTAL(3,{A2;A3;A4;A5;A6}),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
SUBTOTAL then determines which of these range references is filtered (note that care must be given here to the choice of first parameter), returning the relevant Boolean, so that:
SUBTOTAL(3,{A2;A3;A4;A5;A6})
resolves to:
{1;1;1;0;1}
And so we now have:
=IFERROR(AVERAGE(IF({1;1;1;0;1},IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
and the rest is straightforward.
So, I would use averageifs().
=averageifs(B:B,B:B,">=1",A:A,"A")
is one example, note I have added the control of Type A in the example.
See:
I have 2 sheets in a workbook (Sheet1, Sheet2).
Sheet 2 contains a table (Named Table1) with 5 columns:
Takeaways
Household
Clothing
Fuel
Groceries
On sheet one, I have 2 columns:
Expense Name
Expense Total
Now, what I am trying to do is:
Set the range for the Expense Name (Range 1)
Set the range for the Expense Total (Range 2)
Compare Range 1 with the respective column in the table and only add up the values for matches
For example, in Range 1 (B6:B16):
BP
Caltex
McDonalds
KFC
In Range 2 (C6:C16):
300
400
200
150
Now, all I want to do is add up the values for the Takeaways (McDonalds, KFC) and exclude anything that DOES NOT match the criteria.
So my sum total will be all occurrences of Takeaways - provided they are listed in my table - 350 in this case.
But I cannot seem to get the formula to work.
I used these sources:
https://exceljet.net/excel-functions/excel-sumifs-function
Selecting a Specific Column of a Named Range for the SUMIF Function
and ended up with this formula:
=SUMIF($B$6:$B$16;Table1[Takeaways];C6:C16)
This source:
https://excelchamps.com/blog/sumif-sumifs-or-logic/
and ended up with this formula:
=SUM(SUMIFS(C6:C16;B6:B16;Table1[Takeaways]))
Both formulae return 0.
BUT, with BOTH of them, if I change Table1[Takeaways] to "McDonalds", then it correctly identifies every occurrence of the word "McDonalds" in Range 1.
EDIT:
I have updated the formulae above to match the images below.
This is the table that contains the references:
This table contains the data:
Formula:
Cell C4 (Next to Takeaways): =SUMIF($B$6:B$16;Table1[Takeaways];C6:C16)
Cell C5 (Next to Fuel): =SUM(SUMIFS(C6:C16;B6:B16;Table1[Fuel]))
It appears that ONLY BP is being detected in the formula.
This is a an output table when I use the formulae with a single cell reference and not a table or used range:
Formula:
Cell F4 (Next to BP): =SUMIF($B$6:B$16;"BP";C6:C16)
Cell F5 (Next to Caltex): =SUM(SUMIFS(C6:C16;B6:B16;"Caltex"))
Cell F6 (Next to McDonalds): =SUMIF($B$6:B$16;"McDonalds";C6:C16)
Cell F7 (Next to KFC): =SUM(SUMIFS(C6:C16;B6:B16;"KFC"))
If I understand correctly what you're trying to achieve, I think your setup is not right conceptually.
It looks like you're trying to track expenses, and each expense (or payee) is allocated to a category ("Takeaways", "Household" etc.). From a relational-model point of view, your second table (which defines the category for each expense/payee) should only have two columns (or variables): Expense Name and Expense Category.
The table you set up ('Sheet 2') uses the categories (i.e., possible values) as different columns (i.e., variables). But there's only variable, namely the "Expense Category", and the categories themselves are the possible values.
If you set it up like that, the problem changes: you can add a dependent column to your first table that shows the category for each payee (or "Expense Name"), using a VLOOKUP() from the second table.
You can then sum the expenses for all payees matching that category.
Note: I've created the illustration using LibreOffice Calc, so there might be some small differences, but the logic is the same.
Without seeing the data in L and K I can't give you a full answer - but likely it's to do with the way you're pulling your Array
Try something similar to this
=SUMPRODUCT(SUMIFS($L$11:$L$43,$K$11:$K$43,CHOOSE({1,2},Takeaways,"anything else you wanted to sum")))
Remember SUMIFS is for multiple criteria, so if you're only calculating one, you'll need =SUMPRODUCT(SUMIF(
The way the above works is with vertical vectors only, but changing your named ranges so the table of 2 columns is 2 named ranges instead should be okay - unless it's part of your requirements
Table 2 would become expense_Name and expense_Total etc
I was about to close this as a duplicate of my own question here but there is a bit of a difference in using a named range I think. However the logic behind this follows more or less the same approach.
Working further on my partial solution below I derived the following formula:
=SUMPRODUCT(COUNTIF(Table1[Takeaways];Range1)*Range2)
The COUNTIF() part counts the number of occurrences of the cell value in your table. Therefore make sure there are no duplicates in your table. If the value is present in the table the result of COUNTIF() will be 0. This way we create a matrix of 1's and 0's. By multiplying and the use of SUMPRODUCT() we force excel to perform matrix calculations and return the correct result.
Partial solution
I used the following formula:
=SUMPRODUCT(ISNUMBER(MATCH(Range1;Table1[Takeaways]))*Range2)
The formula does the following:
The MATCH()checks if the value in Range1 is present in your table and returns the position of the matching value in your table.
The ISNUMBER() checks if a match is found by checking if the MATCH() fucntion returned a number
Multiplying this with Range2 forces matrix calculation, using the SUMPRODUCT() function
EDIT:
This worked for a really limited sample. As soon as I added the fourth row to my data the formula stopped working as intended. See screenshot:
It took the first two values into the sum correctly, the fourth is not taken into account.
I am wondering, if there is a general way to express, that only visible rows of a formula should be taken into account.
If I have for example a formula sumif($E5:$E100; "ABC"; $F5:F100) it would be very helpful, if there would be a way to express, that the given ranges should only take visible cells into account. I could imagine that a kind of prefix can be specified to a range construct like % or that like. For example the formula then would look like sumif(%$E5:%$E100; "ABC"; %F5:%F100) to make clear, that in the given ranges only visible rows should be taken into account.
Same would then for example be for sum(%A1:%A100) which would mean, that in the range between A1 and A100 only visible cells should be taken to sum up the cells.
The point is, that this construct could be taken inside any kind of formula, no matter what it is.
Thanks in advance
Georg
Generically to sum sumrange based on a match in criteriarange.....but only for visible rows you can use this formula: =SUMPRODUCT((criteriarange=criteria)+0,SUBTOTAL(109,OFFSET(sumrange,ROW(sumrange)-MIN(ROW(sumrange)),0,1,1))) The first part (criteriarange=criteria)+0 just checks the criteria for each row and returns 1 for a match or 0 OFFSET returns an "array of ranges" with each range in this case being a single cell from the sum range. SUBTOTAL can process that and with the sum function (109) gives the "sum" (i.e. the value) of each cell, only when visible. – SUMPRODUCT then multiplies the two ranges and sums the result, effectively giving you the sum of visible rows where the criteria matches
Try This
=SUMPRODUCT(($E$5:$E$100="ABC")+0,SUBTOTAL(109,OFFSET($F$5:$F$100,ROW($F$5:$F$100)-MIN(ROW($F$5:$F$100)),0,1,1)))
In my source data I've got a columns with data for every month and for few years.
In my working spreadsheet I want to sum the source data with 4 criteria. 3 of them are constant and one is dynamic (Month&Year). Depends on the chosen month and year I want to sum correct column in source data.
I've found a similar topic - link below:
SUMIF dynamically change summing column
However, I am getting #Values error if I input a formula. I've check that all data is text and compare with Exact function as well.
Below is my formula:
=SUMIFS(INDEX(IBRACT[#All];0;MATCH(Sheet3!G$1;IBRACT[#Headers];0));IBRACT[Entity];$A$1;IBRACT[SKU];$D14)
IBRACT is the name of table.
Below is the link to screen of evaluation (in this example I wanted to sum 6th column which in the spreadsheet is column "F"). Next step of evaluation shows #Value.
https://imgur.com/JHq80BM
Have anybody any idea how to solve this problem?
Best,
Wiktor
IBRACT[#All] includes the header row which is one row more than the two criteria ranges, hence the #VALUE! error. Just change the sum range to only include the data body range.
=SUMIFS(INDEX(IBRACT; 0; MATCH(Sheet3!G$1; IBRACT[#Headers]; 0)); IBRACT[Entity]; $A$1; IBRACT[SKU]; $D14)
I have a question regarding the RANK function in MS Excel 2010. I have a large worksheet whose rows I want to rank based on the values in a column. These values can be positive or negative. I found helpful advice here which explains how to rank the values in a column while excluding all values that equal zero from the ranking and the ranking count. They use the following formula:
IF(O24<0, RANK(O24,$O$24:$O$29) - COUNTIF($O$24:$O$29,0), IF(O24=0, "", RANK(O24,$O$24:$O$29)))
This works great, but it would be even better if I could rank the values only if a corresponding value in the same row but a different column meets certain criteria.
Is something like this possible and how would I do it? How would I update the example formula above to make the change work? Thank you very much in advance for your help.
P.S.: I tried putting in a table but it didn't really work, sorry...
You can use COUNTIFS function to rank based on a condition in another column, e.g. this formula in row 24 copied down [edited to include extra IF)
=IF(O24=0,"",IF(N24="x",COUNTIFS(O$24:O$29,">"&O24,O$24:O$29,"<>0",N$24:N$29,"x")+1,""))
That will rank high to low where column N = "x", ignoring zero values
See this example columns N and O contain random values - press F9 to re-generate new random values and formula results in column Q will change accordingly
It is certainly possible to keep creating more complex formulas whenever you're adding new criteria on which to rank. However by creating intermediary columns with single-step formulas, you'll make your spreadsheet easier to comprehend and easier to add new criteria or edit the existing.
My suggestion is to create a column that excludes the zero's (let's assume this is in column P): =IF(O24 = 0, "", O24)
Then in column R, to eliminate negative values (this step is unnecessary, but your original formula does something similar): =IF(P24 = "", "", P24 - MIN(0, MIN($O$24:$O$29)))
Now in column S, add your newest criteria: =IF(OR(R24="", [enter newest criteria here]), "", R24)
Finally, column T performs the ranking of only the selected rows: =IF(S24="", "", RANK(S24, S$24:S$29))
If exposing columns P, R and S is bothersome, you can always hide them.
A rewording of the answer from barry houdini, using table format.
Value_Col is the column with the values to rank. Group_Column is the column with the group by values, to rank within groups
=COUNTIFS([Value_Col], ">"&[#[Value]], [Value_Column],"<>0", [Group_Column], [#[Group]]) +1