Break ties in RANKX Powerpivot formula - excel

I can rank my data with this formula, which groups by Year, Trust and ID, and ranks the Areas.
rankx(
filter(Table,
[Year]=earlier([Year])&&[Trust]=earlier([Trust])&&[ID]=earlier([ID])),
[Area], ,1,Dense)
This works fine - unless you have data where the same Area appears more than once in the same group, whereupon it gives all rows the rank of 1. Is there any way to force unique rank values? So two rows that have the same Area would be given the rank of 1 and 2 (in an arbitrary order)? Thank you for your time.

Assuming you don't have duplicate rows in your table, you can add another column as a tie-breaker in your expression.
Suppose your table has an additional column, [Name], that is distinct between your multiple [Area] rows. Then you could write your formula like this:
= RANKX(
FILTER(Table,
[Year] = EARLIER([Year]) &&
[Trust] = EARLIER([Trust]) &&
[ID] = EARLIER([ID])),
[Area] & [Name], , 1, Dense)
You can append as many columns as you need to get the tie-breaking done.

Related

Sum in pivot hierachy

I have the following DAX-formula to retrieve the opening and closing balance for a list of products.
=CALCULATE(MAX(transactions[Balance]);
FILTER(transactions;
transactions[ID] = MAX(transactions[ID])
)
)
This works on row level in my Pivot but when I group this och Product category level I only get one value and not the sum of all the product rows.
My data contains of rows for each transaction and each row have a columns with current balance.
How do I sum each row to get the group sum for the above category "00-01" 26784 and 283500?
One way to do this is to leverage an iterative function like a SUMX.
Assuming that your EndValue is the measure that you defined.
SUMX_Example := SUMX( VALUES ( transactions[ID] ) , [EndValue] )
Which will do the following:
Though VALUES ( transactions[ID] ) it will generate a list of your IDs
For each ID it will run your already created [EndValue] measure
Sum the result of each ID's end value
This is of course assuming [ID] does not cover categories. If ID does cross categories, then you would first do a SUMX using category, with another SUMX that does ID

How to compare one row to others in DAX in Excel

I have this table which has foreign keys from several other keys:
Basically, this table shows which students registered in which module run by which teacher in what term.
I want to query the following:
How many students have registered for more than one module run by a given tutor?
It will look something like this:
For example, Vasiliy Kuznetsov runs two modules: FunPro and NO. If one student registers for both of them, he is counted as one.
My sql oriented mind is telling me this: Count all the rows in which student_id and tutor_id are the same. For example, in one row student_id is 5 and tutor_id is 10, and the same is true for the third row. Then, I count it as one.
How can I do that with DAX formulas?
RowCount:=
COUNTROWS( ModuleRegistration )
StudentsWithTwoOrMoreRegistrations:=
COUNTROWS(
FILTER(
VALUES( ModuleRegistration[Student_ID] )
,[RowCount] >= 2
)
)
I refer to arguments positionally, thus the first argument to a function is (1), the second (2), and so on.
So, [RowCount] is trivial.
[StudentsWithTwoOrMoreRegistrations] is a bit more involved. DAX, being a functional language, is best understood inside-out.
FILTER() takes a table expression in (1) and evaluates a boolean predicate, (2), for each row in (1). It returns all rows from (1) for which (2) evaluates to true.
Our FILTER()'s (1) is VALUES( ModuleRegistration[Student_ID] ). VALUES() returns the unique rows from a field based on current filter context (it respects slicers and filters in the pivot table). Thus, we will return some subset of the unique list of [Student_ID]s.
Our FILTER()'s (2) is [RowCount] >= 2. For each [Student_ID] in (1), we'll evaluate [RowCount], checking how many times that student appears in ModuleRegistration. [RowCount] is evaluated in the combination of filter context from the pivot table (the [Faculty Name] field in your sample pivot provides filter context) and row context from FILTER()'s (1). Thus it counts how many times the student appears in ModuleRegistration for the [Faculty Name] on the pivot table row.
We check that [RowCount] is >= 2.
You've not indicated if your measure needs to handle grand totals, or how you might want to see that. If you need more help for the grand total to get it to behave the way you like, let me know.
Edit for grand total
There are a few ways you might want to handle grand totals. I'm gong to assume that you want a unique count of students.
StudentsWithTwoOrMoreRegistrations:=
COUNTROWS(
SUMMARIZE(
FILTER(
SUMMARIZE(
ModuleRegistration
,ModuleRegistration[Tutor_ID]
,ModuleRegistration[Student_ID]
)
,[RowCount] >= 2
)
,ModuleRegistration[Student_ID]
)
)
WTF happened to our measure?
Let's examine:
Starting with the innermost SUMMARIZE(). SUMMARIZE() navigates relationships outward from the table in (1) and groups by the columns listed in (2)-(N) (these don't have to be from the table in (1), but must be reachable by navigating relationships).
This is equivalent to the following in SQL:
SELECT
mr.Tutor_ID
,mr.Student_ID
FROM ModuleRegistration mr
We use FILTER() on this table like earlier. [RowCount] is evaluated in the combination of filter context from the pivot table and the row in the table, defined by our SUMMARIZE() above.
Now our row context is instead of just a student, a student-tutor pair. This pair will have a [RowCount] >= 2 when the student has taken more than one module from a tutor.
Our FILTER() returns the pairs which have a [RowCount] >= 2. This output table has two fields, [Tutor_ID] and [Student_ID], but we want to count distinct [Student_ID]s out of this.
Thus, we use the table from FILTER() as our (1) in the outer SUMMARIZE(). We group only by the values of [Student_ID]. We then count the rows of this table.
When only one [Faculty_Name] is in context, e.g. on a pivot table row, then our inner SUMMARIZE() is grouping by a single value of [Tutor_ID] and whatever [Student_ID]s are associated with it. This is identical to our earlier measure.
When we have many [Tutor_ID]s in context, like in the grand total, then we'll see the appropriate behavior of only counting each [Student_ID] once.

Calculated Measure Based on Condition in Dax

I have a requirement in Power Pivot where I need to show value based on the Dimension Column value.
If value is Selling Price then Amount Value of Selling Price from Table1 should display, if Cost Price then Cost Price Amount Should display, if it is Profit the ((SellingPrice-CostPrice)/SellingPrice) should display
My Table Structure is
Table1:-
Table2:-
Required Output:-
If tried the below option:-
1. Calculated Measure:=If(Table[Category]="CostPrice",[CostValue],If(Table1[category]="SellingPrice",[SalesValue],([SalesValue]-[CostValue]/[SalesValue])))
*[CostValue]:=Calculate(Sum(Table1[Amount]),Table1[Category]="CostPrice")
*[Sales Value]:=Calculate(Sum(Table1[Amount]),Table1[Category]="SellingPrice")
Tried this in both Calculated Column and Measure but not giving me required output.
Cost:=
CALCULATE(
SUM( Table1[Amount] )
,Table1[Category] = "CostPrice"
)
Selling:=
CALCULATE(
SUM( Table1[Amount] )
,Table1[Category] = "SellingPrice"
)
Profit:=
DIVIDE(
[Selling] - [Cost]
,[Selling]
)
ConditionalMeasure:=
IF(
HASONEFILTER( Table2[Category] )
,SWITCH(
VALUES( Table2[Category] )
,"CostPrice"
,[Cost]
,"SellingPrice"
,[Selling]
,"Profit"
,[Profit]
)
,[Profit]
)
HASONEFILTER() checks that there is filter context on the named field and that the filter context includes only a single distinct value.
This is just a guard to allow our SWITCH() to refer to VALUES( Table2[Category] ). VALUES() returns a table of all distinct values in the named column or table. So, a 1x1 table can be implicitly converted to a scalar, which we need in SWITCH().
SWITCH() is a case statement.
Our else condition in the IF() is just returning [Profit]. You might want something else, but it's unclear what should happen at the grand total level. You can leave this off, and the measure will be blank in IF()'s else condition.
I was thinking about this a little. I'm not sure why you have your categories on rows. Usually the data set would have columns like: item | CostPrice | SellingPrice | Profit. Then you can just use the columns to define your fields. The model becomes easier and more maintainable.

How to have a measure lookup a value based on the row and column context?

I need to Aggregate a number of multiplications which are based on the Row and Columns context. My best attempt at describing this is in pseudo-code.
For each cell in the Pivot table
SUM
Foreach ORU
Percent= Look up the multiplier for that ORU associated with the Column
SUMofValue = Add all of the Values associated with that Column/Row combination
Multiply Percent * SUMofValue
I tried a number of ways over the last few days and looked at loads of examples but am missing something.
Specifically, What won't work is:
CALCULATE(SUM(ORUBUMGR[Percent]), ORUMAP)*CALCULATE(SUM(Charges[Value]), ORUMAP)
because you're doing a sum of all the Percentages instead of the sum of the Percentages which are only associated with MGR (i.e., the column context)
Link to XLS
One way of doing that is by using nested SUMX. Add this measure to ORUBUMGR:
ValuexPercent :=
SUMX (
ORUBUMGR,
[PERCENT]
* (
SUMX (
FILTER ( CHARGES, CHARGES[ORU] = ORUBUMGR[ORU] ),
[Value]
)
)
)
For each row in ORUBUMGR you will multiply percent by ....
the sum of value for each row in Charges where ORUBUMGR ORU is the same as Charges ORU. Then you sum that product.

intersect cassandra rows

We have cassandra column family.
each row have multiple columns. columns have name, but value is empty.
if we have 5-10 row keys, how we can find column names that appear in all of these keys.
e.g.
row1: php, programming, accounting
row2: php, bookkeeping, accounting
row3: php, accounting
must return:
result: php, accounting
note we can not easily load whole row into the memory, because it may contain 1M+ columns
solution not need to be fast.
In order to do intersection of several rows, we will need to intersect two of them first, then to intersect the result with third and so on.
Looks like in cassandra we can query the data by column names and this is relatively fast operation.
So we first get Column Slice of 10k rows. Making list of column names (in PHP Cassa - put them in array). Then select those from second row.
Code may be looking like this:
$x = $cf->get($first_key, <some column slice>);
$column_names = array();
foreach(array_keys($x) as $k)
$column_names[] = $k;
$result = $cf->get($second_key, $column_slice = null, $column_names);
// write result somewhere, and proceed with next slice
You columns names are sorted and you can create an iterator for each row (this iterator load portion of date at once, for example 10k of columns). Now put each iterator into a priority queue (by the next column name). If you take for queue the k times the iterator with the same column names, this is common names between all rows, in the other case we move to the next element and return iterators to queue.
You could use a Hadoop map/reduce job as follows:
Map output key = column name
Map output value = row key
Reducer counts row keys for each column and outputs column name & count to a CF with the following schema:
key : [column name] {
Count : [count]
}
You can then query counts from this CF in reverse order. The first record will be the max, so you can keep iterating until a value is < max. This will be your intersection.

Resources