I have 2 tables.
First table: dimensional table to show available units of cars at start of selling cycle and for how long these units will be available.
Second table: to show how many cars were sold on a given month within their "available cycle".
I'd like to compare the "selling behaviour" within each cycle. Thus, I want to display the total initial units available next to the units sold at each stage within the cycle. The second dimension works fine, but not the first one.
This is what I get:
And this is the desired output (note rows 4 and 5 for available_units):
I tried the below DAX code without success:
SumAvailableUnits:=CALCULATE(SUM([available_units]),FILTER(ALL(Table1[month_within_cycle]),[month_within_cycle]>=MAX([months_available])))
First, DAX Formatter is your friend. You may like writing unreadable single line measures, but no one else likes reading them.
I've also taken the liberty of cleaning up your table names and adding fully qualified column references. (Ignoring that your dimension isn't a pure dimension, as it holds numeric values that you aggregate in a measure)
SumAvailableUnits :=
CALCULATE (
    SUM ( DimCar[available_units] ),
    FILTER (
        ALL ( FactSale[month_within_cycle] ),
        FactSale[month_within_cycle] >= MAX ( DimCar[months_available] )
    )
)
And immediately we see a problem. With the fully qualified column references, it is clear that you're trying to filter the lookup table (the one side) by the base table (the many side). In Power Pivot for Excel, we do not have bi-directional relationships (though they're available in Power BI and coming for Excel 2016). Our relationships' filter context only flows from the lookup table to the base table, typically from the dimension to the fact.
Additionally, your DimCar, by holding [available_units] and [months_available], encodes an implicit assumption that a specific [car_id] can only ever refer to a single, unchanging lot. You will never see another row with [car_id] = 1. This strikes me as highly unlikely. Even if it is the case, the better solution is a model change.
In general, anything that goes onto a row or column label should come from a dimension, not a fact. Similarly, anything you're aggregating in a measure should live in a fact table. The usual exception is dimension counts - either bare, or as a denominator in a measure. Following these will get you 80% of the way in terms of dimensional modeling. You can see the tables and model diagram I've ended up with in the image below.
Here are the measure definitions
SumAvailableUnits:=SUM( FactAvailability[available_units] )
SumSold:=SUM( FactSale[cars_sold] )
Here are my source tables, my model diagram with relationships, and a pivot table built from these pieces and the measures above. Note the source of [month_within_cycle] in the pivot.
Finally, you might notice that my grand total behaves in a different way than in your original. Since the values are repeated monthly, we get a much larger grand total. If you need to instead end with the sum from the latest month (which it looks like you have in your sample), you can use an alternate measure, provided below. I don't understand why this would be your desired grand total, but you can achieve it fairly easily. Personally, I'd probably blank the measure at the grand total level.
SumAvailableUnits - GrandTotal:=
SUMX(
    TOPN(
        1
        ,FactAvailability
        ,FactAvailability[month_within_cycle]
        ,0
    )
    ,FactAvailability[available_units]
)
This uses SUMX() to step through the table returned by TOPN(). Here TOPN() returns the top-ranked row of FactAvailability, including ties, after sorting by [month_within_cycle] in descending order (the 0 argument), out of all rows visible in the filter context. In the context of a specific month, you get all the rows associated with that month, which is identical to the simple sum. In the grand total context, you get the rows associated with the last month.
SUMX() iterates over that table and accumulates the values of [available_units] in a sum.
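For the other option mentioned above, blanking the measure at the grand total level, a minimal sketch could look like the following. It assumes [month_within_cycle] is on the pivot rows, so that exactly one month is visible in every cell except the totals; the measure name is made up for illustration.

```dax
SumAvailableUnits - BlankTotal :=
IF (
    -- TRUE only when the filter context holds exactly one month
    HASONEVALUE ( FactAvailability[month_within_cycle] ),
    SUM ( FactAvailability[available_units] )
    -- no third argument: IF() returns BLANK() at total levels
)
```

Note that if a slicer restricts the pivot to a single month, the grand total also sees one month and will no longer be blank.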
Brief:
I have a large dataset containing individual customer orders by item and quantity. What I'm trying to do is get Excel to tell me which order numbers contain exact matches (in terms of items and quantities) to each other. Ideally, I'd like a tolerance of, say, 80% accuracy that I can flex as needed, but I'll take anything to get me off the ground.
Existing Solution:
At the moment, I've used concatenation to pair item with quantity, then built a pivot with the order references as columns, the concatenated values as rows, and quantity as data (sorted by quantity descending). I'm visually scrolling across and down to find matches and then manually stripping them out of my main data where necessary. I have about 2,500 columns to check, so I was hoping to find a more suitable solution with Excel doing the legwork on identification.
Index/match works for cross-referencing a match on the concatenation, but of course the order numbers (which are unique) are different, so it's not giving me matches ACROSS orders.
Fingers crossed!
EDIT:
Data set with outcomes
As you can see, the bottom order reference has no correlation to the orders above it so is not listed as a match to the orders above but 3 are identical and 1 has a slightly different item but MOSTLY matches.
Due to performance issues I need to remove a few distinct counts on my DAX. However, I have a particular scenario and I can't figure out how to do it.
As example, let's say one or more restaurants can be hired at one or more feasts and prepare one or more menus (see data below).
I want a PowerPivot table that shows in how many feasts each restaurant was present (see table below). I achieved this by using distinctcount.
Why not precalculate this in Power Query? The real data I have is a bit more complex (more ID columns), and in order to be able to pivot the data I would have to calculate thousands of possible combinations.
I tried adding a Feast dimension table to my model (in the example this would only be 1 column with 2 rows). I was hoping to use that relationship to make a straight count, but I haven't been able to come up with the right DAX to do so.
You could use COUNTROWS() combined with VALUES().
Specifically, COUNTROWS() gives you the number of rows in a table, which means it expects a table as input. Here's the magic part: VALUES() returns a table as its result, and that table contains the distinct values of the column you pass to VALUES() that are visible in the current filter context.
I'm not sure if I'm explaining it well, so for the sample data you provided, the measure would look like this (assuming the table is named Table1):
Unique Feasts:=COUNTROWS(VALUES('Table1'[Feast Id]))
You can then create a pivot table from Powerpivot, and drag Restaurant Id into Rows, and drag the measure above into Values. Same result as DISTINCTCOUNT, but with less performance overhead (I think).
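If a separate Feast dimension table is added to the model, as the question attempted, one possible sketch is to count the dimension rows that still have matching rows in the fact table. The table name Feast and a relationship from Table1[Feast Id] to Feast[Feast Id] are assumptions here, not something shown in the question, and whether this is faster on the real data would need to be measured.

```dax
Feasts Attended :=
CALCULATE (
    COUNTROWS ( Feast ),  -- Feast is a hypothetical dimension table
    Table1                -- the fact table as a filter: keeps only feasts
                          -- that have rows in the current context
)
```

With Restaurant Id on rows, passing Table1 as a filter argument restricts Feast to the feasts that appear in that restaurant's rows, so the row count of Feast becomes the number of feasts attended.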
I wonder if it's possible to create a table like this:
I have calculated the equity ratio for two companies, shown in the column "Calculated field 1". Now I would like to create the average and the minimum value of this column for each company, as shown in the table (red numbers).
For clarification, cells C5 to C26 in column C show the average for company 1.
Cells C28 to C49 show the average for company 2.
Any ideas how to proceed?
Average:=
AVERAGEX(
    ALL(Datasrc[Year])
    ,[Calculated field 1] // Seriously? Come up with a better name.
)

Minimum:=
MINX(
    ALL(Datasrc[Year])
    ,[Calculated field 1]
)
*X() functions take a table expression as their first argument. They step row by row through this table, evaluating the expression in the second argument for each row, and accumulate it. AVERAGEX() accumulates with an average, MINX() with a minimum.
ALL() returns unique elements of a table, column, or set of columns stripped of context. Since we are only calling ALL() on the [Year] column, that is the only column whose context we remove. The context from [Stock Ticker] remains in place.
These two measures are not meaningful at the grand total level.
Ninja edit: There's likely a better way to write the measures, but with no insight into what [Calculated field 1] is doing, I can't make suggestions toward that. *X() functions force single-threaded evaluation in the formula engine. It's not a bad thing, but you take a performance hit compared with queries that can be answered by the storage engine alone. If you could rewrite it as CALCULATE(SUM(),ALL(Datasrc[Year])) / COUNTROWS(ALL(Datasrc[Year])), you'd get faster evaluation for sure, but I have no clue what your data looks like.
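As a rough sketch of that rewrite, suppose the equity ratio were stored as a plain column with exactly one row per company per year. The column name Datasrc[Equity Ratio] and both measure names are hypothetical, and this only matches the AVERAGEX() and MINX() versions if that one-row-per-year assumption holds.

```dax
Average Alt :=
CALCULATE ( SUM ( Datasrc[Equity Ratio] ), ALL ( Datasrc[Year] ) )
    / COUNTROWS ( ALL ( Datasrc[Year] ) )  -- number of distinct years in Datasrc

Minimum Alt :=
CALCULATE ( MIN ( Datasrc[Equity Ratio] ), ALL ( Datasrc[Year] ) )
```

Both versions remove only the [Year] filter, so the [Stock Ticker] context still limits the calculation to one company.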
I am using Excel Power Pivot with data in two separate tables. Table 1 has ITEM level data with a BRAND characteristic. Table 2 has BRAND level data. The two tables are linked by the BRAND key. The measure I am using is non-additive, i.e. the sum of the ITEMS does not equal the BRAND. The pivot is set up with ITEMS nested under BRANDS in the rows and the measure in the columns.
Excel assumes that I want to summarize ITEM to a BRAND level by applying SUM, MAX, MIN, AVG, etc. I would like to return the actual values from the appropriate ITEM or BRAND level table and not apply any calculations to the values. Is this possible?
If what you are effectively trying to do is produce a different result for the Brand rows (e.g. blank()) then the answer is to write a further measure that does a logic check to determine whether or not the row in question is an ITEM or a BRAND.
= IF (HASONEVALUE(table1[Item]), [Measure], Blank() )
Bear in mind that this will work for your current pivot but may not be adaptable to all pivots.
This assumes that you have explicitly created a measure called [Measure] and you are not just dragging the numeric column into the values box. If not you can create the initial [Measure] something like this:
= Sum(table1[Value])
Where Value is the column you want to use in the measure. Although you have used a sum, if it relates to a single item which has a single row in the table it will give the desired result.
I have run into performance problems with MDX measure calculations for a summary report, using SQL Server 2008 R2.
I have a Person dimension, and a related fact table containing multiple records per person. (Qualifications)
E.g. [Measures].[Other Qual Count] would give me the number of qualifications of a certain type.
Each person could have multiple, so [Measures].[Other Qual Count] > 1 for one person.
However on my summary report I would like to indicate this as 1 per person only. (To indicate the number of persons with Other qualifications.)
The summary report rolls up the values against some other dimensions including an unknown Region hierarchy (it can be one of 3 hierarchies).
I have done this as follows:
MEMBER [Measures].[Other Count2]
AS
    SUM(
        EXISTING [Person].[Staff Code].[Staff Code].Members,
        IIF([Measures].[Other Count] > 0, 1, NULL)
    )
However, I have to create several more derived measures - deriving from each other, and all at Person level to avoid unwanted multiple counts. The query slows down from <1 second to 1min+ (my goal is <3s).
The reason for all the derivations is that a lot of logic is needed to determine which one of 6 mutually exclusive columns a person will be reported in.
I have also tried to create a Cube Calculation, but this gives me the same value as [Other Count].
SCOPE (({[Person].[Staff Code].[Staff Code].MEMBERS}, [Measures].[Has Other Qual]));
    THIS = ([Person].[Staff Code].[Staff Code], [Measures].[Has Other Qual]).Count;
END SCOPE;
Is there a better MDX/Cube calculation that can be used, or any suggestions on improving performance?
This is unfortunately my first time working with MDX, and I ran into this problem close to a deadline, so I am trying to make this work without changes to the cube if possible.
I have resolved the issue by changing the cube, which was simpler than expected.
On the Data Source View, I created a named query which summarizes the existing fact table at Person level. I also derive all the columns which I will need on my reports.
Treating this named query as a separate fact table, I added a measure group for it and that resolved all my problems.