Dynamic Percentile Analysis Across Multiple Categories - PowerPivot / DAX - excel

I've spent a a lot of time trying to find a solution to the following issue but I haven't been able. There are similar threads to this issue both here and on other forums but they don't seem to be applicable. Please let me know any best demonstrated practices regarding posting on this forum that I may be going against.
I would like to be able to dynamically (and hopefully in as simple way as possible) create measures (ideally NOT via calculated columns) in power pivot to be able to carry out percentile analysis (e.g., value associated with top quartile, top quintile, third decile, etc etc) on different subsets of my data (in a pivot table). For example, I might want to create the percentile based on the yearly sales associated with a shop (although the records I have are based on monthly, or another time period).
Here is what this data could look like as an example, as well as what the results would be on this data (I did this jammily using excel). I know that there is a way to do this using calculated columns but I want to try and do it using measures (e.g., maybe using a combination of sumx, percentiles, top n??).
In case you're not able to view the picture of my data, my data is structured as such:
===============================================================================================
**Shop ID** ## **Value** ## **Metric**## **Period** (e.g., mm / yy) ## **Franchised or Co Owned** ## **Year** ## **Quarter**
===============================================================================================
1 50 etc etc please see screenshot! thank you
2 70
3 90
Additional explanation on data
Shop ID could have many entries
Value is the value for each metric - the record is based around having a value for each metric for each shop id for each month (or other time period)
Metric could be things like sales, ebitda, car count, etc etc
Period is typically month
Shop status could be "Co - Owned" or "Franchised"
Year and Quarter are based off the period
I want to be able to get percentile values for sales in a given period (e.g., total yearly sales for a given year, total quarterly sales etc) for whatever slicer i have going on for the current pivot table.
Super grateful for any help!
Thanks,
Louis

OK, I think I found an answer. Something like this formula might work:
PERCENTILEX.INC(ALLSELECTED(Facts[ID]),SUMX(ALLSELECTED(Facts[Period]),[Sum Values]),[Percentile Definition])

Related

excel PowerPivot Auto Calculated Measures & Columns

After looking at a few similarish questions I figured I needed something more specific so asking here. I will start by explaining the situation:
The Setup
I have a Store which sells Cakes, Cookies and Wine. I have the weekly sales data of each product sorta like this:
Product ID
Product Name
Quantity
Value
Week Ending
1
Ginderbread
2
£4
13/01/22
2
Chocolate chip
5
£25
13/01/22
3
Red Wine Bottle
1
£10
13/01/22
4
Sponge Cake
3
£9
13/01/22
Currently every week's data is stored within the same table, with me using a Week filter to show only the week i'm interested in.
Using this Data I created PivotTables that shows the sales of each category, with the ability to drill down to show the specific products. Table looks something like this:
Category
Quantity
Value
Cakes
2
£4
Cookies
7
£29
Wine
1
£10
The issue
I now want to stick in a new calculated column that shows the Value as a %. E.g The total value for the previous table was £43, so Cookies is about 67%. If I drill down, it would show the Chocolate Chip record as 80% and Gingerbread as 20%
I imagine doing this would be easier if each individual week's data was on a different table, but I got a lot of weeks and I also want to do tables showing the sales for over a period of time. Plus I don't know of a way to merge the "value" and "quantity" columns, etc instead of having 1 for each week being shown.
any advice would be appreciated
Create an extra column in the source table (prior to filtering) entitled "perc" calculated as the corresponding value for each row divdied by the total value across all rows (se pic. / eqn. for first row below) --
=E2/$E$6
No calculated fields required - just include perc as the mesaure of interest in your pivot table, with value setting as 'sum':
The reason why this worked is because of the common denominator - which allows one to sum ratios on a 1:1 basis.
Devising a calculated field using the standard 'fields, items & sets' functionality for ordinary pivot tables would not be feasible / possible as far as I am aware. You would need to move into the realm of power pivots and data models - which is not too complicated (readily accesible directly from the field list per below) - however, I see this as unnecessary complication for the task at hand.
Side notes:
Using table names in your functions is sometimes more convenient when entering, albeit may appear tricky at first when reviewing - first eqn above becomes:
=[#Value]/Table1[[#Totals],[Value]]

Pivot Table project - Avoid using many INDEX and MATCH functions that make Excel crash

I need some help with an Excel Project that's giving me headaches. I succeeded to achieve everything I wanted but the result is too heavy for Excel and it crashes all the time. I'm over-using the INDEX and MATCH functions on large tables (50 000+ lines) and Excel doesn't like it. I'm looking for a way to do the same thing in a lighter way for Excel.
Here's what I achieved : I created a report that helps me analyzing my employees's performance VS their billing targets. To create such a report, I used a Pivot Table.
That Pivot Table needs this information as its source :
Each sales that every employee made (amount in $ and date)
The hourly rate of each employee (which changes for every period, see TABLE1 below)
The billing target for each employees (which changes for every period, see TABLE1 below)
Here's my setup. I have 3 tables :
TABLE1 (See attached image) - A table where I manually input data for each of my employees (hourly rate and billing target). Their billing target and hourly rate change every period. So, each period has a different line and I indicate the first day of the period and the last day of the period.
TABLE2 (See attached image) - Table that contains sales data exported from another software I use. Each line represents an amount sold by an employee to a customer on a specific date. This table is pretty heavy and contains more than 50 000 lines. Moreover, the last 2 columns of this table use Index and Match functions to get the right hourly rate and the right billing target from TABLE1. That means that each of those 50 000 lines uses the INDEX and MATCH functions twice… This part is too heavy for Excel and I need a workaround.
Moreover, TABLE2 is getting refreshed every few days with new data coming from my other software (an ERP). So the solution I need to find must take that into account and must be permanent (I try to avoid steps that will have to be done everytime I refresh TABLE2 with new data).
TABLE3 - A Pivot Table that uses TABLE2 as its data source. I use the slicer to select the name of an employee and a timeline to specify which months I want to display. Then the Pivot Table shows my employee's statistics grouped by months. The main statistic is the amount of "billed hours" for each employee, which is in reality the amount of sales made by that employee, divided by their hourly rate on a specific date.
My thoughts :
It is absurd that TABLE2 uses that many INDEX and MATCH functions. For example, if Employee1 made 500 sales between 2020-07-01 and 2020-07-31 (the same month, thus the same period, thus the same hourly rate and billing target), there will be 500 different lines that will use INDEX and MATCH to get the same hourly rate and billing target from TABLE1. That leads to a lot of duplicated calculation and a lot of duplicated data.
Would it be possible for a Pivot Table Calculated Field to use INDEX and MATCH in its formula? And would it be lighter for Excel to do so?
Another way would be to add, at the bottom of TABLE2, 12 lines per year (1 for each month) for every employee where I would write their hourly rate and the billing target. That way, the Pivot Table would be able to display an hourly rate and a billing target for each month, for each employee. That solution would work and would be lighter for Excel, but it would create a high risk of making mistakes while manually inputting the data.
I'm open to all suggestions including VBA!
Thank you very much for your precious time!
EDIT : FORMULA
As requested, here's my INDEX AND MATCH formula that is in TABLE2 and gets the hourly rate from TABLE1 :
=INDEX(TAB_Employee_Data[[#All];[Hourly_Rate]];MATCH([#[Date (Cell)]]; IF(TAB_Employee_Data[[#All];[Name]]=[#[Employee(Cell)]];TAB_Employee_Data[[#All];[First day of the period]]);1))
TAB_Employee_Data is the tab that contains "TABLE1".
I translated the names of the fields since all my work is in French.
This formula does the following : it searches the name of an employee in TABLE1 and finds the period which fits the date of a line in TABLE2.
Also, to work properly, I need to sort the lines of TABLE1 in chronological order.
TABLE 1 :
TABLE 2:

PowerBI DAX: logic to use aggregated table as parameter in functions or another workaround to calculate dataset KPI filtered by any field?

In PowerBI, I need to create a Performance Indicator (KPI) measure which evaluates dataset values in a scale from 0 to 1, with target (1) being the MAX value in a 20 years history. It's a national airport trip records open database. The formula is basically [value]/[max value].
My dataset has a lot of fields and I wish I could filter it by any of these fields, with a line chart showing the 0-1 indicator for each month based on the filters.
This is my workaround test solution:
Table 1 - Original dataset: if I filter something here, below tables also update (there are more fields to the left, including YEAR and MONTH
Table 2 - Reference to original dataset, aggregating YEAR-MONTH by the sum of "take-offs" (decolagens)
Table 3 - Reference to above (sum) table, aggregating MONTH by the max of "take-offs" (decolagens)
Table 4 - 'Sum table' merged to 'Max table' by MONTH as new table: then do [Value]/[Max] and we've got the indicator
So if i filter the original dataset by any fields, all other tables update accordingly and the indicators always stays between 0-1, works like a charm.
TL;DR
The problem is: I need to create a dashboard of this on Power Bi. So I need this calculation to be in a measure or another workaround.
My possible solution: by pure DAX code in the measure field, to produce Tables 2 and 3 so I'll divide the month sum values by their month max value (which will both be produced according to PowerBi dashboard slicers) and get the indicator dinamically produced.
I'm stuck at: I don't understand how can I reference a sum/max aggregate table in dax code. Something like = SUM (dataset[take-offs]) / MAX (SUM (dataset[take-offs])). Of course these functions do not work like that, but I hope I made my point clear: how can I produce this four table effect with a single measure?
Other solutions are welcome.
Link to the original dataset: https://www.anac.gov.br/assuntos/dados-e-estatisticas/dados-estatisticos/arquivos/DadosEstatsticos.csv
It's an open dataset, so I guess there's no problem sharing it. Please help! :)
EDIT: please download the dataset and try to solve this. Personally I think it's a quality statistics doubt that will eventually help others. The calculation works, it only needs a Power Bi Measure port.
Add the ALL formula:
Measure = SUMX(ALL('Table'),[Valor])/SUM('Table'[Max])
Example

Weighted average price of a product per day in Pivot Table

I am having issues translating the following formula to a pivot table; either through a regular pivot table, or through DAX and powerpivot.
=SUMPRODUCT((C$2:C$11)*(D$2:D$11)*(A$2:A$11=A2)*(B$2:B$11=B2))/SUMIFS(D$2:D$11,A$2:A$11,A2,B$2:B$11,B2)
The background is, I have a number of products that appear on an e-commerce site, and I need to find out their price per day. However, these prices change daily, based on things like promo codes, visitor location etc. Therefore, I need their weighted price based on the number of visitors that saw a particular price.
Can anyone help with this translation, or alternatively, offer a better way to approach this problem?
PS- I need it in a pivot table due to the volume of data. At 250,000 rows, standard Excel cannot handle this formula.
The following is in Excel 2010 sans Powerpivot. However, the general approach should work:
Explanation:
I added a column that multiplies the Prices and Visits. The pivot table uses Dates, then Product SKU as the row labels. Then I added a calculated field that divides the Price*Visits by the Visits.

Excel PowerPivot DAX Calculated Field

I think I've got a relatively easy problem here on my hands, just having trouble getting it to work. Let me preface this by saying I'm new to DAX.
Consider the following PowerPivot Data Model :
It consists of a Sales Records table, which is joined to a Date lookup table, as well as a "Daily Expenses" table. The "Daily Expenses" table stores a single value which represents the averaged cost of running a specific trading location per day.
It's extremely easy to produce a PivotTable that provides me with the total Sales Amount per store per [insert date period]. What I'm after is a DAX formulation that will calculate the profit per store per day - ie. Total Sales minus DailyExpenses (operating cost):
In theory, this should be pretty easy, but I'm confused about how to utilise DAX row context.
Failed attempts:
Profit:=CALCULATE(SUM(SalesInformation[SaleAmount] - DailyStoreExpenses[DailyExpense]))
Profit:=CALCULATE(SUM(SalesInformation[SaleAmount] - DailyStoreExpenses[DailyExpense]), DailyStoreExpenses[StoreLocation])
Profit:=SUM(SalesInformation[SaleAmount] - RELATED(DailyStoreExpenses[DailyExpense])
etc
Any help appreciated.
Zam, unfortunately your failed attempts are not close :-)
The answer, and best practice, is use a 4th table called 'Stores' which contains a unique record per store - not only is this useful for bringing together data from your two fact tables but it can contain additional info about the stores which you can use for alternative aggregations e.g. Format, Location etc.
You should create a relationship between each of the Sales and Expenses tables and the Store table and then use measures like:
[Sales] = SUM(SalesInformation[SaleAmount])
[Expenses] = SUM(DailyStoreExpenses[DailyExpense])
[Profit] = [Sales] - [Expenses]
Provided you have the Date and Store tables correctly linked to the two 'Fact' tables (ie Sales and Expenses) then the whole thing should line up nicely.
Edit:
If you want to roll this up into weeks, years etc. and you have no kind of relationship between expenses and the calendar then you'll need to adjust your expenses measure accordingly:
[Expenses] = SUM(DailyStoreExpenses[DailyExpense]) * COUNTROWS(DateTable)
This will basically take the number of days in that particular filter context and multiply the expenses by it.

Resources