I have data on objects that belong to different categories. I want to compare the average across a selection of objects with the averages across the categories to which the selected objects belong. I have written measures, but they do not produce the expected results.
My data looks like this. I am using Power Pivot to set up the data model for MS Excel pivot charts.
Table 1 has unique stores (store names are guaranteed to be unique):
| Store | Branch | Region |
|---|---|---|
| Store A | North 1 | Plain |
| Store B | North 1 | Plain |
| Store K | West 3 | Plain |
| Store F | West 3 | Plain |
| Store T | East 1 | Coast |
| Store P | East 1 | Coast |
Table 2
| Store | Area, sq ft |
|---|---|
| Store A | 3000 |
| Store B | 4000 |
| Store K | 2000 |
| Store F | 5000 |
| Store T | 5000 |
| Store P | 4000 |
Table 3
| Store | Year | Month | Expenses |
|---|---|---|---|
| Store A | 2022 | September | 10000 |
| Store A | 2022 | October | 15000 |
| Store B | 2022 | September | 20000 |
| Store B | 2022 | October | 22000 |
There is more than one year included in the dataset.
Tables 2 and 3 are connected to Table 1.
First, I write measures that I expect to compute costs per sq ft for selected objects.
Costs:=sum('Table 3'[Expenses])
Area:= sum('Table 2'[Area, sq ft])
Costs_per_sq_ft:= [Costs] / [Area]
Then, I write identical measures for averages:
Costs_avg:=average('Table 3'[Expenses])
Area_avg:= average('Table 2'[Area, sq ft])
Costs_per_sq_ft_avg:= [Costs_avg] / [Area_avg]
Finally, I define measures to average across the category of the selection (assuming all selected elements belong to the same category):
Costs_avg_branch:=var StoreBranch = max('Table 1'[Branch]) return calculate([Costs_avg], filter(all('Table 1'), 'Table 1'[Branch] = StoreBranch))
Area_avg_branch:=var StoreBranch = max('Table 1'[Branch]) return calculate([Area_avg], filter(all('Table 1'), 'Table 1'[Branch] = StoreBranch))
Costs_per_sq_ft_avg_branch:=var StoreBranch = max('Table 1'[Branch]) return calculate([Costs_per_sq_ft_avg], filter(all('Table 1'), 'Table 1'[Branch] = StoreBranch))
and identical measures with Region as the variable.
On selection of Store A and September 2022, I expected to have:
| | Costs_avg | Costs_avg_branch |
|---|---|---|
| Store A | 3.33 | 4.28 |
i.e. the average for the selected store and the average for the branch where it belongs.
On selection of Store A and September-October, I expected to have:
| | Costs_avg | Costs_avg_branch |
|---|---|---|
| Store A | 4.17 | 4.78 |
(the average over the chosen period for the selected store and the same average for the branch).
On selection of the entire branch, I intended the average across the selection to match that of the category. E.g., for stores A and B in September-October 2022:
| | Costs_avg | Costs_avg_branch |
|---|---|---|
| North 1 | 4.78 | 4.78 |
Unfortunately, the averages for individual selected objects seem to be consistently near zero. When I select entire branches, the object average and the branch average do not match.
Is there any way to obtain the correct averages? Is it possible to get the desired behavior when objects from different categories are selected, as I originally wanted?
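For reference, the expected figures can be reproduced outside Power Pivot. The sketch below is plain Python, not DAX. It assumes the figures are computed as "average expense row divided by average store area", which is how Costs_per_sq_ft_avg is defined above. The function names are made up for illustration. Rounding gives 4.29 and 4.79, so the 4.28 and 4.78 quoted above appear to be truncated.

```python
from statistics import mean

# Sample rows copied from Tables 1-3 (only the stores that have expense rows).
branch = {"Store A": "North 1", "Store B": "North 1"}
area = {"Store A": 3000, "Store B": 4000}
expenses = [
    ("Store A", 2022, "September", 10000),
    ("Store A", 2022, "October", 15000),
    ("Store B", 2022, "September", 20000),
    ("Store B", 2022, "October", 22000),
]


def cost_per_sq_ft_avg(stores, months):
    """Average expense row divided by average store area for the selection."""
    costs = [e for s, _, m, e in expenses if s in stores and m in months]
    areas = [area[s] for s in stores]
    return mean(costs) / mean(areas)


def branch_of(stores):
    """All stores that share a branch with the selected stores."""
    target = {branch[s] for s in stores}
    return [s for s, b in branch.items() if b in target]


sep = ["September"]
sep_oct = ["September", "October"]

store_sep = cost_per_sq_ft_avg(["Store A"], sep)
branch_sep = cost_per_sq_ft_avg(branch_of(["Store A"]), sep)
store_sep_oct = cost_per_sq_ft_avg(["Store A"], sep_oct)
branch_sep_oct = cost_per_sq_ft_avg(branch_of(["Store A"]), sep_oct)

print(round(store_sep, 2), round(branch_sep, 2))          # 3.33 4.29
print(round(store_sep_oct, 2), round(branch_sep_oct, 2))  # 4.17 4.79
```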
Related
I have a table like this:
Year Num Freq. Exam Grade Course
2014 102846 SM SM Astronomy 3
2015 102846 12,6 1,7 NC Astronomy 2
2017 102846 20 11,8 17 Astronomy 2
2015 102846 SM NC Defence Against the Dark Arts 4
2015 102846 11 4,5 NC Herbology 2
2015 102846 15 13,99 14 Herbology 2
I am trying to get the percentage of approved students (Grade >= 10) for each course by year and global average.
I've been trying for nearly 3 hours to do a calculated field but so far the only thing I could get was the sum of each student per year:
I have tried to do a calculated field with = Grade >= 10 hoping that it would give me a list of approved students but it gives me 1.
What am I doing wrong in here? It's my first time working with pivot tables.
I would strongly recommend not mixing text with numbers in the same column. It is a bad idea and will cause a lot of headaches when the data is used for calculations (this applies to both Freq. and Grade). I would rather use 0 or some other numeric value to represent the text.
Not recommended, but yes it's doable =)
You need a dummy variable that marks which rows hold a number and which hold text, so I created Grade Type. We can now count only the rows that have a number in the Grade column by filtering on Grade Type = Number.
I create a table of the data and add the column Grade Type. I use this formula to get Grade Type:
=IF(ISNUMBER([@Grade]),"Number","Text")
I then create the following measures:
Nr of Approved Students
=COUNTX(FILTER(Table1, Table1[Grade Type]="Number"),
IF((VALUE(Table1[Grade])>=10),VALUE(Table1[Grade]),BLANK()))
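FILTER first restricts the table to the rows whose grade is a number (the <table> argument of COUNTX). The IF then returns the grade only when it is >= 10 and BLANK() otherwise (the <expression> argument of COUNTX). COUNTX skips blanks, so only approved students are counted. VALUE() converts a number stored as text to a numeric value.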
Nr of Student (w/ Grade Number)
=COUNTX(FILTER(Table1, Table1[Grade Type]="Number"), VALUE(Table1[Grade]))
Count all rows that have a number
Approved (% of Total)
=[Nr of Approved Students]/[Count of Grade]
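To make the logic of the three measures concrete, here is a minimal sketch in plain Python (not DAX). It assumes that text grades such as SM and NC are excluded from both numerator and denominator, and that numbers may use a decimal comma as in the sample table. The function name `approval_rate` and the second sample list are made up for illustration.

```python
def approval_rate(grades, pass_mark=10.0):
    """Share of numeric grades that are >= pass_mark; text grades are ignored."""
    numeric = []
    for g in grades:
        try:
            # Accept decimal commas such as "13,99".
            numeric.append(float(str(g).replace(",", ".")))
        except ValueError:
            continue  # text grades like "SM" or "NC" are skipped
    if not numeric:
        return None  # no numeric grade at all, so the rate is undefined
    approved = sum(1 for g in numeric if g >= pass_mark)
    return approved / len(numeric)


# Grade column of the sample table in the question.
question_grades = ["SM", "NC", "17", "NC", "NC", "14"]
rate_question = approval_rate(question_grades)

# Hypothetical grades, only to show a rate below 100%.
rate_hypothetical = approval_rate(["12", "8,5", "NC", "10"])

print(rate_question, rate_hypothetical)
```

Grouping by course and year before calling the function would give the per-course, per-year percentages that the pivot table produces.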
Setup the PowerPivot Table
Create the PowerPivot and add the data to the data Model
Then create a new measure by clicking your pivot table and then "Measures" -> "New Measure..."
Fill in all the relevant data.
Result should be something like:
I'm trying to create a top-3 ranking from a data table for various metrics, but each time I get the wrong order from CUBERANKEDMEMBER, usually with ranks 2 and 3 swapped.
The data I'm mostly focused on is regarding sales revenue. Power pivot sums all sales by store, quite straight forward.
From this I use a cubeset formula that captures store name, filtered by a month and year, which the user types in as any day for the month, and set the measure which to sort by (NTS) (code 1).
The cuberankedmember selects the cubeset and defines the position (code 2).
Then the cubevalue selects as members the cuberankedmember, filters once again month and year, then pulls in the measure (code 3).
E4 is the date
Code1 (cell C21):
=CUBESET("ThisWorkbookDataModel";
"NONEMPTY([Store_Dict].[Nome_DSR].children,
([Calendar].[Year].[All].["&YEAR($E$4)&"],
[Calendar].[Month Number].[All].["&MONTH($E$4)&"]))";
"Ranking";
2;
"[Measures].[NTS]")
Code2 (cell D22):
=CUBERANKEDMEMBER("ThisWorkbookDataModel";$C$21;1;"a")
C21 is the CUBESET formula
Code3:
CUBEVALUE("ThisWorkbookDataModel";
$D22;
"[Calendar].[Month Number].["&MONTH($E$4)&"]";
"[Calendar].[Year].["&YEAR($E$4)&"]";
"[Measures].[NTS]")
Actual Result:
Ranking Store NTS
1 a 606
2 c 425
3 b 428
Expected result:
Ranking Store NTS
1 a 606
2 b 428
3 c 425
My data set contains house prices for 4 different house types (A, B, C, D) in 4 different countries (USA, Germany, UK, Sweden). The house price movement can take only three values (Upward, Downward, and Not Changed). I want to calculate the Diffusion index (DI) for each house type (A, B, C, D) in each country (USA, Germany, UK, Sweden) based on house price.
The formula that I want to use to calculate Difition index (DI) is:
DI = (Total Number of Upward * 1 + Total Number of Downward * 0 + Total Number of Not Changed * 0.5) / (Total Number of Upward + Total Number of Downward + Total Number of Not Changed)
Here is my data:
and the expected result is:
I really need your help.
Thanks.
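You can do this with groupby, assuming your file is named test.xlsx:
import pandas as pd
df = pd.read_excel('test.xlsx')
# Map each label to its weight in the DI formula; the keys must match the labels in your data.
df = df.replace({'Upward': 1, 'Downward': 0, 'Notchanged': 0.5})
df.groupby('Country').mean().reset_index()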
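The mean of the mapped weights equals the DI formula from the question, because each Upward counts as 1, each Downward as 0 and each Not Changed as 0.5. The sketch below shows this in plain Python without pandas. The long layout (country, house type, label) and the sample rows are assumptions, since the original data is not shown.

```python
from collections import defaultdict

# Weights taken from the DI formula in the question.
WEIGHTS = {"Upward": 1.0, "Downward": 0.0, "Not Changed": 0.5}


def diffusion_index(labels):
    """DI = (up * 1 + down * 0 + unchanged * 0.5) / total number of labels."""
    return sum(WEIGHTS[label] for label in labels) / len(labels)


def diffusion_by_group(rows):
    """Compute DI per (country, house type) from (country, type, label) rows."""
    groups = defaultdict(list)
    for country, house_type, label in rows:
        groups[(country, house_type)].append(label)
    return {key: diffusion_index(labels) for key, labels in groups.items()}


# Hypothetical rows, for illustration only.
rows = [
    ("USA", "A", "Upward"),
    ("USA", "A", "Downward"),
    ("USA", "A", "Not Changed"),
    ("Germany", "B", "Upward"),
    ("Germany", "B", "Upward"),
]
result = diffusion_by_group(rows)
print(result)
```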
By reducing I mean some form of caching where you can replace 100 rows with 1 row (accumulating counters, etc.).
I want to be able to answer queries such as how many people are from each country (%EACH_COUNTRY), i.e. return an array of pairs / a map of (Country, COUNT). I have a huge number of people (think 50 * 10^8), so I can't allocate one row per person. I'd like to cache the results somehow to keep PeopleTable under 10^6 entries at least, and then merge the results with a fast read from CacheTable. By caching I mean: count the number of people with country=%SPECIFIC_COUNTRY and write (%SPECIFIC_COUNTRY, COUNT(*)) to CacheTable (to be precise, increment the count for %SPECIFIC_COUNTRY and then remove these rows from PeopleTable):
personId, country
1132312312, Russia
2344333333, the USA
1344111112, France
1133555555, Russia
1132666666, Russia
3334124124, Russia
....
and then
CacheTable
country, count
Russia, 4
France, 1
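The reduce-then-merge idea described above can be sketched in memory as follows. This is only an illustration of the bookkeeping, not a storage design. The function names are made up, the two tables are modelled as a Python list and a `Counter`, and the extra row appended after compaction is hypothetical.

```python
from collections import Counter


def compact(people_rows, cache):
    """Fold all buffered rows into the cache counters, then drop the rows."""
    cache.update(country for _, country in people_rows)
    people_rows.clear()


def counts_by_country(people_rows, cache):
    """Merge cached counters with rows that have not been compacted yet."""
    total = Counter(cache)
    total.update(country for _, country in people_rows)
    return dict(total)


# Rows from the PeopleTable example in the question.
people = [
    (1132312312, "Russia"),
    (2344333333, "the USA"),
    (1344111112, "France"),
    (1133555555, "Russia"),
    (1132666666, "Russia"),
    (3334124124, "Russia"),
]
cache = Counter()

compact(people, cache)                 # PeopleTable is now empty
people.append((5550000001, "France"))  # hypothetical row arriving later
totals = counts_by_country(people, cache)
print(totals)
```

In a real database the compaction step would have to run atomically (counter increment and row deletion in one transaction) so that a row is never counted twice or lost.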
I'm trying to create a data model in which salespeople sell a variety of different products. The problem is the tier structure for each product. Some products earn different points according to the sales amount: some may have two or three tiers of points depending on the sales amount, while other products may just have a flat payout. In the end, the salesperson gets his final bonus as a percentage of his points, depending on the tier that his number of points falls into. For example:
Product 1
if volume 100 = 10 points
if volume 200 = 20 points
if volume 300+ = 30 points
employee payout
100 points = 20% of points payout
200 points = 50% of points payout
300 points = 150% of points payout.
I'm not sure how to structure this in the data model and calculate with DAX formula
Thanks for the help in advance
Create new calculated column
Lets Say,
Now you will have
Volume calculated column
IF(Volume >= 200, 20, IF(Volume >= 100, 10))
Person 1 Product 1 100
Person 2 Product 2 200
Person X Product X 300
Then add one more calculated column based on this calculated column to get percentage of volume.
Mark answer as correct if it helps.
Try the following approach:
Data structure
Products:
Sales:
Data model
Load both tables into the Data Model (I called them Products and Sales)
In the diagram view, create a relationship between Sales[Product] and Products[Product]
DAX
This is the ugly part: in the Sales table, add a new calculated column named Points. Use this DAX formula:
=IF(Sales[Volume]<RELATED(Products[Volume Tier 1]),0,
IF(Sales[Volume]<RELATED(Products[Volume Tier 2]),RELATED(Products[Points Tier 1]),
IF(Sales[Volume]<RELATED(Products[Volume Tier 3]),RELATED(Products[Points Tier 2]),
IF(Sales[Volume]<RELATED(Products[Volume Tier 4]),RELATED(Products[Points Tier 3]),
IF(Sales[Volume]<RELATED(Products[Volume Tier 5]),RELATED(Products[Points Tier 4]),
IF(Sales[Volume]>=RELATED(Products[Volume Tier 5]),RELATED(Products[Points Tier 5])))))))
Add a new measure with this formula: TotalPoints:=SUM(Sales[Points])
Now you can determine the number of points per transaction/sales person/etc. and use this in the subsequent steps.
Instead of using really high Volume Tiers for the tiers a product does not need, you could also leave the non-relevant tiers blank in the Products table and extend your formula using the ISBLANK function.
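The nested IF above is a threshold lookup: find the highest volume tier that the sale reaches and return the points of that tier. The sketch below shows the same logic in plain Python (not DAX) using `bisect`. The tiers for "Product 1" come from the question, while "Product 2" with its flat 15 points is a hypothetical example. Storing only the tiers a product actually has avoids the unused-tier problem.

```python
from bisect import bisect_right

# product -> (ascending volume thresholds, points awarded at each threshold)
TIERS = {
    "Product 1": ([100, 200, 300], [10, 20, 30]),  # tiers from the question
    "Product 2": ([0], [15]),                      # hypothetical flat payout
}


def points_for(product, volume):
    """Return the points of the highest tier whose threshold is <= volume."""
    thresholds, points = TIERS[product]
    idx = bisect_right(thresholds, volume)  # number of thresholds reached
    return 0 if idx == 0 else points[idx - 1]


# Hypothetical sales rows: (person, product, volume).
sales = [
    ("Person 1", "Product 1", 100),
    ("Person 1", "Product 1", 250),
    ("Person 2", "Product 2", 40),
]
total_points = {}
for person, product, volume in sales:
    total_points[person] = total_points.get(person, 0) + points_for(product, volume)
print(total_points)
```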
I don't know about DAX but this will handle the Excel formulae.
Assuming volume in column A, to calculate points in column B:
$B2 = MIN(10*INT($A2/100),30)
Then I'm assuming you are going to aggregate points somewhere else (let's say in column D) and calculate payout in column E. My preferred way of doing this is to create a small lookup table somewhere. It looks like this:
Points Payout Rate
0 0
100 0.2
200 0.5
300 1.5
Give the lookup table a name, e.g. PayoutRates. The formula to look up the payout rate, and calculate the payout is:
=$D2 * VLOOKUP($D2,PayoutRates,2,TRUE)
Alternatively, you can use nested IF statements to get the same result:
=$D2 * IF($D2<100,0,IF($D2<200,0.2,IF($D2<300,0.5,1.5)))
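For comparison, the approximate-match VLOOKUP above (last argument TRUE) picks the last row whose Points value is less than or equal to the lookup value. A minimal sketch of the same lookup in plain Python, using the PayoutRates table from this answer; the function name `payout` is made up for illustration.

```python
from bisect import bisect_right

# (minimum points, payout rate), sorted ascending like the PayoutRates table.
PAYOUT_RATES = [(0, 0.0), (100, 0.2), (200, 0.5), (300, 1.5)]


def payout(points):
    """Multiply points by the rate of the last tier whose minimum is <= points."""
    minimums = [minimum for minimum, _ in PAYOUT_RATES]
    idx = bisect_right(minimums, points) - 1
    if idx < 0:
        raise ValueError("points is below the lowest tier")
    return points * PAYOUT_RATES[idx][1]


for pts in (50, 100, 250, 300):
    print(pts, payout(pts))
```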