Debating: Average of the Average vs Average - statistics

I know this is an old question. You will probably conclude that Average of the Average is always wrong. Consider the following example:
You want to understand purchasing behaviour at a supermarket by looking at each category's share% of the basket. For each order, you can compute a share% per product category. The dataset could look like this:
order_id, grocery%, tobacco%, cloth%, etc. The share% is based on the order amount. Each row is a unique order_id.
If you sum all grocery amounts and divide by the total order amount, you do get the average grocery share. But with more context, say VIP customers in this supermarket account for 10% and can spend 1 million per order (just an assumption), it is quite possible that the result ends up close to the VIP result.
If I am more interested in the behaviour of the average customer, it seems I should use the average-of-the-average metric, which is this one: (grocery% + grocery% + ...)/number of orders.
Any thoughts?

So let me try to answer your question with an example.
Let us say, there were only three purchases made in the supermarket.
Purchase 1
Grocery Amount = 30$ (60%)
Cloth Amount = 20$ (40%)
Purchase 2
Grocery Amount = 10$ (50%)
Cloth Amount = 10$ (50%)
Purchase 3
Grocery Amount = 5$ (25%)
Cloth Amount = 15$ (75%)
Now let us calculate our metrics:
Approach "Average of Average"
Final Answer = (25% + 50% + 60%)/3 = 45%
Approach "Average"
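Final Answer = (5$ + 10$ + 30$)*100/90$ = 50%
Conclusion
The two approaches answer different questions. The "average" approach gives the grocery share of the total amount spent (50$ + 20$ + 20$ = 90$), which is the accurate figure for the store as a whole but is weighted toward large orders. The "average of average" approach gives every order the same weight, so it describes a typical order. Given your use case, you can use either of these.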
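To make the difference between the two metrics concrete, here is a minimal Python sketch using the three purchases listed above. The variable names are illustrative assumptions. Note that the pooled total order amount is 50$ + 20$ + 20$ = 90$.

```python
# Each purchase: (grocery amount, cloth amount) in dollars, from the example.
purchases = [(30, 20), (10, 10), (5, 15)]

# "Average of average": compute the grocery share per order, then take the mean.
per_order_share = [g / (g + c) for g, c in purchases]
average_of_averages = 100 * sum(per_order_share) / len(per_order_share)

# "Average": pool all amounts first, then compute a single share.
total_grocery = sum(g for g, _ in purchases)
total_amount = sum(g + c for g, c in purchases)
pooled_average = 100 * total_grocery / total_amount

print(f"average of averages: {average_of_averages:.2f}%")
print(f"pooled average:      {pooled_average:.2f}%")
```

The pooled figure weights each order by its amount, so a single very large order would move it much more than it would move the average of averages.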
Hope this helps!


Shopware 6 and counting of VAT

Can anyone explain how Shopware counts VAT in total price for Cart?
I have a cart with one product that costs 1.309 euro. The tax rate is 19%, so the tax value is 0.24871. This I can understand: 1.309/100*19 = 0.24871.
But then it adds the shipping cost (2 euro). It somehow arrives at 3.3074732824427 euro (I don't know how), and what is even stranger to me is that 19% of this amount comes out as 0.56871. How is this calculated? 3.307/100*19 is about 0.63, and 3.307/119*19 is about 0.53, but not 0.57.
Also, is there any way to change this algorithm programmatically? I use in my Calculator $this->taxCalculator->calculateNetTaxes() for every product, but it doesn't count total sum. And method $toCalculate->setPrice() in my CartProcessor doesn't work.
If 1.309 is a gross price, the tax is already included, so the price is in fact 119% (net price (100%) + VAT (19%)).
So VAT of product is (1.309/119)*19 = 0.209
VAT of shipping is (2/119)*19 = 0.319.... (by same logic)
and at the end 0.209+0.319 = 0.528
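The arithmetic above can be sketched in a few lines of Python. This assumes both the product price and the shipping cost are gross (tax-inclusive) amounts; the function name is hypothetical and this is not Shopware's actual implementation.

```python
def vat_from_gross(gross: float, rate_percent: float) -> float:
    """Return the VAT portion contained in a tax-inclusive (gross) price."""
    # A gross price is (100 + rate)% of the net price, so divide by that
    # and multiply by the rate to isolate the tax part.
    return gross / (100 + rate_percent) * rate_percent

product_vat = vat_from_gross(1.309, 19)
shipping_vat = vat_from_gross(2.0, 19)
total_vat = product_vat + shipping_vat

print(round(product_vat, 3), round(shipping_vat, 3), round(total_vat, 3))
```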

DAX. Problem with subtotals and grand totals

Hope you are doing well and can help solve this puzzle in DAX for PowerBI and PowerPivot.
I'm having trouble with my measure in the subtotals and grand totals. My scenario is the following:
I have 3 tables (I share a link below with a test file so you can see it and work there):
1) "Data" (where every register is a sold ticket from a bus company);
2) "Km" (where I have every possible track that the bus can do with their respective kilometer). Related to "Data";
3) and a "Calendar". Related to "Data".
In "Data" I have all the tickets sold from a period with their price, the track that the passenger bought and the departure time of that track.
Each track can have more than 1 departure time (we can call it a service) but only has one specific length in kilometers (the kilometers are specified in the "Km" table).
Basically what I need is to calculate the revenue per kilometer for each service in a period (year, month, day).
The calculation should be, basically:
Sum of [Price] (each ticket sold in the period) / Sum of [Km] (of the period, considering the services with their respective kilometers)
I managed to calculate it for the day granularity with the following logic and measures:
Revenue = SUM(Data[Price])
Unique dates = DISTINCTCOUNT(Data[Date])
Revenue/Km = DIVIDE([Revenue]; SUM(Km[Km])*[Unique dates]; 0)
I created [Unique dates] because I was trying to handle the subtotals at track granularity, taking into account that you can have more than 1 day with services within the period. For example:
For "Track 1" we have registered:
1 service on monday (lunes) at 5:00am.
Revenue = $1.140.
Km = 115.
Tickets = 6.
Revenue/Km = 1.140/115 = 9,91.
1 service on tuesday (martes) at 5:00am.
Revenue = $67.
Km = 115.
Tickets = 2.
Revenue/Km = 67/115 = 0,58.
"Subtotal Track 1" should be:
Revenue = 1.140 + 67 = 1.207.
Km = 115 + 115 = 230.
Tickets = 6 + 2 = 8.
Revenue/Km = 1.207/230 = 5,25.
So at this point one might think my formula works, but the problem shows up when there is more than 1 service per day, for example for Track 3. This also affects the grand total for March (marzo).
I understand that the problem is calculating the correct kilometers for each track in each period. If you check the column "Sum[Km]", it is also wrong.
Here is a table (excel file to download - tab "Goal") with the values that should appear: 
[goal] https://drive.google.com/file/d/1PMrc-IUnTz0354Ko6q3ZvkxEcnns1RFM/view?usp=sharing
[pbix sample file] https://drive.google.com/file/d/14NBM9a_Frib55fvL-2ybVMhxGXN5Vkf-/view?usp=sharing
Hope you can understand my problem. If you need more details please let me know.
Thank you very much in advance!!!
Andy.-
Delete "Sum of Km" - you should always write DAX measures instead.
Create a new measure for the km traveled:
Total Km =
SUMX (
    SUMMARIZE (
        Data,
        Data[Track],
        Data[Date],
        Data[Time],
        "Total_km", DISTINCT ( Data[Kilometers Column] )
    ),
    [Total_km]
)
Then, change [Revenue/Km] measure:
Revenue/Km = DIVIDE([Revenue], [Total Km])
Result:
The measure correctly calculates km on both subtotal and total levels.
The way it works:
First, we use SUMMARIZE to group records by trips (where trip is a unique combination of track, date and time). Then, we add a column to the summary that contains km for each trip. Finally, we use SUMX to iterate the summary record by record, and sum up trip distances.
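The same grouping idea can be sketched outside DAX. The Python below is only an illustration of the logic (group tickets into trips, count each trip's km once, then divide revenue by that total); the ticket rows and values are hypothetical and not taken from the sample file.

```python
# Hypothetical ticket rows: (track, date, time, km of the track, price).
tickets = [
    ("Track 1", "2019-03-04", "05:00", 115, 190.0),
    ("Track 1", "2019-03-04", "05:00", 115, 190.0),
    ("Track 1", "2019-03-05", "05:00", 115, 67.0),
    ("Track 3", "2019-03-04", "08:00", 80, 40.0),
    ("Track 3", "2019-03-04", "14:00", 80, 60.0),
]

# Group by trip (track, date, time) so each trip's km is counted once,
# no matter how many tickets were sold on it (the SUMMARIZE step).
km_per_trip = {}
for track, date, time, km, _price in tickets:
    km_per_trip[(track, date, time)] = km

# Sum the per-trip distances (the SUMX step), then divide revenue by it.
total_km = sum(km_per_trip.values())
revenue = sum(row[4] for row in tickets)
revenue_per_km = revenue / total_km if total_km else 0.0

print(total_km, revenue, round(revenue_per_km, 2))
```

Two services on the same day for Track 3 count as two trips here, which is the case the date-count approach missed.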
The solution should work, although I would recommend giving more thought to the data model design. You need to build a better star schema, or DAX will continue to be challenging. For example, I'd consider adding something like "Trip Id" to each record: it will be much easier to iterate over such ids instead of grouping records all the time. Also, more descriptive names can help keep DAX clean (names like km[km] look a bit strange :) ).

How to Calculate Loan Balance at Any Given Point In Time Without Use of a Table in Excel

I'm trying to calculate the remaining balance of a home loan at any point in time for multiple home loans.
It looks like it is not possible to find the home loan balance w/out creating one of those long tables (example). Finding the future balance for multiple home loans would require setting up a table for ea. home (in this case, 25).
With a table, when you want to look at the balance after a certain amount of payments have been made for the home loan, you would just visually scan the table for that period...
But is there any single formula which shows the remaining loan balance by just changing the "time" variable? (# of years/mths in the future)...
An example of the information I'm trying to find is "what would be the remaining balance on a home loan with the following criteria after 10 years":
original loan amt: $100K
term: 30-yr
rate: 5%
mthly pmts: $536.82
pmts per yr: 12
I'd hate to have to create 25 different amortization schedules - a lot of copy-paste-dragging...
Thanks in advance!
You're looking for =FV(), or "future value".
The function needs 5 inputs, as follows:
=FV(rate, nper, pmt, pv, type)
Where:
rate = interest rate for the period of interest. In this case, you are making payments and compounding interest monthly, so your interest rate would be 0.05/12 = 0.00417
nper = the number of periods elapsed. This is your 'time' variable, in this case, number of months elapsed.
pmt = the payment in each period. in your case $536.82.
pv = the 'present value', in this case the principal of the loan at the start, or -100,000. Note that for a debt example, you can use a negative value here.
type = Whether payments are made at the beginning (1) or end (0) of the period.
In your example, to calculate the remaining principal after 10 years, you could use:
=FV(0.05/12,10*12,536.82,-100000,0)
Which produces:
=81,342.32
For a loan this size, you would have $81,342.32 left to pay off after 10 years.
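For readers who want to see what the FV call computes, here is a minimal Python sketch of the closed-form balance for end-of-period payments (type = 0). The function name and parameters are illustrative assumptions, and the sketch assumes a non-zero interest rate.

```python
def remaining_balance(principal: float, annual_rate: float,
                      payment: float, months_elapsed: int) -> float:
    """Loan balance after `months_elapsed` end-of-period monthly payments."""
    r = annual_rate / 12  # monthly rate; assumed non-zero
    growth = (1 + r) ** months_elapsed
    # Principal grown with interest, minus the accumulated value of payments.
    return principal * growth - payment * (growth - 1) / r

balance = remaining_balance(100_000, 0.05, 536.82, 10 * 12)
print(f"{balance:,.2f}")
```

Because the time variable is just `months_elapsed`, the same function can be evaluated for each of the 25 loans without building an amortization table.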
I don't like to post an answer when a brilliant one already exists, but I want to add some perspective: understanding why the formula works and why you should use FV, as P.J correctly states!
They use PV in the example, and you can always double-check Present Value (PV) against Future Value (FV). Why?
Because they are linked to each other.
FV is the compounded value of PV.
PV is the value of FV discounted at the interest rate.
Which can be illustrated in this graph, source link:
In the example below, I replicated the way the example calculates PV (Column E follows the excel-easy Loan Amortization Schedule example), and in Column F we use Excel's built-in PV function. You want to go the other way, hence FV in Column J.
Since they are linked, they need to give the same cash flows over time (a bit more tricky if the period/interest rate is not constant over time)!
And they indeed do:
Payment number is the number of periods you want to look at (10 year * 12 payments per year = 120, yellow cells).
PV function is composed by:
rate: discount rate per period
nper: total number of periods left (total periods - current period), i.e. (12*30-120)
pmt: the fixed amount paid every month
FV: the value of the loan in the future, at the end of all 360 periods (30 years * 12 payments per year). The future value of a loan at the end is always 0.
Type: when payments occur within the period, usually at the end.
PV: 0.05/12, (12*30)-120, 536.82 ,0 , 0 = 81 342.06
=
FV: 0.05/12, 120, 536.82 , 100 000.00 , 0 = -81 342.06
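The PV/FV link described above can be checked numerically. This is a minimal sketch, assuming end-of-period payments and the unrounded monthly payment (about 536.82); all variable names are illustrative. With the payment rounded to 536.82 the two views differ slightly, which is only a rounding effect.

```python
r = 0.05 / 12          # monthly rate
n_total = 30 * 12      # total number of payments
n_paid = 10 * 12       # payments already made
principal = 100_000.0

# Unrounded level payment that amortizes the loan over n_total periods.
payment = principal * r / (1 - (1 + r) ** -n_total)

# Forward view (FV): grow the principal, subtract the accumulated payments.
growth = (1 + r) ** n_paid
balance_fv = principal * growth - payment * (growth - 1) / r

# Backward view (PV): discount the remaining payments to the current period.
n_left = n_total - n_paid
balance_pv = payment * (1 - (1 + r) ** -n_left) / r

print(f"{balance_fv:,.2f} {balance_pv:,.2f}")
```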

Excel Formula, To Calculate a maximum Weight based off a desired minimum profit (GP%)

So I am working on a spreadsheet for a Butchery I manage and have run into a problem.
First off back story: We do $20 packs for certain bulk products that have a min/max weight range.
The Goal is to be able to put in this spreadsheet the desired minimum GP% and from that get a maximum weight based off that minimum profit margin.
For example a Beef Steak that Costs $17.50 p/kilo Would be minimum of 680g (at a GP% of 30.30%) and a maximum weight of 790g (at a GP% of 20.50%)
I have been 'googling' all day and banging my head on my desk (as well as experimenting with different formulas). I am starting to think I may have to resort to programming a macro to perform this, but I would prefer to achieve it in a formula on the cell so that I can copy-paste it easily down the spreadsheet.
If anyone has a solution or can put me on the right track, that would be awesome.
I think the formula you are looking for is :
your selling price (=20$) / your marked-up price per kilo
where your marked-up price per kilo is :
your cost per kilo / (1 - your margin)
So for 20% expected GP it gives :
= 20 / (17.5 / (1-0.2))
= 20 / 21.875
= 0.914... kilos
Balance is then :
Revenue = 20$
Cost = 0.914 * 17.5 = 16
Margin = 4
Margin % = 20
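The same calculation as a small Python sketch, with the margin expressed as a fraction of the selling price. The function name and parameters are illustrative assumptions; the rearrangement is shown in the comment.

```python
def max_weight_kg(pack_price: float, cost_per_kg: float, min_gp: float) -> float:
    """Heaviest pack (kg) that still earns at least `min_gp` gross profit.

    min_gp is a fraction of the selling price, e.g. 0.20 for 20%.
    """
    # GP = (price - weight * cost) / price  =>  weight = price * (1 - GP) / cost
    return pack_price * (1 - min_gp) / cost_per_kg

weight = max_weight_kg(20.0, 17.5, 0.20)
cost = weight * 17.5
margin = 20.0 - cost

print(round(weight, 3), round(cost, 2), round(margin, 2))
```

Since the whole calculation is a single expression, it translates directly into one cell formula that can be copied down the sheet.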

DAX measure to iterate each row for correct division (for total as well)

I am not sure if there's a way in DAX to create a measure that would help me with the following:
Calculate the efficiency by day
Display the total efficiency in a pivot table / PowerBI matrix as the overall total and not as sum of the daily efficiency
Here's a simple example:
Where:
Total Categories = Category1 + Category2 + Category3
Efficiency = (Total Categories + Category4*0.33)/Category4
At first I created measures for each category (e.g. TotalCateg1 = SUM([Category1]) etc.), hoping to get the right result in the end. My problem is that I am not able to get both the daily efficiency and the total right. Is there a way around it?
For Total Categories use this formula:
=SUM([Category 1])+SUM([Category 2])+SUM([Category 3])
then for the Efficiency use this formula:
=([Total Categories]+SUM([Category 4])*0.33)/SUM([Category 4])
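The reason these measures give the right total is that the sums are taken first and the division happens once, at whatever level is being displayed. A minimal Python sketch of that idea follows; the daily rows are hypothetical, since the original example table is not reproduced here.

```python
# Hypothetical daily rows: (category1, category2, category3, category4).
days = [
    (10, 5, 5, 40),
    (20, 10, 0, 100),
]

def efficiency(total_categories: float, category4: float) -> float:
    """(Total Categories + Category4 * 0.33) / Category4, as defined above."""
    return (total_categories + category4 * 0.33) / category4

# Daily efficiency: apply the formula to each day's values.
daily = [efficiency(c1 + c2 + c3, c4) for c1, c2, c3, c4 in days]

# Overall efficiency: sum the inputs first, then apply the formula once.
total_categories = sum(row[0] + row[1] + row[2] for row in days)
total_category4 = sum(row[3] for row in days)
overall = efficiency(total_categories, total_category4)

print([round(d, 3) for d in daily], round(overall, 3), round(sum(daily), 3))
```

The overall value is not the sum of the daily values, which is why the total row needs the formula re-evaluated on the summed inputs.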
