PivotTable Average of group columns - excel

I wonder if it's possible to create a table like this:
I have calculated the equity ratio for two companies, shown in the column "Calculated field 1". Now I would like to compute the average and the minimum of this column for each company, as shown in the table (red numbers).
For clarification, cells C5 to C26 in column C show the average for company 1.
Cells C28 to C49 show the average for company 2.
Any ideas how to proceed with this?

Average:=
AVERAGEX(
    ALL(Datasrc[Year])
    ,[Calculated field 1] // Seriously? Come up with a better name.
)
Minimum:=
MINX(
    ALL(Datasrc[Year])
    ,[Calculated field 1]
)
*X() functions take a table expression as their first argument. They step row by row through this table, evaluating the expression in the second argument for each row, and accumulate it. AVERAGEX() accumulates with an average, MINX() with a minimum.
ALL() returns unique elements of a table, column, or set of columns stripped of context. Since we are only calling ALL() on the [Year] column, that is the only column whose context we remove. The context from [Stock Ticker] remains in place.
Neither measure returns a meaningful value at the grand total level.
Ninja edit: There's likely a better way to write the measures, but with no insight into what [Calculated field 1] is doing, I can't make suggestions toward that. *X() functions force single-threaded evaluation in the formula engine. It's not a bad thing, but you take a performance hit compared with queries that can be answered by the storage engine alone. If you could rewrite it as something like CALCULATE(SUM(),ALL(Datasrc[Year])) / COUNTROWS(ALL(Datasrc[Year])), you'd get faster evaluation for sure, but I have no clue what your data looks like.
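To make the effect of ALL(Datasrc[Year]) concrete, here is a minimal Python sketch of what the two measures return in each pivot cell. The rows and ratio values are made up for illustration, and the function name is hypothetical; it only mimics the idea that the year filter is removed while the company filter stays.

```python
from collections import defaultdict

# Hypothetical rows: (stock_ticker, year, equity_ratio). The values are made up.
rows = [
    ("Company 1", 2001, 0.30),
    ("Company 1", 2002, 0.40),
    ("Company 1", 2003, 0.50),
    ("Company 2", 2001, 0.10),
    ("Company 2", 2002, 0.30),
]


def measures_per_cell(rows):
    """Return {(ticker, year): (ratio, average, minimum)}.

    The average and minimum ignore the year of the cell (like ALL on [Year])
    but still respect the ticker, so they repeat on every row of a company.
    """
    by_ticker = defaultdict(list)
    for ticker, _year, ratio in rows:
        by_ticker[ticker].append(ratio)

    result = {}
    for ticker, year, ratio in rows:
        ratios = by_ticker[ticker]
        result[(ticker, year)] = (ratio, sum(ratios) / len(ratios), min(ratios))
    return result


cells = measures_per_cell(rows)
for key in sorted(cells):
    print(key, cells[key])
```

Every year of a given company shows the same average and minimum, which is the repeated column the question asks for.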

Related

Excel Matching Customer Orders by Item and Quantity

Brief:
I have a large dataset containing individual customer orders by item and quantity. What I'm trying to do is get Excel to tell me which order numbers are exact matches (in terms of items and quantities) to each other. Ideally, I'd like to have a tolerance of, say, 80% accuracy which I can adjust as needed, but I'll take anything to get me off the ground.
Existing Solution:
At the moment, I've used concatenation to pair item with quantity, then pivoted with the order references as columns, the concatenated pairs as rows, and quantity as data (sorted by quantity descending). I'm visually scrolling across and down to find matches and then manually stripping them out of my main data where necessary. I have about 2,500 columns to check, so I was hoping to find a more suitable solution with Excel doing the legwork on identification.
Index/match works for cross-referencing a match on the concatenation, but of course the order numbers (which are unique) are different, so it's not giving me matches ACROSS orders.
Fingers crossed!
EDIT:
Data set with outcomes
As you can see, the bottom order reference has no correlation to the orders above it so is not listed as a match to the orders above but 3 are identical and 1 has a slightly different item but MOSTLY matches.
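Outside Excel, the comparison described above can be sketched in a few lines of Python. This is only one possible reading of "80% accuracy": it treats each order as a set of (item, quantity) pairs, mirroring the concatenation idea, and scores two orders by shared pairs divided by all distinct pairs (Jaccard similarity). The order data, function names, and the choice of similarity score are all assumptions, not something taken from the question.

```python
from itertools import combinations

# Hypothetical order lines: (order_ref, item, quantity). Not the asker's data.
ORDER_LINES = [
    ("O1", "apple", 1), ("O1", "pear", 2), ("O1", "plum", 3), ("O1", "fig", 4),
    ("O2", "apple", 1), ("O2", "pear", 2), ("O2", "plum", 3), ("O2", "fig", 4),
    ("O3", "apple", 1), ("O3", "pear", 2), ("O3", "plum", 3), ("O3", "fig", 4),
    ("O3", "kiwi", 1),
    ("O4", "melon", 9),
]


def order_signatures(lines):
    """Map each order reference to its set of (item, quantity) pairs."""
    signatures = {}
    for order_ref, item, qty in lines:
        signatures.setdefault(order_ref, set()).add((item, qty))
    return signatures


def similar_orders(lines, threshold=1.0):
    """Return (order_a, order_b, score) for every pair scoring >= threshold."""
    signatures = order_signatures(lines)
    matches = []
    for a, b in combinations(sorted(signatures), 2):
        shared = len(signatures[a] & signatures[b])
        total = len(signatures[a] | signatures[b])
        score = shared / total
        if score >= threshold:
            matches.append((a, b, score))
    return matches


print(similar_orders(ORDER_LINES))        # exact matches only
print(similar_orders(ORDER_LINES, 0.8))   # exact and mostly matching orders
```

With about 2,500 orders this compares roughly three million pairs, which is brute force but should still be manageable at that size.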

DAX to handle cycles of different months

I have 2 tables.
First table: dimensional table to show available units of cars at start of selling cycle and for how long these units will be available.
Second table: to show how many cars were sold on a given month within their "available cycle".
I'd like to compare the "selling behaviour" within each cycle. Thus, I want to display the total initial units available next to the units sold at each stage within the cycle. The second dimension works fine, but not the first one.
This is what I get:
And this the desired output (note rows 4 and 5 for available_units)
I tried the below DAX code without success:
SumAvailableUnits:=CALCULATE(SUM([available_units]),FILTER(ALL(Table1[month_within_cycle]),[month_within_cycle]>=MAX([months_available])))
First, DAX Formatter is your friend. You may like writing unreadable single line measures, but no one else likes reading them.
I've also taken the liberty of cleaning up your table names and adding fully qualified column references. (Ignoring that your dimension isn't a pure dimension, as it holds numeric values that you aggregate in a measure)
SumAvailableUnits :=
CALCULATE (
    SUM ( DimCar[available_units] ),
    FILTER (
        ALL ( FactSale[month_within_cycle] ),
        FactSale[month_within_cycle] >= MAX ( DimCar[months_available] )
    )
)
And immediately we see a problem. With the fully qualified column references, it is clear that you're trying to filter the lookup table (the one side) by the base table (the many side). In Power Pivot for Excel, we do not have bi-directional relationships (though they're available in Power BI and coming for Excel 2016). Our relationships' filter context only flows from the lookup table to the base table, typically from the dimension to the fact.
Additionally, your DimCar, by holding [available_units] and [months_available], encodes an implicit assumption that a specific [car_id] can only ever refer to a single, unchanging lot. You will never see another row with [car_id] = 1. This strikes me as highly unlikely. Even if it is the case, the better solution is a model change.
In general, anything that goes onto a row or column label should come from a dimension, not a fact. Similarly, anything you're aggregating in a measure should live in a fact table. The usual exception is dimension counts - either bare, or as a denominator in a measure. Following these will get you 80% of the way in terms of dimensional modeling. You can see the tables and model diagram I've ended up with in the image below.
Here are the measure definitions
SumAvailableUnits:=SUM( FactAvailability[available_units] )
SumSold:=SUM( FactSale[cars_sold] )
Here are my source tables, my model diagram with relationships, and a pivot table built from these pieces and the measures above. Note the source of [month_within_cycle] in the pivot.
Finally, you might notice that my grand total behaves in a different way than in your original. Since the values are repeated monthly, we get a much larger grand total. If you need to instead end with the sum from the latest month (which it looks like you have in your sample), you can use an alternate measure, provided below. I don't understand why this would be your desired grand total, but you can achieve it fairly easily. Personally, I'd probably blank the measure at the grand total level.
SumAvailableUnits - GrandTotal:=
SUMX(
    TOPN(
        1
        ,FactAvailability
        ,FactAvailability[month_within_cycle]
        ,0
    )
    ,FactAvailability[available_units]
)
This uses SUMX() to step through the table provided, defined by TOPN(). TOPN() returns the first row (in this case) in FactAvailability, including ties, after sorting by [month_within_cycle], out of all rows available in the filter context. In the context of a specific month, you get all the rows associated with that month - identical to the simple sum. In the grand total context, you get the rows associated with the last month.
SUMX() iterates over that table and accumulates the values of [available_units] in a sum.
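The behaviour of that TOPN()/SUMX() combination can be mimicked in a short Python sketch. The rows below are invented, and the function is a hypothetical stand-in that only reproduces the "keep the rows of the latest month, ties included, then sum" idea described above.

```python
# Hypothetical FactAvailability rows: (car_id, month_within_cycle, available_units).
FACT_AVAILABILITY = [
    (1, 1, 100), (1, 2, 100), (1, 3, 100),
    (2, 1, 50), (2, 2, 50),
]


def sum_units_latest_month(rows):
    """Sum available_units over the rows that share the highest month.

    Returns None for an empty context, standing in for a blank result.
    """
    if not rows:
        return None
    latest = max(month for _car, month, _units in rows)
    return sum(units for _car, month, units in rows if month == latest)


# Context of a single month: identical to the simple sum for that month.
month_2 = [row for row in FACT_AVAILABILITY if row[1] == 2]
print(sum_units_latest_month(month_2))

# Grand total context: only the rows of the last month contribute.
print(sum_units_latest_month(FACT_AVAILABILITY))

# The plain sum, for comparison, adds up every repeated monthly value.
print(sum(units for _car, _month, units in FACT_AVAILABILITY))
```

In this made-up data only car 1 has a row in the last month, so the grand total reflects that car alone, which illustrates why blanking the grand total may be the safer choice.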

Creating "Categories" to show on a PivotTable

I have a student database, and I'm trying to show different metrics based on a student's score range in a PivotTable. Specifically (this is a simplified example, so don't worry about the content) I want to show this in my pivot:
StudentGPACat | Avg Post-Grad Salary
3-3.2 | 64,323
3.2-3.4 | 71,225
3.4-3.6 | etc
3.6-3.8 | etc
3.8-4.0 | etc
So I want the rows in my pivot table to show the range the student's average score falls in.
In order to generate that metric, right now, I did 2 things:
(1) Added a new column in my master table in PowerPivot called [avgGrade] that shows the value of the [TableAvgGrade] calculated field from the "Grades" table for each student (i.e., each row in the master table)
=CALCULATE([TableAvgGrade],
FILTER(Grades,Grades[studentID]=Master[studentID]))
(2) Created a new column [StudentGPACat] in PowerPivot and the formula goes:
=If([avgGrade]<3,"3",
If([avgGrade]<3.2,"3-3.2",
If([avgGrade]<3.4,"3.2-3.4",
If([avgGrade]<3.6,"3.4-3.6",
If([avgGrade]<3.8,"3.6-3.8","3.8-4.0")))))
This feels bulky and computationally expensive. Is there an easier way to create these ranges to use as rows in my PivotTable?
EDIT: made some edits to clarify my question
EDIT2: typo
What you've done is the appropriate pattern for creating this sort of column. If you're concerned about the gnarly nested IF()s, you can replace with a SWITCH(), which is just syntactic sugar for nested IF()s, but what you've posted is all you need.
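For comparison, the same bucketing logic is easy to check outside PowerPivot. The Python sketch below is not DAX; it reuses the thresholds and labels from the nested IF()s in the question (including the "3" label for grades below 3) and looks the bucket up with a sorted list of edges. The constant and function names are hypothetical.

```python
from bisect import bisect_right

# Upper edges and labels copied from the nested IF()s in the question.
EDGES = [3.0, 3.2, 3.4, 3.6, 3.8]
LABELS = ["3", "3-3.2", "3.2-3.4", "3.4-3.6", "3.6-3.8", "3.8-4.0"]


def gpa_category(avg_grade):
    """Return the label of the first range whose upper edge exceeds avg_grade."""
    return LABELS[bisect_right(EDGES, avg_grade)]


for grade in (2.5, 3.0, 3.19, 3.2, 3.95):
    print(grade, gpa_category(grade))
```

A grade equal to an edge falls into the higher range, just as `[avgGrade]<3.2` is false for exactly 3.2 in the original formula.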
In a regular PivotTable (I don't know about PowerPivot), if you use a numeric value as a Row Label, you can right-click the field, choose Group, set the "Starting at" value, the "Ending at" value, and the "By" step, and you will get an equivalent result quite easily.

How to get Excel Pivot 'Summarize by' to return actual data values not sum, avg, etc?

I am using Excel PowerPivot with data in two separate tables. Table 1 has ITEM level data with a BRAND characteristic. Table 2 has BRAND level data. The two tables are linked by the BRAND key. The measure I am using is non-additive, i.e. the sum of the ITEMS does not equal the BRAND. The pivot is set up with ITEMS nested under BRANDS in the rows and the measure in the column.
Excel assumes that I want to summarize ITEM to a BRAND level by applying SUM, MAX, MIN, AVG, etc. I would like to return the actual values from the appropriate ITEM or BRAND level table and not apply any calculations to the values. Is this possible?
If what you are effectively trying to do is produce a different result for the Brand rows (e.g. blank()) then the answer is to write a further measure that does a logic check to determine whether or not the row in question is an ITEM or a BRAND.
= IF (HASONEVALUE(table1[Item]), [Measure], Blank() )
Bear in mind that this will work for your current pivot but may not be adaptable to all pivots.
This assumes that you have explicitly created a measure called [Measure] and you are not just dragging the numeric column into the values box. If not you can create the initial [Measure] something like this:
= Sum(table1[Value])
Where Value is the column you want to use in the measure. Although you have used a sum, if it relates to a single item which has a single row in the table it will give the desired result.
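The logic check can be pictured with a small Python sketch. It is not DAX: the rows, names, and helper functions are hypothetical, and the "context" is simply the list of rows that a pivot cell would see.

```python
# Hypothetical table1 rows: (brand, item, value).
ROWS = [
    ("Brand A", "Item 1", 10),
    ("Brand A", "Item 2", 25),
    ("Brand B", "Item 3", 7),
]


def rows_in_context(rows, brand=None, item=None):
    """Keep the rows a pivot cell would see for the given brand and/or item."""
    return [
        row for row in rows
        if (brand is None or row[0] == brand) and (item is None or row[1] == item)
    ]


def item_only_measure(rows):
    """Return the summed value when exactly one item is visible, else None."""
    items = {item for _brand, item, _value in rows}
    if len(items) == 1:
        return sum(value for _brand, _item, value in rows)
    return None  # stands in for BLANK()


print(item_only_measure(rows_in_context(ROWS, brand="Brand A", item="Item 1")))
print(item_only_measure(rows_in_context(ROWS, brand="Brand A")))
print(item_only_measure(rows_in_context(ROWS, brand="Brand B")))
```

The last call shows one way the check may not carry over to every pivot: a brand with a single item still passes the "one item" test, so its brand row shows the item's value rather than a blank.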

Report on users who don't estimate well in Excel

I have a spreadsheet corresponding to entries of a user, their estimation, and the actual value (for example: hours for a particular project - again, this is only an example), which we can represent in CSV like:
User,Estimate,Actual
"User 1",5,5
"User 1",7,7
"User 2",3,3
"User 2",9,8
"User 3",6,7
"User 3",8,7
I'm trying to build a report on these users, to quickly see which users underestimate or overestimate, and so I created a pivot table. But, I can't figure out how to simply show if a user has underestimated at some point. I tried to create a calculated field like =IF(Estimate > Actual, 1, 0), but this sums, then compares the Estimate and Actual columns and tells me that "User 3" doesn't over/underestimate.
Without adding an additional field to my data, how can I accomplish this?
A similar SQL pseudo-query would be:
SELECT DISTINCT al.User,
(SELECT COUNT(*) FROM ActivityLog AS l2 WHERE l2.User = al.User AND l2.Estimate > l2.Actual) AS Overestimates
FROM ActivityLog AS al
Edit:
I'm still working on this, and currently have created a static list of users in some cells on the side, and have given them the Array Formulas: {=SUM(IF((A$2:A20 = F6)*(B$2:B20 > C$2:C20), 1, 0))} and {=SUM(IF((A$2:A20 = F6)*(B$2:B20 < C$2:C20), 1, 0))} (if I have the user's name in F6).
Mainly, I want to do this where the list of users can populate dynamically from the main data.
Calculated fields in pivot tables stink. I would get rid of the pivot table and do it with formulas. Start a unique list of users in H15 and enter this in I15
{=MAX(($A$2:$A$7=H15)*($B$2:$B$7-$C$2:$C$7<>0))}
array entered. This will return 1 if they ever over or under estimated and zero if they never did. The downside is that you can't "refresh" it like a pivot table so you have to make sure your unique user list is accurate all the time.
If that's too big of a downside, I think you'll need to add a column to your source data. Specifically
=ABS(B2-C2)
And add that to your pivot table. It will show zero for never over/under and non-zero otherwise.
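To see why the calculated field misses "User 3" while the row-by-row approaches catch it, here is a small Python sketch run on the sample CSV from the question. The function and key names are hypothetical; it only reproduces the three calculations discussed: comparing the summed columns, the per-row "ever off" flag, and the summed absolute difference.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question.
CSV_TEXT = '''User,Estimate,Actual
"User 1",5,5
"User 1",7,7
"User 2",3,3
"User 2",9,8
"User 3",6,7
"User 3",8,7
'''


def summarize(csv_text):
    """Return per-user results for the three approaches."""
    est_total = defaultdict(int)
    act_total = defaultdict(int)
    ever_off = defaultdict(int)
    abs_diff = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        user = row["User"]
        estimate, actual = int(row["Estimate"]), int(row["Actual"])
        est_total[user] += estimate
        act_total[user] += actual
        ever_off[user] = max(ever_off[user], int(estimate != actual))
        abs_diff[user] += abs(estimate - actual)
    return {
        user: {
            # What the calculated field does: sum first, then compare.
            "sum_then_compare": int(est_total[user] > act_total[user]),
            # What the MAX array formula does: compare each row, keep the max.
            "ever_off": ever_off[user],
            # What the helper column gives when summed in the pivot.
            "abs_diff_sum": abs_diff[user],
        }
        for user in est_total
    }


for user, result in summarize(CSV_TEXT).items():
    print(user, result)
```

For "User 3" the summed estimate and actual are both 14, so comparing the sums reports nothing, while the per-row checks flag both entries.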
You are aware that you should make sure the estimates are all in the same range? Smaller numbers can be estimated better (when talking about hours).
Add a column for actual minus estimate, then summarize those values by min, max, and average (or standard deviation).
