Working with an OLAP Cube in Excel; determine monthly values within a larger date range?

I've been working with an OLAP cube as a PivotTable in Excel for a while now, but more recently I've been trying to integrate some calculated measures to streamline things, and I've hit a wall.
The cube has date filters set up that specify the month & year (they get more specific, but finer levels aren't used).
The rows will list individual projects, and the values can reflect 2 measures: one that reflects the average score for the date range, and another that reflects the number of observations the average is based on.
I would like to create a calculated measure that will display the average score for each project as long as 2 criteria are met:
There is a minimum of 100 observations for the whole date range.
There are no months with zero observations across the date range specified in the PivotTable.
I should also clarify that the date ranges I use will vary in length and will not always end with the most recent month, but they will always be in increments of whole months.
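To make the two criteria concrete, here is a minimal sketch in Python rather than MDX. It only illustrates the intended logic; the function name, the list of per-month counts, and the use of None for an empty cell are assumptions made for the example, not anything taken from the cube.

```python
def qualifying_average(monthly_obs, average, min_total=100):
    """Return the average only if both criteria hold, else the "^^" placeholder.

    monthly_obs: hypothetical list of observation counts, one entry per month
                 in the selected date range (None stands for an empty cell).
    """
    counts = [n or 0 for n in monthly_obs]            # treat empty cells as zero
    enough_observations = sum(counts) >= min_total    # criterion 1: at least 100 overall
    no_empty_months = all(n > 0 for n in counts)      # criterion 2: no month with zero
    return average if enough_observations and no_empty_months else "^^"


print(qualifying_average([40, 35, 50], 4.2))    # both criteria met
print(qualifying_average([90, None, 60], 4.2))  # one month has no observations
print(qualifying_average([10, 20, 30], 4.2))    # fewer than 100 observations
```

The second criterion has to be evaluated per month inside the selected range, which is why a single total for the range is not enough.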
I'm part of the way there, as a calculated measure based on this will provide me with the average only if there are enough observations for the date range:
IIF([Measures].[OBSERVATIONS]>=100,[Measures].[AVERAGE],"^^")
Now, I need to add the criteria of no months with zero observations.
I have attempted to use COUNT(), but in this form it ignores the date range set in the PivotTable and returns a count of all the months that have any value for the project, including zeros.
COUNT(([Calendar].[Calendar].[Month],[Measures].[OBSERVATIONS]),EXCLUDEEMPTY)
I tried determining the lowest number of observations in a month using this expression, but again it ignores the date range, and does not reflect empty cells:
MIN([Calendar].[Calendar].[Month],[Measures].[OBSERVATIONS])
I think CurrentMember is what I need to include, but I can't get it to work for me.
In case it's relevant: I'm not sure of the best way to explain the calendar hierarchy, so this reflects what I have:
The first calendar listing is the one used to filter the pivot table data, and it is also pulled into the mdx expressions above.
EDIT:
Thanks @SouravA for the reply. I tried a few things, and given that formatting in comments is limited, here's a rundown of what I did.
I'm getting an error message that says "Query(1,35) Parser: The syntax for 'WITH' is incorrect".
To make sure I'm using this correctly:
After making the changes below, I pasted the whole thing into the MDX: window of the 'calculated measures' tool in Excel.
I of course changed OBSERVATIONS and AVERAGE to the names used in my cube.
I changed '[Project].[ProjectCode]' to '[Project].[ProjectName]', which is how my cube is set up.
I changed '[NewMeasure]' to the name I am using for the calculated measure.
For '[your cube]' I've tried a couple of different things. The cube reference I use in the cube formulas in Excel looks like this: 'Cubename NormativeCube', so I tried pasting that in the brackets with and without quotes, leaving the NormativeCube part off, and doing all of that without the brackets.
I also modified the last line after 'WHERE' to reflect date ranges like this:
'[Calendar].[Calendar].[Month].&[1]&[2015]:[Calendar].[Calendar].[Month].&[12]&[2015]'
I've also set that as '[Calendar].[Month].[Month].&[1]:[Calendar].[Month].[Month].&[12]'
Also, one more question; will this work for any date range, or is it intended to be specified in the MDX? I need it to function based on the date range set by the cube filters.
EDIT 2:
I just needed to tweak this slightly: I changed the '> 0' to '= 0', because the original solution only showed the average for projects that did not have data for every month in the date range, and I adjusted one of the calendar set expressions.
IIF
(
    [Measures].[OBSERVATIONS]>=100
    AND
    COUNT
    (
        FILTER
        (
            EXISTS
            (
                [Calendar].[Calendar].[Month].MEMBERS
                ,EXISTING [Calendar].[Calendar].[Month].MEMBERS
            )
            ,[Measures].[OBSERVATIONS] = 0
        )
    ) = 0
    ,[Measures].[AVERAGE]
    ,"^^"
)
EDIT 3:
Found a limitation: the measure only works if the date range defined for the cube is a single whole member of the calendar hierarchy, i.e. a single month, a single quarter, or a single year. Selecting 2 months, 2 quarters, 2 years, or a range that breaks across 2 quarters or years returns the False outcome from the IIF() expression.
I played around with a few different ways to set it up, but can't get it to work.
EDIT 4:
Re: calendar hierarchy
Looking underneath [calendar].[calendar] there are 4 options: Year, Quarter, Month, & Date Key.
Looking at the members under Year, Quarter, & Month, you can drill down all the way to the individual day.
The member properties under those 3 list just the next level up the hierarchy.
On Date Key, the member properties are as follows:
Month Name
Month of Year
Time Calcs (this doesn't do much to my knowledge)
Week of Year Week
EDIT 5:
So this is what worked (finally). I must have messed something up at some point, and editing the original formula caused the secondary issue I was having. Here's what worked for me.
IIF
(
    [Measures].[OBSERVATIONS]>=100
    AND
    COUNT
    (
        FILTER
        (
            EXISTS
            (
                [Calendar].[Calendar].[Month].Members
                ,EXISTING [Calendar].[Calendar].Members
            )
            ,[Measures].[OBSERVATIONS] = 0
        )
    ) > 0
    ,[Measures].[AVERAGE]
    ,"^^"
)

The EXISTING keyword comes in handy when you want your calculation to recognize the current selection (context). The following code should be self-explanatory. Let me know if it works.
WITH SET ZeroObservationMonths AS
    FILTER
    (
        EXISTS
        (
            [Calendar].[Calendar].[Month].MEMBERS
            ,EXISTING [Calendar].[Calendar].[Date].MEMBERS
        )
        ,[Measures].[OBSERVATIONS] = 0
    )
MEMBER Measures.NewMeasure AS
    IIF
    (
        [Measures].[OBSERVATIONS]>=100 AND COUNT(ZeroObservationMonths) > 0
        ,[Measures].[AVERAGE]
        ,"^^"
    )
SELECT [Project].[ProjectCode].MEMBERS ON 1,
Measures.[NewMeasure] ON 0
FROM [YourCube]
WHERE ({[Calendar].[Calendar].[Date].&D1: [Calendar].[Calendar].[Date].&D2})
EDIT:
If you're planning on creating the measure inside Excel, just put the MDX code below in the text box for "New calculated measure":
IIF
(
    [Measures].[OBSERVATIONS]>=100
    AND
    COUNT
    (
        FILTER
        (
            EXISTS
            (
                [Calendar].[Calendar].[Month].MEMBERS
                ,EXISTING [Calendar].[Calendar].[Date].MEMBERS
            )
            ,[Measures].[OBSERVATIONS] = 0
        )
    ) > 0
    ,[Measures].[AVERAGE]
    ,"^^"
)
EDIT 2: If the filtering can happen on any attribute, not just dates, replace EXISTING [Calendar].[Calendar].[Date].MEMBERS with EXISTING [Calendar].[Calendar].MEMBERS in the script above.

Related

DAX for rolling seven day average to pivot chart by year

I'm trying to create a single pivot chart that will show separate years of data on the same date axis for a rolling 7-day average.
So, the x-axis will be text, 01-Jan to 31-Dec, and each year will be a separate series:
It has to be a text x-axis, as 01-Jan will be a category containing data for 01-Jan-2018, 01-Jan-2019, 01-Jan-2020...
In theory, the pivot table setup would have the column (series) as the Year, and the x-Axis (labels) as the date label (column Date).
The values are then from the DAX expression that creates the rolling 7-day average.
The source data (tblSource) has a single column of dates (Date2) that rolls over across years and has the column Year to break it down in the pivot.
The daily value is the one that is averaged (itself and the previous six days).
The 7-day average I normally use in DAX doesn't work here.
I need to have the Date column actually in the pivot rather than Date2, as the axis needs to be text to allow for multiple date years on the same x-axis point, but I can't get a DAX formula to work.
The other consideration is that the formula can't just consider a single year, as the rolling seven day average for 01-Jan-2018 includes the previous six days of 2017, for example.
This is the formula I usually use, but I can't manage to tweak it!
AVERAGEX (
DATESINPERIOD ( tblSource[Date2], LASTDATE ( tblSource[Date2] ), -7, DAY ),
[Sum of Daily Value]
)
But this is the output I get, and nothing has been averaged. I think it's because Date2 is being pivoted off Date, but I'm not sure how I get around that?
Can anybody offer me any help?
It's quite a frustrating problem as it would be trivial for me to do it using code, or doing it manually, but I really am trying to get better at DAX!
Thanks in advance!
Phil.
Update: Thanks to Joao for this.
=VAR d = MAX(tblSource[Date2])
RETURN CALCULATE(AVERAGE(tblSource[Daily Value]),
ALL(tblSource[Date]),
DATESINPERIOD(tblSource[Date2], d, -7, day),tblSource[Year]>0)
I had to use MAX rather than SELECTEDVALUE as Excel seems to lack that function.
I also had to unfilter the Year, so that the rolling average could be calculated from the previous year's dates where necessary.
Thanks.
When you run that measure, your table is being filtered by the date text at each point, so when you pass/create the DATESINPERIOD filter, it creates a table with the last 7 dates, but only one of them is actually available (the one relevant to your current data point).
You need to clear the filters on the table so that all the dates are available in order for your average to work. You can achieve this by changing the measure slightly:
VAR d = SELECTEDVALUE(tblSource[Date2])
RETURN CALCULATE(AVERAGE(tblSource[Daily Value]),
ALL(tblSource[Date]),
DATESINPERIOD(tblSource[Date2], d, -7, day))
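The effect of the filter can be shown outside DAX with a small Python sketch. It is only an illustration of the reasoning above, not a translation of the measure: the data, the dictionary layout, and the function name are made up for the example.

```python
from datetime import date, timedelta


def rolling_average(daily, day, window=7):
    """Average the values for `day` and the (window - 1) days before it.

    daily: hypothetical dict mapping each date to its daily value.
    Dates missing from the dict are skipped, as filtered-out rows would be.
    """
    days = [day - timedelta(days=i) for i in range(window)]
    values = [daily[d] for d in days if d in daily]
    return sum(values) / len(values) if values else None


# Made-up data: 1.0 for the last week of 2017, 8.0 from 1 Jan 2018 onwards.
daily = {
    date(2017, 12, 25) + timedelta(days=i): (1.0 if i < 7 else 8.0)
    for i in range(14)
}
target = date(2018, 1, 1)

# With only the rows for the current axis label (01-Jan) visible,
# the "average" is just the single day's value.
only_current_label = {d: v for d, v in daily.items() if (d.month, d.day) == (1, 1)}
print(rolling_average(only_current_label, target))  # 8.0

# With the label filter cleared, the six days from 2017 are included.
print(rolling_average(daily, target))               # 2.0
```

The second call is the behaviour the measure is after: the window reaches back across the year boundary because nothing restricts it to the current label or year.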

Calculate average based on a value column (count) in a pivot table

I'm looking for a way to add an extra column to a pivot table that averages the monthly counts (the "Count of records" column) within the selected time period (currently 2016 - one month, 2017 - full year, 2018 - 5 months). Every month would show the same number, based on the year average, and it needs to change dynamically when a different period is selected: a full year or, for example, 4 months. I need the column within the pivot table so that it can be used for a future pivot chart.
I can't simply use average as all my records appear only once and I use Count to aggregate those numbers ("Count of records" column).
My current data looks like this:
The final result should look like this:
I assume that it can somehow be done with the help of the "calculated field" option, but I couldn't make it work so far.
Greatly appreciate any help!
Using the DataModel (built in to Excel 2013 and later) you can write really cool formulas inside PivotTables called Measures that can do this kind of thing. Take the example below:
As you can see, the Cust Count & Average field gives a count of transactions by month but also gives the average of those monthly readings for the subtotal lines (i.e. the 2017 Total and 2018 Total lines) using the below DAX formula:
=AVERAGEX(SUMMARIZE(Table1,[Customer (Month)],"x",COUNTA(Table1[Customer])),[x])
That just says "Summarize this table by count of the customer field by month, call the resulting summarization field 'x', and then give me the average of that field x".
Because DAX measures are executed within the context of the PivotTable, you get the count that you want for months, and you get the average that you want for the yearly subtotals.
Hard to explain, but demonstrates that DAX can certainly do this for you.
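If it helps, the same "count per month, then average those counts" idea can be sketched in plain Python. This is not DAX and not how the PivotTable evaluates the measure; the record layout of (year, month) tuples and the function name are assumptions for illustration only.

```python
from collections import Counter


def average_monthly_count(record_months):
    """Count records per month, then average those monthly counts.

    record_months: hypothetical list with one (year, month) tuple per record.
    Months without any records do not appear and so do not lower the average.
    """
    monthly_counts = Counter(record_months)
    return sum(monthly_counts.values()) / len(monthly_counts)


# Made-up data: 3 records in January, 5 in February, 4 in March.
records_2018 = [(2018, 1)] * 3 + [(2018, 2)] * 5 + [(2018, 3)] * 4
print(average_monthly_count(records_2018))      # 4.0

# A single month simply returns its own count.
print(average_monthly_count([(2016, 12)] * 6))  # 6.0
```

Passing only the records of the selected period plays the role of the PivotTable context: a different selection gives a different average.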
See my answer at the following link for an example of how to add data to the DataModel and how to subsequently write measures:
Using the Excel SMALL function with filtering criteria AND ignoring zeros
I also recommend grabbing yourself a book by Matt Allington called "Supercharge Excel When You Learn to Write DAX", and perhaps even taking his awesome online course, because it covers this kind of thing very well and will save you significant head-scratching compared to going it alone.

Cognos - Showing every month on x-axis when some months don't have values

Let me first say I am very new to Cognos and have mainly learned by just manipulating items within active reports. I am having an issue with creating a graph that acts like a time series. I want it to display every month (with multiple values in some months and none in others). I want to visually see gaps between data points (ex: we order products every 3 months starting in January, so we should see gaps in the months we do not order products - like February and March).
I have tried changing the label control to manual and setting display frequency to 1. However, I think my issue is that there is not any data within certain months.
You are correct in that your problem is lack of data. A standard inner join will drop rows where there is not a corresponding row in both tables, resulting in gaps.
There are two solutions available:
Use a union to create "dummy" records for each date
Manually specify an outer join between the date table and the table containing the rest of information
Since the first technique is the most common, I'll outline the basic steps for it here.
Create a new query
Add your month data item to the query
Create a 'dummy' data item for your measure. Use 0 for its expression.
If there is a date range filter in the main query apply it here
Create a union
Drag over your new query into the union
Drag over your original query into the union
Pull in the date and measure data items into the union query
Set the Aggregate Function property of the measure to Total
Use the union query as the source for your chart
For every month with measure data you will have two rows, one with the measure amount and one with 0. The two rows will be combined by the auto-group and summarize function. The measures will be added together. Anything added to 0 will end up as the original amount.
For months with no measure data, there will only be the 'dummy' row with 0 for the measure and it will be represented in your chart.
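The union technique can be sketched in a few lines of Python to show why the zeros are harmless. This is only a model of the idea, not Cognos query syntax; the month labels, the row layout, and the function name are invented for the example.

```python
def union_with_dummy_rows(rows, months):
    """Union the real rows with one zero-valued row per month, then total by month.

    rows:   hypothetical list of (month, amount) tuples from the original query.
    months: every month that should appear on the axis (same date filter as `rows`).
    """
    union = [(month, 0) for month in months] + list(rows)    # dummy query + original query
    totals = {}
    for month, amount in union:                              # group by month, aggregate with Total
        totals[month] = totals.get(month, 0) + amount
    return totals


orders = [("Jan", 5), ("Jan", 2), ("Apr", 7)]
print(union_with_dummy_rows(orders, ["Jan", "Feb", "Mar", "Apr"]))
# {'Jan': 7, 'Feb': 0, 'Mar': 0, 'Apr': 7}
```

Months with real data keep their totals, and months without data still appear with 0, which is what lets the chart show the gaps.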

Spotfire: Select data from column based on criteria

I have a data table in Spotfire which contains two columns I'm interested in: Time (31/01/2015 for example), and Value (integer).
I want the most recent date (e.g. December 2015) to be set as the current time. Then I want to select Value based on the previous 1 month, 3 months, 6 months, etc. So if I want all the values for the past 6 months, it should return Sum(Value) for December 2015, November 2015, October 2015, September 2015, August 2015 and July 2015.
So far I've only been able to accomplish this by manually performing the task in Excel before I insert it into Spotfire so is there any way to create a calculated column for each of the periods I want? (Past month, 3 months etc.)
There are likely a number of ways to solve this, but I'm going to give one suggestion and we'll see how it fits your specific case.
You can add a calculated column for each timespan you are interested in, defined like this:
Sum(if (DateAdd('month', 3, [Time]) >= Max([Time]), [Value], null))
This example would get you a column with all the values that have occurred in the past 3 months; replace the number 3 to change the timespan. A full sum of the calculated column would get you the total for that timespan.
Might be nicer to use a boolean column instead of duplicating the value column. Then your calculated columns would be defined as:
DateAdd('month', 3, [Time]) >= Max([Time])
When calculating totals you would then use an if statement using the calculated column, like this:
Sum(if([3Months],[Value],null))
where [3Months] is a boolean column.
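For anyone who wants to check the logic outside Spotfire, here is a rough Python sketch of the same test (shift each date forward by N months and compare it with the latest date). The add_months helper is hypothetical, and clamping the day to the end of the month is an assumption; Spotfire's DateAdd may treat month ends differently.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def sum_recent(rows, months):
    """Sum the values whose date, shifted forward by `months`, reaches the latest date."""
    latest = max(t for t, _ in rows)
    return sum(v for t, v in rows if add_months(t, months) >= latest)


# Made-up month-end data in the style of the question.
rows = [
    (date(2015, 8, 31), 50),
    (date(2015, 9, 30), 40),
    (date(2015, 10, 31), 30),
    (date(2015, 11, 30), 20),
    (date(2015, 12, 31), 10),
]
print(sum_recent(rows, 1))  # 10
print(sum_recent(rows, 3))  # 60
print(sum_recent(rows, 6))  # 150
```

With month-end dates, small differences in day numbers decide whether a month falls inside the window, so it is worth checking the boundary months against your own data.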

Medians and slicers in DAX

I am having an issue that I am hoping some more experienced DAX programmers may be able to help me with. I have been trying to develop a dashboard in Excel 2013 / PowerPivot / PowerView and one of the graphics I would like to display is a line chart of median performance by hour of day. I would then like to filter the data set with my performance metrics based on a separate column, and link that to a slicer. The medians should be calculated relative to the filtered data set. For the median calculation I am trying to adapt the formula proposed by Marco Russo here (http://sqlblog.com/blogs/marco_russo/archive/2010/07/20/median-calculation-in-dax.aspx).
To illustrate the problem, suppose that I have two tables - main_table and other_table. Main_table has 4 columns- RowID, hour_of_day, performance_metric, and category. Other_table has two columns- hour_of_day and median_column. My goal is to find a formula for median_column such that it shows the median performance metric by hour of day, but can still be sliced by category. The formula I tried to use for the medians was
=CALCULATE(
    MINX(
        FILTER(
            VALUES(main_table[performance_metric]),
            CALCULATE(
                COUNTA(main_table[performance_metric]),
                main_table[performance_metric] <= EARLIER(main_table[performance_metric]))
            > COUNTA(main_table[performance_metric])/2),
        main_table[performance_metric]),
    FILTER(
        main_table,
        main_table[hour_of_day] = EARLIER(other_table[hour_of_day])))
Or without formatting:
=CALCULATE(MINX(FILTER(VALUES(main_table[performance_metric]), CALCULATE(COUNTA(main_table[performance_metric]), main_table[performance_metric] <= EARLIER(main_table[performance_metric])) > COUNTA(main_table[performance_metric])/2), main_table[performance_metric]), FILTER(main_table, main_table[hour_of_day] = EARLIER(other_table[hour_of_day])))
However, when I create a slicer based on category in main_table, my chart does not seem affected by the slicer. My understanding was that by putting main_table as opposed to ALL(main_table) as the first argument in the last FILTER call, my median calculations would be subject to slices and filters applied to main_table. Am I missing something obvious here?
Calculated columns are computed when the data is loaded, before any query is executed, so anything that needs to be affected by slicers must be entered as a measure, not a calculated column.
Answered in more detail here (http://www.mrexcel.com/forum/powerpivot-questions/741071-medians-context-issues-dax.html#post3641780)
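The difference can be pictured with a small Python sketch: a value computed once up front cannot react to a later selection, while a value computed at query time can. This is only an analogy for calculated columns versus measures, with invented data and names; it is not DAX.

```python
from statistics import median

# Made-up rows standing in for main_table.
rows = [
    {"hour": 9, "metric": 10, "category": "A"},
    {"hour": 9, "metric": 20, "category": "A"},
    {"hour": 9, "metric": 30, "category": "B"},
    {"hour": 9, "metric": 50, "category": "B"},
    {"hour": 10, "metric": 5, "category": "A"},
    {"hour": 10, "metric": 7, "category": "B"},
    {"hour": 10, "metric": 9, "category": "B"},
]


def median_by_hour(rows, category=None):
    """Median metric per hour, optionally restricted to one category (the 'slicer')."""
    by_hour = {}
    for row in rows:
        if category is None or row["category"] == category:
            by_hour.setdefault(row["hour"], []).append(row["metric"])
    return {hour: median(values) for hour, values in by_hour.items()}


# Like a calculated column: computed once, before any slicer is applied.
precomputed = median_by_hour(rows)
print(precomputed)                        # {9: 25.0, 10: 7}

# Like a measure: recomputed under the current slicer selection.
print(median_by_hour(rows, category="A"))  # {9: 15.0, 10: 5}
```

Choosing category "A" afterwards leaves `precomputed` unchanged, which mirrors the chart ignoring the slicer in the question.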
