DAX / PowerPivot query functions to spread aggregated values over time period - excel-formula

I’m trying to work out the DAX expression [for MS PowerPivot in Excel 2010] to evenly distribute the sum of a value across the range it’s applied to, and re-sum that up for a given time span/period. It’s trivial to cross-apply in SQL Server, but with every attempt in DAX I end up with the same wrong result.
I’ve got the output from MS Project saved as Excel and imported/transformed using PowerQuery, so the start and finish/end dates are proper dates, the {X}h and {Y}d are integers, and the calendar day duration between them are already calculated/formatted for the model. I also have a dates table created that has the contiguous dates from the first date through the last, and a years table that has the string representation of the 4 digit years I want to summarize by.
The model looks like so:
I have created calculated columns on the ResourceQuery, TaskQuery and AssignmentQuery tables (all directly taken from the MS Project output), and on the ServiceAreaQuery (unique values from TaskQuery … essentially subprojects). Each also has a simple measure that is the sum of the Assigned hours column.
The data itself looks like you’d expect from a Project 2010 file, and has a {start_date}, {finish_date} and hours. The dates for a task can span anywhere from 1 day to 5 years … and this is where my problem lies.
How do I split/chunk the pre-summed value for long-running tasks to match the time interval I’m looking for?
Even if I use the year column from the date table, the time intelligence doesn’t catch it & I’m running out of ideas for a CALCULATE(SUM(FILTER(COUNTROWS(DATESBETWEEN)))) type of thing.
There are two intermediate steps I've tried to figure out to no avail. I’d imagine they are both solved by the same effective function to get to the end goal of hours, by service area, by resource, by year.
Pivot table to show
Hours by resource, by year
Hours by service area, by year
in order to show the end goal of
Hours, by service area, by resource, by year
You can see the issue in the output below.
Note that when using the total of assigned hours, and the resource name from AssignmentQuery, I get the right sums, but when using any date value … I only get the hours against the start date (the active relationship in the model). What I need is for those hours to be evenly spread across the period that they’re applicable to (so if something has 1,000 hours between 1/1/16 and 1/1/19 I’d expect 333 hours/year to show).
My initial thought is that the selector/filter/calculate function needs to do the following:
Select the hours for the person
Select the days in the period filtered to (e.g. month/year/quarter/whatever) from either a filter or as a column header
Calculate the hours per day
Get the working days in the filtered period
select the sum of the hours from the overlap
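The steps listed above can be modelled outside DAX to check the expected numbers. The following is a minimal Python sketch, not a DAX solution: the function name `hours_by_year` is hypothetical, and it assumes calendar days (not working days) with both the start and finish dates counted.

```python
from datetime import date

def hours_by_year(start, finish, hours):
    """Spread `hours` evenly over every calendar day from start to finish
    (both inclusive) and total the share that falls in each year."""
    total_days = (finish - start).days + 1
    per_day = hours / total_days
    result = {}
    for year in range(start.year, finish.year + 1):
        # Clip the assignment to this year's boundaries to get the overlap.
        first = max(start, date(year, 1, 1))
        last = min(finish, date(year, 12, 31))
        overlap_days = (last - first).days + 1
        result[year] = per_day * overlap_days
    return result

# The example from the question: 1,000 hours between 1/1/16 and 1/1/19.
shares = hours_by_year(date(2016, 1, 1), date(2019, 1, 1), 1000)
for year, value in shares.items():
    print(year, round(value, 1))
```

Because the finish date of 1/1/19 is counted as a day, a small share lands in 2019 and each full year gets slightly less or more than a flat third; the yearly shares still add back up to the original 1,000 hours.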
Any ideas are greatly appreciated! I’m open to doing some additional ETL/data creation as a PowerQuery step, but would really like to figure out the right DAX expression for this so it can be something available as a time-slicer/filter on the project.
Thanks in advance.
** Edit to post the revised version of the answer provided **
[Hours Apportioned Raw] :=
DIVIDE (
    CALCULATE (
        [Hours],
        FILTER (
            AssignmentQuery,
            AssignmentQuery[Start_Date] <= MAX ( Dates[Date] )
                && AssignmentQuery[Finish_Date] >= MAX ( Dates[Date] )
        )
    ),
    COUNTROWS (
        DATESBETWEEN (
            Dates[Date],
            FIRSTDATE ( AssignmentQuery[Start_Date] ),
            LASTDATE ( AssignmentQuery[Finish_Date] )
        )
    )
)

Given that you have a relatively complex model in place and your requirement is not totally straightforward, I'm not sure that this will get you all the way there but hopefully it will at least either give you the inspiration to modify it for your purposes or start a more detailed discussion.
The measures below effectively sum the hours, apply them to dates where the dates are between the start and end and divides the total by the number of days. The slight complexity is this needs to be iterated x2 - once over dates and once over rows in the table containing the hours.
An issue for you might be that I'm using an unconnected date table and if you can't replicate this situation in your model then we will need to try using some ALL() functions instead.
Solution below assumes a table called 'data' that has 4 columns: id, start, end, value and table called calendar that has 2 columns Date and Month.
Measure 1: Sum the hours
[Hours] = SUM ( Data[Value] )
Measure 2: Apply the hours to the dates and divide by number of dates
[Hours Apportioned Raw] =
CALCULATE (
    [Hours],
    FILTER (
        Data,
        Data[Start] <= MAX ( Calendar[Date] )
            && Data[End] >= MAX ( Calendar[Date] )
    )
)
    / ( MAX ( Data[End] ) - MAX ( Data[Start] ) )
Measure 3: Iterate Measure 2 over dates and ids to give correct values
=
SUMX (
    VALUES ( Calendar[Date] ),
    SUMX ( VALUES ( Data[ID] ), [Hours Apportioned Raw] )
)
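To make the double iteration concrete, here is a small Python model of the three measures. It is only a sketch: the rows in `DATA` are made-up sample values, and the `inclusive_days` flag is a hypothetical switch added for comparison. As I read Measure 2, the filter keeps both the start and the end date while the divisor is end minus start, so the apportioned total comes out one day's worth too high; counting both endpoints (which is what counting the rows of DATESBETWEEN would do) brings the total back to the original value.

```python
from datetime import date, timedelta

# Hypothetical rows for the answer's 'data' table: (id, start, end, value).
DATA = [
    (1, date(2016, 1, 1), date(2016, 1, 10), 100.0),
    (2, date(2016, 1, 5), date(2016, 1, 24), 40.0),
]

def apportioned(calendar_dates, inclusive_days):
    """Mimic SUMX over dates, then SUMX over ids, of hours / day count."""
    total = 0.0
    for day in calendar_dates:                  # outer iteration: dates
        for _id, start, end, value in DATA:     # inner iteration: ids
            if start <= day <= end:             # the FILTER condition
                days = (end - start).days       # divisor used in Measure 2
                if inclusive_days:
                    days += 1                   # count both endpoints
                total += value / days
    return total

january = [date(2016, 1, 1) + timedelta(days=i) for i in range(31)]
first_week = january[:7]

print(apportioned(january, inclusive_days=False))    # end - start as divisor
print(apportioned(january, inclusive_days=True))     # inclusive day count
print(apportioned(first_week, inclusive_days=True))  # a partial period
```

With the inclusive day count, a period covering both assignments returns their combined value, and a partial period returns only the share of days that fall inside it.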
Hope this makes some sense, very simple test model here: Test Model
Note you will need to download the model not just view it in the browser.

Related

DAX - referencing row header in a formula

I need your help in creating a DAX measure that would help me resolve the following problem:
I have a table that shows case number, creation date and closure date:
case_number creation_date closure_date
CA001 4/15/2020 4/19/2020
CA002 4/17/2020 4/20/2020
CA003 4/19/2020 4/21/2020
CA004 4/19/2020 4/20/2020
I have a pivot with various measures where row headers are consecutive days from a related Calendar table. One of the columns I need is the number of active cases on a given day (the number of cases that were created on or before the context date AND closed after the context date). The expected outcome would be like this:
date open_cases
4/13/2020 0
4/14/2020 0
4/15/2020 1
4/16/2020 1
4/17/2020 2
4/18/2020 2
4/19/2020 3
4/20/2020 1
4/21/2020 0
In regular Excel, the formula that calculates the same would be =COUNTIFS(table_cases[creation_date],"<="&[#date],table_cases[closure_date],">"&[#date]), but in my case this will be part of a bigger pivot table/dashboard, and I need to write a measure in DAX that would calculate this.
I would appreciate your help in this matter.
Best regards,
Michal
There are a variety of ways to read the current context. You can use MAX or VALUES or SELECTEDVALUE (if available). These don't all behave exactly the same, so use whichever is appropriate for your situation.
open_cases =
VAR RowDate = MAX ( Calendar[date] )
RETURN
    CALCULATE (
        COUNT ( table_cases[case_number] ),
        table_cases[creation_date] <= RowDate,
        table_cases[closure_date] > RowDate
    )
Here, MAX is evaluated within the filter context, so it should return the value in the current row (or the maximal date for subtotals and grand totals).
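The counting rule in this measure can be checked against the question's sample data with a short Python sketch. This is a model of the logic rather than DAX; the function name `open_cases` simply mirrors the measure name, and the rows are the four cases from the question.

```python
from datetime import date, timedelta

# The four cases from the question: (case_number, creation_date, closure_date).
CASES = [
    ("CA001", date(2020, 4, 15), date(2020, 4, 19)),
    ("CA002", date(2020, 4, 17), date(2020, 4, 20)),
    ("CA003", date(2020, 4, 19), date(2020, 4, 21)),
    ("CA004", date(2020, 4, 19), date(2020, 4, 20)),
]

def open_cases(row_date):
    """Count cases created on or before row_date and closed after it."""
    return sum(
        1 for _number, created, closed in CASES
        if created <= row_date < closed
    )

# One row per day, like the pivot's row headers.
for offset in range(9):
    day = date(2020, 4, 13) + timedelta(days=offset)
    print(day.isoformat(), open_cases(day))
```

The per-day counts from this sketch line up with the expected outcome table in the question.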

Moving average excluding weekends and holidays

I have a table within PowerPivot that tracks a count of customers through our sales pipeline, by sales location, from first interaction to charged sale. So far, I’ve created a moving 5-day average that averages each task. Below is the DAX formula I’ve created thus far and an example table.
=
CALCULATE (
    SUM ( [Daily Count] ),
    DATESINPERIOD ( Table1[Date], LASTDATE ( Table1[Date] ), -7, DAY ),
    ALLEXCEPT ( Table1, Table1[Sales Location], Table1[Group] )
)
    / 5
Where I’m struggling is being able to come up with a way to exclude weekends and company observed holidays. Additionally, if a holiday falls on a weekday I would like to remove that from the average and go back an additional day (to smooth the trend).
For example, on 11/26/18 (the Monday after Thanksgiving and Black Friday) I would like to average the five business days previous (11/26/18, 11/21-11/19, and 11/16). In the example above, the moving total and average for the previous 5 days should be Intake = 41 (total) 8.2 (average), Appointment = 30 (total) 6 (average), and Sale = 13 (total) and 2.6 (average).
Based on the formula currently, each of these numbers is inaccurate. Is there an easy way to exclude these days?
Side note: I’ve created an ancillary table with all holidays that is related to the sales data that I have.
Thank you for the help!
For this, I'd recommend using a calendar table related to Table1 on the Date column that also has a column IsWorkday with 1 if that day is a workday and 0 otherwise.
Once you have that set up, you can write a measure like this:
Moving Avg =
VAR Last5Workdays =
    SELECTCOLUMNS (
        TOPN (
            5,
            FILTER (
                DateTable,
                DateTable[Date] <= EARLIER ( Table1[Date] )
                    && DateTable[IsWorkday] = 1
            ),
            DateTable[Date], DESC
        ),
        "Workday", DateTable[Date]
    )
RETURN
    CALCULATE (
        SUM ( Table1[Daily Count] ),
        Table1[Date] IN Last5Workdays,
        ALLEXCEPT ( Table1, Table1[Sales Location], Table1[Group] )
    )
        / 5
The TOPN function here returns the top 5 rows of the DateTable where each row must be a workday that is less than or equal to the date in your current Table1 row (the EARLIER function refers to the earlier row context that defines the current row).
I then use SELECTCOLUMNS to turn this table into a list by selecting a single column (which I've named Workday). From there, it's basically your measure with the date filter changed a bit.
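The window selection described above (take the five most recent workdays on or before the current date) can be sketched in Python to check it against the Thanksgiving example from the question. This is an illustration only: `HOLIDAYS`, `is_workday`, `last_n_workdays` and `moving_average` are hypothetical names, and the holiday list is assumed to contain just Thanksgiving and Black Friday 2018.

```python
from datetime import date, timedelta

# Assumed holiday list for illustration (Thanksgiving and Black Friday 2018).
HOLIDAYS = {date(2018, 11, 22), date(2018, 11, 23)}

def is_workday(day):
    """Plays the role of the IsWorkday column: a weekday that is not a holiday."""
    return day.weekday() < 5 and day not in HOLIDAYS

def last_n_workdays(as_of, n=5):
    """Walk backwards from as_of, keeping the n most recent workdays."""
    days = []
    current = as_of
    while len(days) < n:
        if is_workday(current):
            days.append(current)
        current -= timedelta(days=1)
    return days

def moving_average(daily_counts, as_of, n=5):
    """Average the counts over the last n workdays; missing days count as 0."""
    window = last_n_workdays(as_of, n)
    return sum(daily_counts.get(day, 0) for day in window) / n

window = last_n_workdays(date(2018, 11, 26))
print([d.isoformat() for d in window])
```

For 11/26/18 the window skips the weekend and both holidays, giving 11/26, 11/21, 11/20, 11/19 and 11/16, which is the set of days the question asks for.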
@alexisolson Thank you for the response here. I was actually able to figure this out over the weekend but forgot to close out the thread (sorry about that! Appreciate your help either way). But I did something fairly similar to what you mentioned above.
I created a date table (CorpCalendar) that was only inclusive of working days. Then I created an index column within the CorpCalendar table to give each row a unique number in ascending order. From there, I linked the CorpCalendar table to my SalesData table by related dates and used the LOOKUPVALUE function to bring the index value over from the CorpCalendar table to the SalesData table. In a separate column I subtracted 4 from the date index value to get an index adjustment column (for a range of five days from the actual date index and the adjustment...if that makes sense). I then added an additional LOOKUPVALUE helper column to match the adjusted date index column to the appropriate working day. Lastly, I used the following function to get the 5 day rolling average.
=CALCULATE(sum(Combined[Daily Count]),DATESBETWEEN(Combined[Date - Adjusted],Combined[Date - Adjusted (-5)],Combined[Date - Adjusted]),ALLEXCEPT(Combined,Combined[Group]))/5
This is probably more convoluted than necessary, however, it got me to the answer I was looking for. Let me know if this makes sense and if you have any suggestions for future scenarios like this.
Thanks again!

SamePeriodLastYear by Day

I am looking to calculate the sales for the SAMEPERIODLASTYEAR. I have a table which looks at the sales for the last few years and using the SAMEPERIODLASTYEAR function I can draw back to last January as a whole, but what I am wanting is to pull it to the date exactly and not to the end of the current month last year.
My formula below is pulling through for example all of January last year.
Sales Last Year:=CALCULATE([Sum of Sales],SAMEPERIODLASTYEAR(Dates[Date]))
Is there a way of doing this using just the SAMEPERIODLASTYEAR function rather than individual daily calculations?
I do not claim that this is the best (or only) way to do this, but how about:
Sales Last Year =
CALCULATE (
    [Sum of Sales],
    FILTER (
        SAMEPERIODLASTYEAR ( 'Dates'[Date] ),
        'Dates'[Date]
            < DATE ( YEAR ( TODAY () ) - 1, MONTH ( TODAY () ), DAY ( TODAY () ) )
    )
)
Essentially, I've added a FILTER() statement around the SAMEPERIODLASTYEAR clause, and filtered the results to be before the current date last year. If today is 8-Jan-2018, this means any sales on or after 8-Jan-2017 will be filtered out.
In essence, I'm taking the current date (TODAY()), breaking it down into its component parts, subtracting 1 from the year, then building it back up into a date value using the DAX DATE() function.
The reason I do this rather than using DATEADD() to subtract a year Excel-style is because DATEADD requires a table of dates as an input and TODAY() is a single date. You therefore cannot put TODAY() into DATEADD() directly.
Instead of using TODAY(), I could have a measure that identifies the most recent date for which there is data and call that instead. It's the same solution.
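The date arithmetic in this answer can be tried out in a few lines of Python. This is a sketch of the logic, not DAX: `cutoff_last_year` and `same_period_last_year_to_date` are hypothetical names, and the example assumes the current period is January 2018 with today being 8-Jan-2018, as in the explanation above.

```python
from datetime import date, timedelta

def cutoff_last_year(today):
    """Rebuild today's date with the year reduced by one, like
    DATE(YEAR(TODAY()) - 1, MONTH(TODAY()), DAY(TODAY())).
    Python raises ValueError for 29 February, so that case needs its own handling."""
    return today.replace(year=today.year - 1)

def same_period_last_year_to_date(period_dates, today):
    """Shift the period back one year, then keep only dates before the cutoff."""
    cutoff = cutoff_last_year(today)
    shifted = [d.replace(year=d.year - 1) for d in period_dates]
    return [d for d in shifted if d < cutoff]

january_2018 = [date(2018, 1, 1) + timedelta(days=i) for i in range(31)]
kept = same_period_last_year_to_date(january_2018, today=date(2018, 1, 8))
print(kept[0].isoformat(), "to", kept[-1].isoformat())
```

Only 1-Jan-2017 through 7-Jan-2017 survive the filter, which matches the statement that sales on or after 8-Jan-2017 are filtered out.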
I'd be curious if others have a better way to do this. This strikes me as a common problem people must run into a lot.

Powerpivot average, only show a range of rows

I have done an average price calculation for apples. See example
In the new column "testar", I only want to show the year 1990 - 1994 (yellow cells), since the other years are not specified in my formula.
Formula I used for average calculation was:
=CALCULATE (
    AVERAGEX ( Datasrc; Datasrc[C_P2] );
    DATESBETWEEN ( Datasrc[Year]; "1990-01-01"; "1994-01-01" )
)
Any ideas or advice how to do that?
You'll need to test the values of Datasrc[Year] in an IF().
Here's a sample using a dummy dataset I have.
Testar =
IF (
    MAX ( DimDate[Year] ) > 2010
        && MAX ( DimDate[Year] ) < 2014,
    [SumAmt]
)
We're testing MAX() DimDate[Year]. On any given pivot row, only one year is in context, so max is the year on that pivot row.
The measure you used calculates the average in the context of the 5 years you've defined, overriding whatever the current filter context is.
Additionally, AVERAGEX() is unnecessary in this situation; you can use AVERAGE( Datasrc[C_P2] ).
Almost! Thank you for your valuable input, which put me in the right direction. Since the format of my year table is different and I want to have the average price listed, I did it a little differently: I simply combined your formula with my average formula and got the result I wanted.
See final result here
Code used:
=IF (
    MAX ( Datasrc[Year2] ) >= 1990
        && MAX ( Datasrc[Year2] ) <= 1994;
    CALCULATE (
        AVERAGEX ( Datasrc; Datasrc[C_P2] );
        DATESBETWEEN ( Datasrc[Year]; "1990-01-01"; "1994-01-01" )
    )
)
The first part of the formula keeps only the years I'm interested in.
The second part of the formula [CALCULATE(AVERAGEX....] calculates the average for the years I'm interested in.

Excel 2010 Dax Onhand Quantity Vs. Last Date Qty

I've spent the last 2 days trying to get this, and I really just need a few pointers. I'm using Excel 2010 with Power Pivot and calculating inventories. I am trying to get the amount sold between 2 dates. I recorded the quantity on hand if the item was in stock.
Item # Day Date Qty
Black Thursday 11/6/2014 2
Blue Thursday 11/6/2014 3
Green Thursday 11/6/2014 3
Black Friday 11/7/2014 2
Green Friday 11/7/2014 2
Black Monday 11/10/2014 3
Blue Monday 11/10/2014 4
Green Monday 11/10/2014 3
Is there a way to do this in DAX? I may have to go back and calculate the differences for each record in Excel, but I'd like to avoid that if possible.
Some things that have made this hard for me:
1) I only record the inventory Mon-Fri. I am not sure this will always be the case, so I'd like to avoid a dependency on this being only weekdays.
2) When there is none in stock, I don't have a record for that day.
I've tried CALCULATE with DATEADD, and it gave me results that were nearly right, but it ended up filtering out some of the results. It was odd, but almost right.
Any Help is appreciated.
Bryan, this may not totally answer your question as there are a couple of things that aren't totally clear to me but it should give you a start and I'm happy to expand my answer if you provide further info.
One 'pattern' you can use involves the TOPN function which when used with the parameter n=1 can return the earliest or latest value from a table that it sorts by dates and can be filtered to be earlier/later than dates specified.
For this example I am using a 'disconnected' date table from which the user would select the two dates required in a slicer or report filter:
=
CALCULATE (
    SUM ( inventory[Qty] ),
    TOPN (
        1,
        FILTER ( inventory, inventory[Date] <= MAX ( dates[Date] ) ),
        inventory[Date],
        0
    )
)
In this case the TOPN returns a single row table of the latest date earlier than or equal to the latest date provided. The 1st argument in the TOPN specifies the number of rows, the second the table to use, the 3rd the column to sort on and the 4th says to sort descending.
From here it is straightforward to adapt this for a second measure that finds the value for the latest date before or equal to the earliest date selected (i.e. swap MIN for MAX in MAX(dates[Date])).
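The "latest record on or before a date" pattern can be tried against the question's sample rows with a short Python sketch. This models the logic only: `qty_as_of` is a hypothetical name, the optional `item` argument stands in for an item filter coming from the pivot, and rows that tie on the latest date are summed together.

```python
from datetime import date

# The sample rows from the question: (item, date, qty).
INVENTORY = [
    ("Black", date(2014, 11, 6), 2),
    ("Blue", date(2014, 11, 6), 3),
    ("Green", date(2014, 11, 6), 3),
    ("Black", date(2014, 11, 7), 2),
    ("Green", date(2014, 11, 7), 2),
    ("Black", date(2014, 11, 10), 3),
    ("Blue", date(2014, 11, 10), 4),
    ("Green", date(2014, 11, 10), 3),
]

def qty_as_of(as_of, item=None):
    """Sum Qty over the rows with the latest date on or before as_of."""
    rows = [
        r for r in INVENTORY
        if r[1] <= as_of and (item is None or r[0] == item)
    ]
    if not rows:
        return None  # nothing recorded yet (a blank in the pivot)
    latest = max(r[1] for r in rows)
    return sum(r[2] for r in rows if r[1] == latest)

print(qty_as_of(date(2014, 11, 10), "Black"))
print(qty_as_of(date(2014, 11, 7), "Blue"))
print(qty_as_of(date(2014, 11, 7)))
```

One thing to watch: for Blue on 11/7/2014 there is no record, which the question says means none in stock, yet this pattern carries forward the 11/6 quantity. It may need adjusting if a missing record should be treated as zero.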
Hope this helps.
Jacob
*prettified using daxformatter.com
