PowerQuery - Forecast from table - excel

I am trying to build a forecast in which departments enter their spending assumptions in a single table. Instead of entering amounts for every single month, I would like the user to enter the amount, frequency, start date, and end date for each category. To illustrate, see below the table with some sample data.
This is the result I am trying to get in Power Query (or Power BI). As I understand it, this layout is what lets me run date slicers and filters in a Power BI model when comparing against actuals.
If this can't be done with DAX and must instead be done in Excel (through lookup formulas), how would you structure the formula?

Here is a PQ example that creates what you show as your desired table given what you show as your input:
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to better understand the algorithm
M Code
let
    Source = Excel.CurrentWorkbook(){[Name="Table9"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"G/L", Int64.Type}, {"Dimension", type text}, {"Description", type text},
        {"Amount", Int64.Type}, {"Repeat Every", type text}, {"Start Date", type date}, {"End Date", type date}}),
    //Last possible date as Today + 5 years (to end of month)
    lastDt = Date.EndOfMonth(Date.AddYears(Date.From(DateTime.FixedLocalNow()),5)),
    //Generate list of all possible dates for a given row using List.Generate function
    allDates = Table.AddColumn(#"Changed Type", "allDates", each let
            lastDate = List.Min({lastDt,[End Date]}),
            intvl = {1,3,6}{List.PositionOf({"Monthly","Quarterly","Semi Annual"},[Repeat Every])}
        in
            List.Generate(
                ()=> [Start Date],
                each _ <= lastDate,
                each Date.EndOfMonth(Date.AddMonths(_,intvl)))),
    //Remove unneeded columns and expand the list of dates
    #"Removed Columns" = Table.RemoveColumns(allDates,{"Repeat Every", "Start Date", "End Date"}),
    #"Expanded allDates" = Table.ExpandListColumn(#"Removed Columns", "allDates"),
    //Sort to get desired output
    // Date column MUST be sorted to ensure correct order when pivoted
    // Other columns sorted alphanumerically, but could change the sort to reflect original order if preferred.
    #"Sorted Rows" = Table.Sort(#"Expanded allDates",{
        {"allDates", Order.Ascending},
        {"G/L", Order.Ascending},
        {"Dimension", Order.Ascending}}),
    //Pivot the date column with no aggregation
    #"Pivoted Column" = Table.Pivot(
        Table.TransformColumnTypes(#"Sorted Rows", {
            {"allDates", type text}}, "en-US"),
        List.Distinct(Table.TransformColumnTypes(#"Sorted Rows", {{"allDates", type text}}, "en-US")[allDates]),
        "allDates", "Amount")
in
    #"Pivoted Column"
Original Data
Results
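The date expansion is the core of the query above. As a rough cross-check of that logic outside Power Query, here is a minimal Python sketch. The function names (end_of_month, add_months, expand_schedule) and the sample row are made up for illustration; only the frequency labels and the "step N months, then snap to end of month" rule come from the M code.

```python
import calendar
from datetime import date

# Same frequency labels and month intervals as the M code.
INTERVAL_MONTHS = {"Monthly": 1, "Quarterly": 3, "Semi Annual": 6}


def end_of_month(d):
    """Return the last day of d's month (like Date.EndOfMonth)."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


def add_months(d, months):
    """Shift d by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def expand_schedule(start, end, repeat_every, cap):
    """List the dates on which one input row contributes its amount.

    The first date is the start date as entered; every later date is the
    end of the month reached by stepping forward by the interval.
    """
    step = INTERVAL_MONTHS[repeat_every]
    last = min(end, cap)
    dates = []
    current = start
    while current <= last:
        dates.append(current)
        current = end_of_month(add_months(current, step))
    return dates


# Hypothetical row: quarterly spend for calendar year 2021.
quarterly = expand_schedule(date(2021, 1, 31), date(2021, 12, 31),
                            "Quarterly", cap=date(2026, 12, 31))
print(quarterly)
```

Each returned date becomes one row (and, after the pivot, one column) carrying the row's Amount.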

Related

Merging two tables by the closest date and ID in Power query

I have two excel workbooks that contain information about when a truck departs a depo and one where it is received at another. Each file contains the following in separate columns:
Departure:
Departure Date,
Departure Time,
Truck ID,
Cargo info
Receive:
Arrival Date,
Arrival Time,
Truck ID
How can I merge these two tables so the receiving table can be populated with the cargo from the departure table using Power Query?
As you can see, there is sometimes the same truck on separate trips on the same date, and therefore, it would be great to allocate the cargo based on the closest date and time for a particular truck ID. Clearly a truck cannot arrive before it has left.
I want to try and do this using Power Query, but I have been scratching my head on how to do it. Any help is greatly appreciated.
My two toy data files can be downloaded here
Powerquery version:
Load Departure table, here Table1 into Powerquery
Transform data type for date column to date, and time column to time
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}, {"Truck_ID", type text}, {"Cargo", type text}})
in #"Changed Type"
File .. close and load .. connection only
Load Receive table into Powerquery
Transform data type for date column to date, and time column to time
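Add column, custom column. The dialog wraps whatever you type in each, so after adding the column, edit the step in the formula bar (or the Advanced Editor) so that the third argument of Table.AddColumn reads:
(i)=> Table.FirstN(Table.Sort(Table.SelectRows(Table1, each [Date] = i[Date] and [Truck_ID] = i[Truck_ID] and [Time]<=i[Time]),{{"Time", Order.Descending}}),1)
For each receive row (i), this finds all the rows in Table1 (departures) with the same date, the same truck ID, and a time less than or equal to the receive time. It then sorts them in descending time order and takes the first row, which is the closest earlier departure. Because the dates must be equal, a truck that arrives on a later date than it departed gets no match.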
Finally, expand the cargo column using arrows atop the added new column
Full code for Table2 (Receive table)
let Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}, {"Truck_ID", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", (i)=> Table.FirstN(Table.Sort(Table.SelectRows(Table1, each [Date] = i[Date] and [Truck_ID] = i[Truck_ID] and [Time]<=i[Time]),{{"Time", Order.Descending}}),1)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Cargo"}, {"Cargo"})
in #"Expanded Custom"
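To sanity-check the matching rule (same date, same truck, latest departure at or before the arrival) on a few rows, the same logic can be sketched in Python. The sample trips and the function name closest_departure are hypothetical; the column names follow the M code.

```python
from datetime import date, time


def closest_departure(departures, arrival):
    """Return the latest same-day departure of the same truck that is not
    later than the arrival, or None if there is no such departure."""
    candidates = [
        d for d in departures
        if d["Date"] == arrival["Date"]
        and d["Truck_ID"] == arrival["Truck_ID"]
        and d["Time"] <= arrival["Time"]
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d["Time"])


# Hypothetical sample: truck T1 makes two trips on the same day.
departures = [
    {"Date": date(2021, 5, 3), "Time": time(8, 0), "Truck_ID": "T1", "Cargo": "Sand"},
    {"Date": date(2021, 5, 3), "Time": time(14, 0), "Truck_ID": "T1", "Cargo": "Gravel"},
    {"Date": date(2021, 5, 3), "Time": time(9, 0), "Truck_ID": "T2", "Cargo": "Steel"},
]
arrivals = [
    {"Date": date(2021, 5, 3), "Time": time(10, 30), "Truck_ID": "T1"},
    {"Date": date(2021, 5, 3), "Time": time(16, 15), "Truck_ID": "T1"},
    {"Date": date(2021, 5, 4), "Time": time(7, 0), "Truck_ID": "T2"},
]

for arrival in arrivals:
    match = closest_departure(departures, arrival)
    arrival["Cargo"] = match["Cargo"] if match else None
```

The third arrival falls on a different date from its departure, so it stays unmatched, which is the same limitation the query has.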
And here's another method using Sorting and Grouping to develop the list
Algorithm explained in code comments
MCode
let
//Generated by the UI if you select to get Data from Folder
//and the folder has only your tables to process.
//If there are other tables, add a Filter to just select the two you want.
Source = Folder.Files("C:\Users\ron\Desktop\BillyJo"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
//Keep only the Source.Name and Transform Files columns
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}),
//Combine the two tables
combin = Table.Combine(#"Removed Other Columns1"[Transform File]),
#"Changed Type" = Table.TransformColumnTypes(combin,{{"Date", type date}, {"Time", type time}}),
//create dateTime column to more easily compare the times
fullDate = Table.AddColumn(#"Changed Type","fullDate", each [Date] & [Time], type datetime),
//sort by full date ascending
#"Sorted Rows" = Table.Sort(fullDate,{{"fullDate", Order.Ascending}}),
//group by Truck_ID, then fill down the Cargo column and extract the Even rows for the Received table
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"Truck_ID"}, {
{"all", each Table.AlternateRows(Table.FillDown(_,{"Cargo"}),0,1,1),
type table [Date=nullable date, Time=nullable time, Truck_ID=text, Cargo=nullable text, fullDate=datetime]}
}),
//Remove unneeded column and
//Expand the table produced by the Grouping
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Truck_ID"}),
#"Expanded all" = Table.ExpandTableColumn(#"Removed Columns", "all", {"Date", "Time", "Truck_ID", "Cargo"}, {"Date", "Time", "Truck_ID", "Cargo"})
in
#"Expanded all"
This should do:
Stack the tables
Sort for Truck & Time
= Table.Sort(#"Replaced Value",{{"Truck_ID", Order.Descending},{"Time", Order.Descending}})
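Add an index column starting at 0, then integer-divide it by two, so that each consecutive pair of rows shares one ID (0, 0, 1, 1, ...). An index starting at 1 would give 0, 1, 1, 2, ... and pair the wrong rows.
Now reference this query twice. Filter for arrivals in one reference and for departures in the other, then merge the two on the Index column.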
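A small Python sketch of this stack, sort, and pair idea follows. It assumes a zero-based index, a combined date-time for sorting, and that every departure has exactly one arrival so that rows strictly alternate within each truck; the function name pair_by_index and the sample rows are hypothetical.

```python
from datetime import datetime


def pair_by_index(departures, arrivals):
    """departures: (truck, when, cargo) tuples; arrivals: (truck, when) tuples.
    Returns (truck, arrival_time, cargo) for every arrival."""
    # Stack both tables into one list, tagging each row with its kind.
    stacked = [(truck, when, "dep", cargo) for truck, when, cargo in departures]
    stacked += [(truck, when, "arr", None) for truck, when in arrivals]
    # Sort by truck, then by date-time, so each departure sits next to its arrival.
    stacked.sort(key=lambda row: (row[0], row[1]))

    cargo_by_pair = {}
    arrival_by_pair = {}
    for index, (truck, when, kind, cargo) in enumerate(stacked):
        pair_id = index // 2  # 0, 0, 1, 1, 2, 2, ...
        if kind == "dep":
            cargo_by_pair[pair_id] = cargo
        else:
            arrival_by_pair[pair_id] = (truck, when)

    # "Merge" the two filtered views on the shared pair ID.
    return [
        (truck, when, cargo_by_pair.get(pair_id))
        for pair_id, (truck, when) in sorted(arrival_by_pair.items())
    ]


# Hypothetical sample data.
departures = [
    ("T1", datetime(2021, 5, 3, 8, 0), "Sand"),
    ("T1", datetime(2021, 5, 3, 14, 0), "Gravel"),
    ("T2", datetime(2021, 5, 3, 9, 0), "Steel"),
]
arrivals = [
    ("T1", datetime(2021, 5, 3, 10, 0)),
    ("T1", datetime(2021, 5, 3, 16, 0)),
    ("T2", datetime(2021, 5, 3, 11, 30)),
]
paired = pair_by_index(departures, arrivals)
```

If a departure is missing its arrival (or the reverse), every later pair ID shifts by one row, so this approach only suits clean, strictly alternating data.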

Power Query: Calculate date/time instances within and over 1 day and show them as percentages (system utility time %)

I have a system data set in which I want to compare two systems (Uptimum and scrubber) by utility time: the percentage of each 24-hour day that each system was operational, including operation that runs past 24 hours into the following days.
The data set is shown below. As you can see, there are gaps in the dates in Column A (date): some days are missing, and that will keep happening from time to time. There can also be several system changes within one day (the system in operation can change many times per day), which is why there is a time in Column B (time), so that I can follow the exact timing of operation within a day.
There is no official "end time" here; it is an ongoing process in which the system in operation changes, along with many other parameters.
What I did was extract the distinct dates into Column F and sum the times per system (columns G and H), using the formula below (see also the screenshot):
=SUMIFS(Explog2021_04_28[T];Explog2021_04_28[D];$F2;Explog2021_04_28[System];"<>"&G$1)-SUMIFS(Explog2021_04_28[T];Explog2021_04_28[D];$F2;Explog2021_04_28[System];G$1)+(INDEX(Explog2021_04_28[System];MATCH($F2;Explog2021_04_28[D]))=G$1)-(INDEX(Explog2021_04_28[System];MATCH($F2;Explog2021_04_28[D];0))<>G$1)*$B2
With this formula I summed Columns A and B using the extracted dates and the system names.
First, as you can see, I get negative percentages, which should not happen. Is that because there are so many gaps in the dates? Is there a better way to do this? As you can see on the chart, it looks bad.
If possible, the total should also never exceed 100% of overall usage.
Any input would be much appreciated.
If I understand you correctly, I believe the following Power Query should accomplish what you are looking for.
Please read the code comments and step through the applied steps window to understand the algorithm. Ask if you have questions, and complain if there are logic errors.
I assumed that the system was always in either scrubber or Uptimum
M Code
let
//Read in data. Change table name in next line to reflect actual table name
Source = Excel.CurrentWorkbook(){[Name="systemTable"]}[Content],
//Type the columns
#"Changed Type" = Table.TransformColumnTypes(Source,{{"D", type text}, {"T", type any}, {"System", type text}}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Changed Type", {{"D", type date}}, "en-150"),
#"Changed Type1" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"T", type time}}),
//Combine date and time => datetime
#"Added Custom" = Table.AddColumn(#"Changed Type1", "startTime",
each DateTime.From(Number.From([D]) + Number.From([T])), type datetime),
//create shifted column to be able to quickly refer to previous row
//this method much faster than using an Index column
Base = #"Added Custom",
ShiftedList = List.RemoveFirstN(Table.Column(Base, "startTime"),1) & {null},
Custom1 = Table.ToColumns(Base) & {ShiftedList},
Custom2 = Table.FromColumns(Custom1, Table.ColumnNames(Base) & {"endTime"}),
#"Changed Type2" = Table.TransformColumnTypes(Custom2,{{"endTime", type datetime}}),
//Create a list of dates for each time span
#"Added Custom1" = Table.AddColumn(#"Changed Type2", "datesList", each
let
st = DateTime.Date([startTime]),
et = DateTime.Date([endTime] ),
dur = Duration.TotalDays(et-st)
in
if et=null then {st} else List.Dates(st,dur+1,#duration(1,0,0,0))),
//Expand the list so we have sequential dates (fill in the gaps)
#"Expanded datesList" = Table.ExpandListColumn(#"Added Custom1", "datesList"),
//Remove unneeded columns
#"Removed Columns" = Table.RemoveColumns(#"Expanded datesList",{"D", "T"}),
//change date list datatype to datetime for simpler calculation formula
#"Changed Type3" = Table.TransformColumnTypes(#"Removed Columns",{{"datesList", type datetime}}),
//calculate hours in System each day
#"Added Custom2" = Table.AddColumn(#"Changed Type3", "Hrs in Day",
each List.Min({Date.EndOfDay([datesList]),[endTime]}) - List.Max({[startTime],[datesList]}),Duration.Type),
//Remove unneeded columns
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"startTime", "endTime"}),
//change date list to dates for report
#"Changed Type5" = Table.TransformColumnTypes(#"Removed Columns1",{{"datesList", type date}}),
//Group by Date and System to calculate percent time in system
#"Grouped Rows" = Table.Group(#"Changed Type5", {"datesList", "System"}, {
{"Sum", each List.Sum([Hrs in Day])/#duration(0,24,0,0), Percentage.Type}}),
//Pivot on System to generate final report
#"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[System]), "System", "Sum", List.Sum),
//Rename the datelist column
#"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"datesList", "D"}})
in
#"Renamed Columns"
Data
Results
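The per-day split is the step that fills the date gaps and keeps every day at or below 100%. A minimal Python sketch of that step is below, under two stated assumptions: each event lasts until the next event starts, and the last event is counted only to the end of its own day (as the query does when endTime is null). The sketch cuts intervals at exact midnight, whereas the M code uses Date.EndOfDay, which is a fraction of a second short of midnight. The function name daily_share and the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def next_midnight(moment):
    """Midnight at the start of the day after `moment`."""
    return datetime.combine(moment.date(), datetime.min.time()) + timedelta(days=1)


def daily_share(events):
    """events: list of (start_datetime, system), sorted by start.
    Returns {(date, system): fraction of that 24-hour day}."""
    totals = defaultdict(float)
    for i, (start, system) in enumerate(events):
        if i + 1 < len(events):
            end = events[i + 1][0]       # runs until the next change
        else:
            end = next_midnight(start)   # open-ended last event
        cursor = start
        while cursor < end:
            piece_end = min(end, next_midnight(cursor))
            seconds = (piece_end - cursor).total_seconds()
            totals[(cursor.date(), system)] += seconds / 86400
            cursor = piece_end
    return dict(totals)


# Hypothetical log with a gap: nothing is recorded on 2 April.
events = [
    (datetime(2021, 4, 1, 6, 0), "scrubber"),
    (datetime(2021, 4, 1, 18, 0), "Uptimum"),
    (datetime(2021, 4, 3, 12, 0), "scrubber"),
]
shares = daily_share(events)
```

In this sample, Uptimum is credited with the whole of 2 April even though that date never appears in the log, and no day sums to more than 1.0.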

In Excel or Power BI, how can I assign 'ongoing project' to each date in a year, when I only have the start and end date of each project?

I have two tables (see attached workbook).
One with the names of all projects in first column, the start date of the project in that row in the second column and the end date of that project in the third column.
The other table has a column with all the dates in a year. I want to add several columns to it. My question is how to get the one that I coloured yellow in the workbook. That column should contain the project that will be/was in process for each day of the year.
I hope the workbook will illustrate my problem.
Sneak peek:
Table one
Project ID | Start Date | End Date
A          | 2/1/2020   | 3/1/2020
B          | 5/1/2020   | 10/1/2020
Etc.       | Etc.       | Etc.
Table two
Each Date in a year | Ongoing project
1/1/2020            |
2/1/2020            | A
3/1/2020            | A
4/1/2020            |
5/1/2020            | B
Etc.                | Etc.
So far I have tried several approaches: Index/match, xlookup, dynamic arrays.
Edit:
Excel Wizard (YouTube) provided a solution that helped me out.
=TEXTJOIN(",",,REPT(TableOne[Project ID],([@[Each Date in a year]]>=TableOne[Start Date])*([@[Each Date in a year]]<=TableOne[End Date])))
In Power Query you could:
Transform your table 1 into a table where each ProjectID/Date has a single row
Create a second table consisting of all the dates in the time period
Join the two tables with a JoinKind.FullOuter
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Project ID", type text}, {"Start Date", type date}, {"End Date", type date}}),
//show one row for each projectID/Date
#"Added Custom" = Table.AddColumn(#"Changed Type", "dtRange", each
List.Dates([Start Date], Duration.TotalDays([End Date] - [Start Date]) + 1,#duration(1,0,0,0))),
#"Expanded dtRange" = Table.ExpandListColumn(#"Added Custom", "dtRange"),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded dtRange",{{"dtRange", type date}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Start Date", "End Date"}),
//not sure what you want for the calendar range
//but you can set it in the next two steps
dtStart = #date(2021,12,3),
calDays = 365,
dtTbl = Table.TransformColumnTypes(
Table.FromList(
List.Dates(dtStart,calDays,#duration(1,0,0,0)),
Splitter.SplitByNothing(),{"Dates"},null,ExtraValues.Error),
{{"Dates", type date}}),
//combine the two tables
joinTbl = Table.Join(dtTbl,"Dates",#"Removed Columns","dtRange",JoinKind.FullOuter),
#"Removed Columns1" = Table.RemoveColumns(joinTbl,{"dtRange"}),
#"Sorted Rows" = Table.Sort(#"Removed Columns1",{{"Dates", Order.Ascending}}),
#"Renamed Columns" = Table.RenameColumns(#"Sorted Rows",{{"Project ID", "Ongoing Project"}})
in
#"Renamed Columns"
Sample Data
Results (note that multiple non-project date rows are hidden)
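The expand-then-join idea can be checked on the question's sample rows with a short Python sketch. It reads the sample dates as day/month/year (an assumption, though it is the only reading that reproduces Table two), it keeps only calendar dates rather than doing a full outer join, and the function name ongoing_by_date is hypothetical.

```python
from datetime import date, timedelta


def ongoing_by_date(projects, first_day, n_days):
    """projects: (project_id, start, end) tuples with inclusive dates.
    Returns one (date, "ID1,ID2,...") pair per calendar day."""
    # Step 1: one entry per project per day it is running.
    ids_by_date = {}
    for project_id, start, end in projects:
        day = start
        while day <= end:
            ids_by_date.setdefault(day, []).append(project_id)
            day += timedelta(days=1)

    # Step 2: build the calendar and look up each date.
    calendar_days = [first_day + timedelta(days=i) for i in range(n_days)]
    return [(day, ",".join(ids_by_date.get(day, []))) for day in calendar_days]


# Sample rows from the question, read as day/month/year.
projects = [
    ("A", date(2020, 1, 2), date(2020, 1, 3)),
    ("B", date(2020, 1, 5), date(2020, 1, 10)),
]
table_two = ongoing_by_date(projects, first_day=date(2020, 1, 1), n_days=5)
```

Joining the IDs with a comma mirrors what the TEXTJOIN formula in the question does when projects overlap; the Power Query join instead returns one row per project for such dates.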

Calculate average of a range with multiple column

I have a scenario in which I have to calculate the average price of shares from a set of dated purchases. Consider the following data.
Now I want to represent the data in the following format:
The table above should hold the average price and update whenever a new scrip is added to the first table.
I have tried AVERAGEIFS(), but it only averages a single column range. I need the average price computed from price * quantity across the range for the given scrip.
Please suggest.
Not sure I understand the question.
If you're trying to get the total amount based on the average price without a helper column, you could use this:
=AVERAGEIF($B$3:$E$8,B12,$E$3:$E$8)*SUMIF($B$3:$E$8,B12,$C$3:$C$8)
You can use Power Query (available in Excel 2010+) for this.
In Excel 2016+ (may be different in earlier versions):
select some cell within the data table
Data / Get & Transform / From Table/Range
In the UI, open the Advanced Editor
Paste the M-Code below into the window that opens
Change the Table Name in Line 2 to reflect the actual table name in your worksheet.
NOTE: In the UI, in the Applied Steps window, float your cursor over the information icons to read the comments for explanations. Also you can double click on the gear icons for more information as to how those steps were set up
M Code
let
//Change Table name to correct name
Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Stocks", type text}, {"Quantity", Int64.Type}, {"Date", type date}, {"Price", type number}}),
//Group by Stock
#"Grouped Rows" = Table.Group(#"Changed Type", {"Stocks"}, {{"Grouped", each _, type table [Stocks=nullable text, Quantity=nullable number, Date=nullable date, Price=nullable number]}}),
//Sum quantity for each stock
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Quantity", each List.Sum(Table.Column([Grouped],"Quantity"))),
//Compute weighted average price for each group of stocks
#"Added Custom" = Table.AddColumn(#"Added Custom1", "Price", each List.Accumulate(
List.Positions(Table.Column([Grouped],"Quantity")),
0,
(state, current) =>state + Table.Column([Grouped],"Price"){current} *
Table.Column([Grouped],"Quantity"){current})
/ List.Sum(Table.Column([Grouped],"Quantity"))),
//Compute Total Amount for each stock
#"Added Custom2" = Table.AddColumn(#"Added Custom", "Amount", each [Quantity]*[Price]),
//Remove extraneous Columns
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Grouped"})
in
#"Removed Columns"
Are you allowed to add a column to your data for calculating the total_price? For example, column E = Quantity * Price.
Then your calculations table would be quite simple. Formulas for row 3:
Quantity: =SUMIFS(B:B,A:A,G3)
Average_Price: =SUMIFS(E:E,A:A,G3) / SUMIFS(B:B,A:A,G3)
Amount: =H3*I3
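Both the Power Query answer and the helper-column answer compute a quantity-weighted average, that is, sum(quantity * price) / sum(quantity) per stock. A minimal Python sketch of that arithmetic, using made-up trades and a hypothetical function name summarize:

```python
def summarize(trades):
    """trades: (stock, quantity, price) tuples.
    Returns {stock: {"Quantity", "Price", "Amount"}} where Price is the
    quantity-weighted average price."""
    totals = {}
    for stock, quantity, price in trades:
        qty_so_far, amount_so_far = totals.get(stock, (0, 0.0))
        totals[stock] = (qty_so_far + quantity, amount_so_far + quantity * price)
    return {
        stock: {"Quantity": qty, "Price": amount / qty, "Amount": amount}
        for stock, (qty, amount) in totals.items()
    }


# Made-up sample: two purchases of ABC at different prices.
trades = [
    ("ABC", 10, 100.0),
    ("ABC", 30, 120.0),
    ("XYZ", 5, 50.0),
]
summary = summarize(trades)
```

For ABC the weighted average is (10*100 + 30*120) / 40 = 115, not the simple average of 110 that an AVERAGEIF over the price column would give.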

excel powerquery replacevalues based on cell

I'm extracting data from a site using excel powerquery, the site is http://www.timeanddate.com/holidays/south-africa/2014
The web table presents dates in the format mmm dd so;
Jan 01
Mar 20
Mar 21 ...etc.
To get results for different years I can invoke a prompt to request year input and replace the relevant value in the URL as follows;
= let
#"Table 0" = (myParm)=>
let
Source = Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/" & Number.ToText(myParm))),
However, because the web results table does not specify the year, Excel understandably fills in its own values on import (native Excel just uses the current year, 2015; Power Query interprets the values completely differently), like so:
2001/01/01
2020/03/01
2021/03/01
My questions:
I want to be able to specify the year in the query using a cell, replacing the myParm with a cell value and refreshing on change (can do it with excel native, need to know how to do it with powerquery)
I want to be able to replace the year value on the resultant year column data with whatever is in the aforementioned cell
For #1, assuming you have an Excel Table named YearTable with a single column named Year and a single detail row with the required year value (e.g. 2015), you can use this M expression:
Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year]
This dives into that table and plucks the value from the first detail row.
For example, you could embed that in your opening Step e.g.
Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/" & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year])))
For #2, I would use that expression to add a column with a formula like this:
[Date] & " " & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year])
Then you can use the Parse button (Transform ribbon, under Date) to convert that to a Date datatype if required.
Note the generated Change Type step I got from that page cast Date as a Date with an implied year (the issue you noticed). Just edit the formula for that step, to set the "Date" column as "text" to avoid that.
Here's my entire test M script:
let
Source = Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/" & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year]))),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Header", type text}, {"Date", type text}, {"Weekday", type text}, {"Holiday name", type text}, {"Holiday type", type text}}),
#"Added Derived Date" = Table.AddColumn(#"Changed Type", "Derived Date", each [Date] & " " & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year])),
#"Parsed Date" = Table.TransformColumns(#"Added Derived Date",{{"Derived Date", each Date.From(DateTimeZone.From(_)), type date}})
in
#"Parsed Date"
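The two string operations in this script (append the year to the URL, and append the year to the "mmm dd" text before parsing) can be tried out quickly in Python. This is only a sketch of the string handling: the function names holiday_url and derive_date are hypothetical, nothing is downloaded, and the parse assumes English month abbreviations.

```python
from datetime import date, datetime

BASE_URL = "http://www.timeanddate.com/holidays/south-africa/"


def holiday_url(year):
    """Build the page URL for one year, as the Source step does."""
    return BASE_URL + str(year)


def derive_date(month_day, year):
    """Turn text such as 'Mar 21' plus a year into a real date.

    Keeping the web column as text and adding the year before parsing
    avoids any implied-year guess. %b expects English month names in the
    default C locale.
    """
    return datetime.strptime(f"{month_day} {year}", "%b %d %Y").date()


year = 2014  # in the workbook this value comes from the YearTable cell
url = holiday_url(year)
holidays = [derive_date(text, year) for text in ["Jan 01", "Mar 20", "Mar 21"]]
```

Changing the single year value changes both the requested page and every derived date, which is the behaviour the question asks for.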
Another solution from Colin Banfield;
1) In Excel, create a table with Year as the column name and enter the year as the row value. Then create a query from the table. Your query should have one column and one row value. Name the query appropriately and save.
2) Get the data from the web site. Assume we name the query HolidayTable. Convert the query to a function query e.g.
(Year as number)=>
let
Source = Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/"&Number.ToText(Year))),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Header", type text}, {"Date", type date}, {"Weekday", type text}, {"Holiday name", type text}, {"Holiday type", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Header"})
in
#"Removed Columns"
3) Add this function as a new column in the step (1) query, and add a new date custom column. After a couple other transformations (column reorder, column removal), you should end up with a query that looks like the following:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Year", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each HolidayTable([Year])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Date", "Weekday", "Holiday name", "Holiday type"}, {"Date", "Weekday", "Holiday name", "Holiday type"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Calendar Date", each #date([Year],Date.Month([Date]),Date.Day([Date]))),
#"Reordered Columns" = Table.ReorderColumns(#"Added Custom1",{"Year", "Date", "Calendar Date", "Weekday", "Holiday name", "Holiday type"}),
#"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Year","Date"})
in
#"Removed Columns"
Notes:
a) The first two lines are from the original table query in step (1).
b) The #"Added Custom" step adds a new custom column, which passes the value in the Year column to the HolidayTable function
c) The #"Added Custom1" step adds a new custom column that creates a new date from the value in the Year column, and the month and day values from the original Date column.

Resources