In my Power Query I have a column that shows different durations for certain items, but it displays an error when I attempt to convert it to time or duration.
As a workaround, I created a formula next to my Excel table that converts the duration into the format I want to use, but I have not been able to translate the formula into a language that Power Query can understand (I am pretty new to Power Query).
This is how the data is pulled from source:
But I would like it to show like this:
The Excel Formula I am using to accomplish this is:
=IF(LEN([@Age])=7,"0"&[@Age],IF(LEN([@Age])=5,"00:"&[@Age],IF(LEN([@Age])=4,"00:0"&[@Age],IF(LEN([@Age])=3,"00:00"&[@Age],[@Age]))))
It would be nice to have it in Power Query instead of the Excel sheet, as it serves as a learning opportunity.
I am self-learning Power Query in Excel, so any help is welcome.
EDIT: If the duration is more than 24:00:00, how should I approach it?
Here is the error code it returns
You can add a custom column with the formula:
Duration.FromText(
Text.Combine(
List.LastN(
{"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),
3),
":"))
The formula:
Splits the text string by the colon into a list.
Replaces blank elements with "00" and prepends a "00" element to the list.
Retrieves the last three elements and combines them into a colon-separated text string.
Uses the Duration.FromText function to convert that string to a duration.
Afterwards, set the data type of the column to duration.
In the PQ Editor, a duration will have the format of d.hh:mm:ss, but when you load it back into Excel, you can change that to [hh]:mm:ss
You can accomplish the above all in the PQ User Interface.
Here is M-Code that does the same thing:
let
Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Duration", each Duration.FromText(
Text.Combine(
List.LastN(
{"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),
3),
":"))),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Age"})
in
#"Removed Columns"
You can even do it (using M-Code in the Advanced Editor) without adding a column by using the Table.TransformColumns function:
let
Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
#"Change to Duration" = Table.TransformColumns(#"Changed Type",
{"Age", each Duration.FromText(
Text.Combine(
List.LastN(
{"00"} & List.ReplaceValue(Text.Split(_,":"),"","00",Replacer.ReplaceValue),
3),
":")), type duration})
in
#"Change to Duration"
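To try the transformation without a workbook table, here is a minimal sketch that wraps the same padding logic in a function and applies it to an inline table. The sample Age values and the function name ToDuration are assumptions for illustration only; they are not the asker's data.

```powerquery
let
    // Hypothetical sample values; the real data comes from the "Age" column
    Source = #table(type table [Age = text], {{"01:02:03"}, {"12:34"}, {":07"}}),
    // Pad blank or missing parts with "00", keep the last three parts, then parse
    ToDuration = (age as text) as duration =>
        Duration.FromText(
            Text.Combine(
                List.LastN(
                    {"00"} & List.ReplaceValue(Text.Split(age, ":"), "", "00", Replacer.ReplaceValue),
                    3),
                ":")),
    // Convert the column in place and set its type to duration
    Result = Table.TransformColumns(Source, {"Age", ToDuration, type duration})
in
    Result
```

With these inputs, "12:34" is padded to "00:12:34" and ":07" to "00:00:07" before being parsed.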
All result in:
Edit
With your modified data, now showing duration values of more than 23 hours (not allowed in a duration literal in PQ), the transformation will be different. We have to check the hours and break it into days and hours if it is more than 23.
Note: the edit below also assumes there will never be anything entered in the day location, and that entries for minutes and seconds will always be within range. If there might be day values, you will need to add them to the "overflow" from the hours entry.
So we change the Custom Column formula to check for that:
let
split = List.LastN({"00","00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),4),
s = Number.From(List.Last(split)),
m = Number.From(List.LastN(split,2){0}),
hTotal = Number.From(List.LastN(split,3){0}),
h = Number.Mod(hTotal,24),
d = Number.IntegerDivide(hTotal,24)
in #duration(d,h,m,s)
If you might have illegal values for minutes or seconds, you can add logic to check for that also.
Also, if you will be loading this into Excel and you might have total days >31, you will need to format it (in Excel) as [hh]:mm:ss, because with the format d.hh:mm:ss Excel cannot display more than 31 days (although the proper value will be stored in the cell).
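One possible way to cover both caveats above (an entry in the day location, and minutes or seconds that are out of range) is to convert everything to total seconds and let #duration normalize the result. This is only a sketch under assumptions: the function name ToDuration is hypothetical, and it assumes that a fourth colon-separated part, when present, holds days.

```powerquery
let
    // Hypothetical helper: text such as "30:15:00" -> duration
    ToDuration = (age as text) as duration =>
        let
            // Left-pad to four parts (days, hours, minutes, seconds); blanks become "00"
            parts = List.LastN(
                {"00", "00", "00"} & List.ReplaceValue(Text.Split(age, ":"), "", "00", Replacer.ReplaceValue),
                4),
            n = List.Transform(parts, Number.From),
            // Fold all parts into seconds so that overflow in any part is carried upward
            totalSeconds = ((n{0} * 24 + n{1}) * 60 + n{2}) * 60 + n{3}
        in
            #duration(0, 0, 0, totalSeconds),
    // 30 h 15 min is 108900 seconds, which normalizes to 1.06:15:00
    Result = ToDuration("30:15:00")
in
    Result
```

In a custom column the call would be ToDuration([Age]). The trade-off is that out-of-range minutes or seconds are silently carried over rather than flagged, so add an explicit check if such values should count as errors.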
I have a data set which consists of Date/Time, Pressure and a custom column. It represents pressure over time, and for each month I want to know my starting point (after 5 minutes) and my ending point (the next-to-last row). The measurements usually take roughly 30-40 minutes, as you can see in the example below, so the amount of data can vary.
The Time column is calculated using:
=([@[Date/Time]]-I5)*1440+L5
This data set contains the whole data for all the months with values, and I need the months separated (filtered), with these starting/ending points as in the screenshot. I have used Power Query a lot to play with data, but maybe there is another method to obtain those values and, where possible, make them dynamic for future data.
I will also upload my dummy workbook with the whole data set (all the months) and a filter table with months, in case you need it for reference and testing.
https://docs.google.com/spreadsheets/d/1LGl-eri6ewCni2NJ2wGeoYIf-40KO2Lr/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true
In Power Query:
Edit: minor change in algorithm.
Based on your shared workbook and what you have written, it seems that for any given month, you:
start the minute count after excluding the first entry in the month (if that is a typo/error, just remove the function that removes that first row);
with that second entry = minute 0, return the first entry in or after minute 5, as well as the next-to-last entry in the table.
Note that I started with just the Date and Pressure columns
Algorithm
Add a column of monthYear
GroupBy monthYear
Custom aggregation to:
    Remove the first and last rows of the table
    Create a list of durations in minutes of each time compared with the first time in the month. This will be a minute + fraction of a minute
    Add that list as a column to the original table
    Determine the first entry in or after the fifth minute
    Determine the last entry
    Filter the month subtable to return those two entries.
If you want to see the result for just a given month, you can filter the result in the resultant Excel table.
M Code
Please read the comments and examine the Applied Steps to better understand the algorithm.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//add month/year column for grouping
#"Added Custom" = Table.AddColumn(#"Changed Type", "month Year",
each Number.ToText(Date.Month([#"Date/Time"]),"00") & Number.ToText(Date.Year([#"Date/Time"]),"0000")),
#"Grouped Rows" = Table.Group(#"Added Custom", {"month Year"}, {
//elapsed minutes column
{"Elapsed Minutes", (x)=> let
//remove first and last rows from table
t=Table.RemoveColumns(Table.RemoveFirstN(Table.RemoveLastN(x)),"month Year"),
//add a column with the elapsed minutes
TableToFilter = Table.FromColumns(
Table.ToColumns(t)
& {List.Generate(
()=>[em=null, idx=0],
each [idx]< Table.RowCount(t),
each [em=Duration.TotalMinutes(t[#"Date/Time"]{[idx]+1} - t[#"Date/Time"]{0}), idx=[idx]+1],
each [em])}, type table[#"Date/Time"=datetime, #"P7 [mbar]"=number, elapsed=number]),
//filter for last entry (which would be next to last in the month)
maxMinute = List.Max(TableToFilter[elapsed]),
//filter for first entry in the 5th minute
fifthMinute = List.Select(TableToFilter[elapsed], each Number.IntegerDivide(_,1)>=5){0},
//select the 5th minute and the last row
FilteredTable = Table.SelectRows(TableToFilter, each [elapsed]=fifthMinute or [elapsed]=maxMinute)
in FilteredTable,type table[#"Date/Time"=datetime, #"P7 [mbar]"=number, elapsed=number]}
}),
//remove unneeded column and expand the others
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"month Year"}),
#"Expanded Elapsed Minutes" = Table.ExpandTableColumn(#"Removed Columns", "Elapsed Minutes", {"Date/Time", "P7 [mbar]"}, {"Date/Time", "P7 [mbar]"})
in
#"Expanded Elapsed Minutes"
Results from your shared workbook data
In Office/Excel 365
Filter Column (eg for January 2020)
E4: 1/1/2020
E5: 1/1/2020
Results
F4 (date/time 5th minute): =IF(COUNTIFS(Table1[Date/Time],">="&E4,Table1[Date/Time],"<" & EDATE(E4,1))=0,"",
LET(x,FILTER(Table1[Date/Time],(Table1[Date/Time]>=E4)*(Table1[Date/Time]<EDATE(E4,1))),
y, (x-INDEX(x,2))*1440,
z, XMATCH(5,y,1),
INDEX(x,z,1)))
G4: (Pressure 5th minute): =IF(F4="","",
LET(x,FILTER(Table1,(Table1[Date/Time]>=E4)*(Table1[Date/Time]<EDATE(E4,1))),
y, (INDEX(x,0,1)-INDEX(x,2,1))*1440,
z, XMATCH(5,y,1),
INDEX(x,z,2)))
F5: (Date next to last): =IF(COUNTIFS(Table1[Date/Time],">="&E5,Table1[Date/Time],"<" & EDATE(E5,1))=0,"",
LET(x,FILTER(Table1[Date/Time],(Table1[Date/Time]>=E5)*(Table1[Date/Time]<EDATE(E5,1))),
INDEX(x,COUNT(x)-1)))
G5: (Pressure next to last):=IF(F5="","",
LET(x,FILTER(Table1,(Table1[Date/Time]>=E5)*(Table1[Date/Time]<EDATE(E5,1))),
INDEX(x,COUNT(INDEX(x,0,1))-1,2)))
I am looking to insert a remove column step which removes any column where the header (which is a date) is before a certain date (older than X years prior to the current date). I receive a large data dump which is just a list of client names and fees they pay each month from 2012 to today, headed by the month they pay each fee, but as time goes on I don't need the oldest of the data.
So far I have tried producing a list from the headers (based on a previous response from another board member - thank you #horseyride!) and then removing the columns which don't meet the criteria from that list. However, it keeps breaking.
This is the latest line in the Advanced Editor:
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Calendar Period", type text}}, "en-GB"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Calendar Period", type text}}, "en-GB")[#"Calendar Period"]), "Calendar Period", "Approved Invoice Amount", List.Sum)
These are the lines I am attempting to create:
"ColumnList" = List.Select(Table.ColumnNames(#"Pivoted Column"), each Text.Contains(_, " ")),,
"Delete Columns"= Table.Transform(#"Pivoted Column", Table.RemoveColumns(#"ColumnList", each {})as table)
in
#"Delete Columns"
I can't seem to get the second bit of code right; that is what I believe it should look like for now. Essentially, I want the table to remove any columns whose header (a date) is more than X years before today's date.
EDIT - Screenshot of before and after IF the desired cut off was Dec 2012:
Example Data
Thank you in advance
Just use the following code. For a static date:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
final = Table.SelectColumns(Source, List.Select(Table.ColumnNames(Source),
each try Date.From(_) >= #date(2012,12,1) otherwise true))
in
final
For dynamic date (older than 3 years prior to the current date):
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
final = Table.SelectColumns(Source, List.Select(Table.ColumnNames(Source),
each try Date.From(_) >= Date.AddYears(Date.From(DateTime.FixedLocalNow()),-3)
otherwise true))
in
final
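Date.From parses header text using the current locale, so the same header can be read differently on different machines. Here is a minimal sketch that passes the culture explicitly via Date.FromText. The sample table, its header format (dd/mm/yyyy) and the "en-GB" culture are assumptions for illustration; adjust them to match the real headers.

```powerquery
let
    // Hypothetical sample: one text column plus month columns with en-GB style headers
    Source = #table(
        {"Client", "01/11/2012", "01/12/2012", "01/01/2013"},
        {{"A", 10, 20, 30}, {"B", 15, 25, 35}}),
    Cutoff = #date(2012, 12, 1),
    // Keep a column if its header is not a date, or is a date on or after the cutoff
    Keep = List.Select(
        Table.ColumnNames(Source),
        each try Date.FromText(_, "en-GB") >= Cutoff otherwise true),
    Result = Table.SelectColumns(Source, Keep)
in
    Result
```

Here "Client" is kept because it cannot be parsed as a date, "01/11/2012" (1 November 2012) is dropped, and the two later months are kept.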
I am trying to calculate the number of days for a particular year, based on a calendar table that I have created.
For example, I have 3 columns:
Event, number of days, and the date when the event started.
Event DaysLost
Injury 30 25/12/2016
Injury 588 06/08/2012
For the first case:
Days in 2016 - 6
Days in 2017 - 24
For the second case:
Days in 2012 - 146
Days in 2013 - 365
Days in 2014 - 77
In the first case, only 6 days need to be counted in 2016, and the rest of the days should automatically be counted in 2017. But I cannot figure out how to do it.
In my output I would like to put the years in one column and the days lost for each year next to that year.
I have a calendar table and I want the sum of days to populate for a particular year.
I tried calculating it by getting the end date (adding the number of days to the first start date) and then checking whether the days exceed the days remaining in that year. If so, the remaining days in that year are subtracted from the total and the rest moves to the next year. But I cannot figure out how to keep adding days to the following years when the days extend over many years, and how to list them afterwards.
Sept 4, 2017
Please see the excel solution below
Excel solution of the problem
0) Importing the data from your Excel screenshot into Power BI results in this.
1) Create a new column in that table using the following formula for end date (to help with future formulas).
EndDate = Injuries[First Start Date] + Injuries[Days]
You stated that you have a calendar table, so you can skip to step 3
2) Create a new table by clicking on Modeling -> New Table and entering the following formula. This gives a single column table with a list of years.
Years = GENERATESERIES(2000, 2020, 1)
3) Create another new table using the following formula. This gives a table with all of the fields from the initial data table crossjoined with the Year table that was just created. The formula also filters the resulting table to only return rows where the value in the Year column is between the year of the First Start Date and the year of the EndDate (First Start Date plus Days). To learn more about the CROSSJOIN function, check out the documentation here.
InjuriesByYear = FILTER(
CROSSJOIN(Years, Injuries),
Years[Year] >= Injuries[First Start Date].[Year] &&
Years[Year] <= Injuries[EndDate].[Year]
)
4) Create relationships from the InjuriesByYear table back to the initial data table and the Year table. This will help facilitate nicer reporting efforts.
5) In the InjuriesByYear table, create a new column by clicking on Modeling -> New Column and entering the following formula. The first IF checks if all of the days lost are in a single year. The second IF handles when the days are spread across multiple years, with the True clause handling the first year, and the False clause handling all other years.
DayPerYear = IF(
InjuriesByYear[Year] = InjuriesByYear[First Start Date].[Year] && InjuriesByYear[Year] = InjuriesByYear[EndDate].[Year], InjuriesByYear[Days],
IF(
InjuriesByYear[Year] = InjuriesByYear[First Start Date].[Year], DATEDIFF(InjuriesByYear[First Start Date], DATE(InjuriesByYear[First Start Date].[Year], 12, 31), DAY),
DATEDIFF(DATE(InjuriesByYear[Year], 1, 1), MIN(InjuriesByYear[EndDate], DATE(InjuriesByYear[Year], 12, 31)), DAY) + 1
)
)
6) To test it all out, create a pivot table as configured below. Following these steps, the pivot table should match your Excel solution.
This is a Power Query based approach...
I started with this:
Then I added a custom column by clicking the Add Column tab and Custom Column button and completing the pop-up window like this:
...and clicking OK.
Then I changed the type for that new column by selecting it and then clicking the Transform tab and then Data Type and Date.
Then I added another custom column, completing the pop-up like this:
Then I added another custom column, completing the pop-up like this:
Then I added yet another custom column, completing the pop-up like this:
Then I expanded that last column I added by clicking on the expand button at the top of the column and choosing Expand to New Rows.
Then I added a final custom column, completing the pop-up like this:
Finally, I grouped by the Event, DaysLost, Started, and Year columns and summed the DaysLostForYear column by clicking the Transform tab and Group By button and completing the pop-up like this:
I end up with this:
You might want a different grouping, but this should get you close. It shows how many days were lost in the years associated with each instance of an injury's total days lost. For instance, the first injury, which was 30 days in duration, started on 12/25/2016: 7 of those days occurred in 2016 and 23 in 2017. The second injury was 588 days, started on 8/6/2012: 148 days were in 2012, 365 in 2013, and 75 in 2014.
Note that I count the started date as a lost day.
Note also that I account for leap years.
I hope this helps.
Here's the query code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Event", type text}, {"DaysLost", Int64.Type}, {"Started", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Ended", each Date.AddDays([Started],[DaysLost]-1)),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Ended", type date}}),
#"Added Custom3" = Table.AddColumn(#"Changed Type1", "DaysYearStarted", each Number.From(Date.From(Text.From(Date.Year([Started]))&"/12/31")-[Started])+1),
#"Added Custom4" = Table.AddColumn(#"Added Custom3", "DaysYearEnded", each Number.From([Ended]-Date.From(Text.From(Date.Year([Ended])-1)&"/12/31"))),
#"Added Custom5" = Table.AddColumn(#"Added Custom4", "Year", each List.Numbers(Date.Year([Started]),Date.Year([Ended])-Date.Year([Started])+1)),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom5", "Year"),
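#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "DaysLostForYear", each
//an event that starts and ends in the same year loses all its days in that year
if Date.Year([Started])=Date.Year([Ended]) then [DaysLost] else
if [Year]=Date.Year([Started]) then [DaysYearStarted] else
if [Year]=Date.Year([Ended]) then [DaysYearEnded] else
//Date.IsLeapYear expects a date value, so build one from the year number
if Date.IsLeapYear(#date([Year],1,1)) then 366 else 365),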
#"Grouped Rows" = Table.Group(#"Added Custom1", {"Event", "DaysLost", "Started", "Year"}, {{"DaysLostForYear", each List.Sum([DaysLostForYear]), type number}})
in
#"Grouped Rows"
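As a cross-check of the figures quoted in this answer, here is a sketch that uses a different technique from the query above: it enumerates every lost day with List.Dates and counts the rows per year. It is simpler to reason about, but it creates one row per lost day, so it may be slow on large tables. The inline sample rows come from the question; the step names are assumptions.

```powerquery
let
    // Sample rows from the question; column names follow this answer's query
    Source = #table(
        type table [Event = text, DaysLost = Int64.Type, Started = date],
        {{"Injury", 30, #date(2016, 12, 25)}, {"Injury", 588, #date(2012, 8, 6)}}),
    // One list entry per lost day, counting the start date as the first lost day
    AddDays = Table.AddColumn(Source, "Day",
        each List.Dates([Started], [DaysLost], #duration(1, 0, 0, 0))),
    Expanded = Table.ExpandListColumn(AddDays, "Day"),
    AddYear = Table.AddColumn(Expanded, "Year", each Date.Year([Day]), Int64.Type),
    // Count the enumerated days per event and year; leap years need no special handling
    Grouped = Table.Group(AddYear, {"Event", "DaysLost", "Started", "Year"},
        {{"DaysLostForYear", each Table.RowCount(_), Int64.Type}})
in
    Grouped
```

For the two sample rows this gives 7 and 23 days for 2016 and 2017, and 148, 365 and 75 days for 2012, 2013 and 2014, matching the numbers stated above.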
I have a table which looks like
Date
9/4/2016
9/11/2016
9/18/2016
9/25/2016
10/2/2016
10/9/2016
10/16/2016
10/23/2016
10/30/2016
11/6/2016
11/13/2016
11/20/2016
11/20/2016
I'm trying to assign unique index values to the Date column, but I couldn't do it with 'Add custom index value' in Power Query, which doesn't check for duplicates. I also tried "Date.WeekOfYear", which gives a number based on the year, but I want to assign unique numbers from 1 to .... to the dates, like this:
Date Custom_weeknumber
9/4/2016 1
9/11/2016 2
9/18/2016 3
9/25/2016 4
10/2/2016 5
10/9/2016 6
10/16/2016 7
10/23/2016 8
10/30/2016 9
11/6/2016 10
11/13/2016 11
11/20/2016 12
11/20/2016 12
Any help would be helpful, thanks!
Assuming:
Your dates are sorted.
The row after duplicates will get the Custom_weeknumber from the duplicates + 1.
Then you can group by dates (with New column name e.g. "DateGroups" and Operation "All Rows"), add an index column, expand the "DateGroups" field and remove the "DateGroups" field.
Code example created in Power Query in Excel:
let
Source = Excel.CurrentWorkbook(){[Name="Dates"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Date", type date}}),
Grouped = Table.Group(Typed, {"Date"}, {{"DateGroups", each _, type table}}),
Numbered = Table.AddIndexColumn(Grouped, "Custom_weeknumber", 1, 1),
Expanded = Table.ExpandTableColumn(Numbered, "DateGroups", {"Date"}, {"DateGroups"}),
Removed = Table.RemoveColumns(Expanded,{"DateGroups"})
in
Removed
I'd do it this way (which seems a bit simpler to me; I don't like nested tables unless absolutely needed):
Group By Date column
Optional: Sort
Add index
Join source table
Code:
let
//Source = Excel.CurrentWorkbook(){[Name="YourTableName"]}[Content],
Source = #table(type table [Date = date], {{#date(2016, 10, 12)}, {#date(2016, 10, 13)}, {#date(2016,10,14)}, {#date(2016, 10, 14)}}),
GroupBy = Table.RemoveColumns(Table.Group(Source, "Date", {"tmp", each null, type any}), {"tmp"}),
//Optional: sort to ensure values are ordered
Sort = Table.Sort(GroupBy,{{"Date", Order.Ascending}}),
Index = Table.AddIndexColumn(Sort, "Custom_weeknumber", 1, 1),
JoinTables = Table.Join(Source, {"Date"}, Index, {"Date"}, JoinKind.Inner)
in
JoinTables