date difference in one column based on second column value Power Query - excel

I need to find the number of days between the maximum and minimum date, counting only rows where there is a value in Tonnage. For example, for the name K701 I need the number of days from 12/09/2022 to 30/09/2022, with the condition that tonnage is more than 0.

In Power Query: filter on tonnage, group on name, and subtract the min date from the max date.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Filtered Rows" = Table.SelectRows(Source, each ([Tonage] <> 0)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Name"}, {{"days", each Number.From(List.Max(_[Shift.2]))-Number.From(List.Min(_[Shift.2]) ) }})
in #"Grouped Rows"
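For anyone who wants to sanity-check the numbers outside Excel, here is a minimal plain-Python sketch of the same three steps (filter, group, max minus min). It is not part of the answer above; the sample rows and the function name `days_between_first_and_last` are made up for illustration, with only the K701 dates taken from the question.

```python
from datetime import date

def days_between_first_and_last(rows):
    """rows: iterable of (name, day, tonnage).
    Returns {name: days between earliest and latest day with non-zero tonnage}."""
    spans = {}
    for name, day, tonnage in rows:
        if tonnage == 0:  # same filter as the query: keep rows where tonnage <> 0
            continue
        lo, hi = spans.get(name, (day, day))
        spans[name] = (min(lo, day), max(hi, day))
    return {name: (hi - lo).days for name, (lo, hi) in spans.items()}

# Hypothetical sample data; dates read as dd/mm/yyyy in the question
rows = [
    ("K701", date(2022, 9, 10), 0),
    ("K701", date(2022, 9, 12), 40),
    ("K701", date(2022, 9, 20), 0),
    ("K701", date(2022, 9, 30), 55),
    ("K702", date(2022, 9, 15), 10),
]
result = days_between_first_and_last(rows)
print(result)  # {'K701': 18, 'K702': 0}
```

Note that, like the query, this returns the difference between the two dates (18), not the inclusive day count (19).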

Related

The Date Where There Is Enough Supply To Satisfy Dem

I am trying to come up with a solution to the following problem.
Problem:
In my dataset I have quantities of an item in demand (Need) and purchase orders that re-supply that item (Supply). For each demand, I need to determine the first date on which we will have enough supply to fill it.
For example, our 1st demand requires 5 units; according to the cumulative Sum column, 18/12/23 is the first date on which enough quantity has been supplied to satisfy it. The problem appears when we have more than one demand for an item.
Staying with the same item, what I would like to do is update the cumulative Sum once a demand is met (cumulative Sum = cumulative Sum - qty(demand), i.e. 6 (cumulative supply) - 5 (demand) = 1), so that the cumulative Sum for the next demand is 100 + 1 = 101 and not 100 + 6 = 106. That way we can simply rely on the updated cumulative Sum to retrieve the first date on which we have enough supply to fill each demand.
I'm not sure if something like this is possible in Power Query, but any help is greatly appreciated.
Hopefully that all makes sense. Thx.
Revised
In powerquery try this as code for Demand
let Source = Excel.CurrentWorkbook(){[Name="DemandDataRange"]}[Content],
#"SupplyGrouped Rows" = Table.Group(Supply, {"item"}, {{"data", each
let a = Table.AddIndexColumn( _ , "Index", 0, 1),
b=Table.AddColumn(a, "CumTotal", each List.Sum(List.FirstN(a[Qty],[Index]+1)))
in b, type table }}),
#"SupplyExpanded data" = Table.ExpandTableColumn(#"SupplyGrouped Rows", "data", { "Supply date", "CumTotal"}, {"Supply date", "CumTotal"}),
#"Grouped Rows" = Table.Group(Source, {"item"}, {{"data", each
let a= Table.AddIndexColumn(_, "Index", 0, 1),
b=Table.AddColumn(a, "CumTotal", each List.Sum(List.FirstN(a[Qty],[Index]+1)))
in b, type table }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Qty", "Date", "Index", "CumTotal"}, {"Qty", "Date", "Index", "CumTotal"}),
x=Table.AddColumn(#"Expanded data","MaxDate",(i)=>try Table.SelectRows( #"SupplyExpanded data", each [item]=i[item] and [CumTotal]>=i[CumTotal] )[Supply date]{0} otherwise null, type date ),
#"Removed Columns" = Table.RemoveColumns(x,{"Index", "CumTotal"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Date", type date}})
in #"Changed Type"
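As I read the query above, it compares cumulative demand against cumulative supply per item and picks the first supply date where supply has caught up. A rough plain-Python sketch of that comparison for a single item is below. The function name and the sample quantities are hypothetical; only the "5 units, covered on 18/12/23, next demand 100" part mirrors the question.

```python
from datetime import date
from itertools import accumulate

def first_covering_dates(demand_qtys, supply):
    """demand_qtys: demand quantities for one item, in order.
    supply: list of (supply_date, qty) for the same item, in date order.
    Returns, per demand, the first supply date at which cumulative supply
    reaches cumulative demand, or None if supply never catches up."""
    cum_supply = list(accumulate(qty for _, qty in supply))
    result = []
    for cum_demand in accumulate(demand_qtys):
        hit = next(
            (day for (day, _), total in zip(supply, cum_supply) if total >= cum_demand),
            None,
        )
        result.append(hit)
    return result

# Hypothetical supply for one item: cumulative totals are 2, 6, 56, 116
supply = [
    (date(2023, 12, 11), 2),
    (date(2023, 12, 18), 4),
    (date(2024, 1, 8), 50),
    (date(2024, 1, 15), 60),
]
dates = first_covering_dates([5, 100, 20], supply)
print(dates)
```

Comparing cumulative totals gives the same dates as "subtract the filled demand and carry the remainder forward", which is why no updated running sum is needed.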
Given my understanding of what you want for results, the following Power Query M code should return that.
If you just want to compare the total supply vs total demand, then only check the final entries instead of the first non-negative.
Read the code comments, statement names and explore the Applied Steps to understand the algorithm.
let
//Read in the data tables
//could have them in separate queries
Source = Excel.CurrentWorkbook(){[Name="Demand"]}[Content],
Demand = Table.TransformColumnTypes(Source,{{"item", type text}, {"Qty", Int64.Type}, {"Date", type date}}),
//make demand values negative
#"Transform Demand" = Table.TransformColumns(Demand,{"Qty", each _ * -1}),
Source2 = Excel.CurrentWorkbook(){[Name="Supply"]}[Content],
Supply = Table.TransformColumnTypes(Source2,{{"item", type text},{"Qty", Int64.Type},{"Supply date", type date}}),
#"Rename Supply Date Column" = Table.RenameColumns(Supply,{"Supply date","Date"}),
//Merge the tables and sort by Item and Date
Merge = Table.Combine({#"Rename Supply Date Column", #"Transform Demand"}),
#"Sorted Rows" = Table.Sort(Merge,{{"item", Order.Ascending}, {"Date", Order.Ascending}}),
//Group by Item
//Grouped running total to find first positive value
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"item"}, {
{"First Date", (t)=> let
#"Running Total" = List.RemoveFirstN(List.Generate(
()=>[rt=t[Qty]{0}, idx=0],
each [idx]<Table.RowCount(t),
each [rt=[rt]+t[Qty]{[idx]+1}, idx=[idx]+1],
each [rt]),1),
#"First non-negative" = List.PositionOfAny(#"Running Total", List.Select(#"Running Total", each _ >=0), Occurrence.First)
in t[Date]{#"First non-negative"+1}, type date}})
in
#"Grouped Rows"
(The original answer showed the Supply, Demand and Results tables here.)
I did this with Excel formulas rather than Power Query - there will be a Power Query equivalent, but I'm not very fluent in M yet.
You need a helper column - could do without it but everything's much more readable if you have it.
In sheet Supply (2), cell E2, enter the formula:
=SUMIFS(Supply!B:B; Supply!C:C;"<=" & C2;Supply!A:A;A2)-SUMIFS(Dem!B:B;Dem!C:C;"<=" & C2;Dem!A:A;A2)
and copy downwards. This can be described as Total supply up to that date minus total demand up to that date. In some cases this will be negative (where there's more demand than supply).
Now you need to find the date of the first non-negative value for that.
First create a unique list of the items - I put it on the same sheet in the range G2:G6. Then in H2, the formula:
=MINIFS(C:C;A:A;G2;E:E;">=" & 0)
and copy downwards.
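To make the helper-column idea concrete, here is a small plain-Python sketch of the same two steps for one item: the balance (supply to date minus demand to date) at each supply date, then the earliest date where that balance is non-negative. The function name and the sample data are assumptions for illustration only.

```python
from datetime import date

def first_non_negative_date(supply, demand):
    """supply, demand: lists of (day, qty) for a single item.
    Mirrors the helper column (SUMIFS minus SUMIFS) followed by MINIFS."""
    candidates = []
    for day, _ in supply:
        supplied = sum(q for d, q in supply if d <= day)  # total supply up to that date
        demanded = sum(q for d, q in demand if d <= day)  # total demand up to that date
        if supplied - demanded >= 0:
            candidates.append(day)
    return min(candidates) if candidates else None

# Hypothetical data: 5 units demanded, supplied in two deliveries
supply = [(date(2023, 12, 11), 2), (date(2023, 12, 18), 4)]
demand = [(date(2023, 12, 1), 5)]
first = first_non_negative_date(supply, demand)
print(first)  # 2023-12-18
```

As with the formula, a supply date that falls before any demand has a non-negative balance and will be picked, so check that this is what you want for items whose demand starts later.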

Aggregate multiple (many!) pair of columns (Exce)

I have a table that consists of pairs of date and value columns:
Pair Pair Pair Pair .... ..... ......
What I need is the sum of all values for the same date.
The full table has 3146 columns (so 1573 date/value pairs), with up to 186 rows of entries.
Thankfully, the first column contains all possible date values.
Given the 3146 columns, I am not sure how to do that without a massive number of small steps :(
This shows a different method of creating the two column table that you will group by Date and return the Sum. Might be faster than the List.Accumulate method. Certainly worth a try in view of your comment above.
Unpivot the original table
Add 0-based Index column; then IntegerDivide by 2
Group by the IntegerDivide column and extract the Date and Value to separate columns.
Then group by date and aggregate by sum
let
Source = Excel.CurrentWorkbook(){[Name="Table12"]}[Content],
//assuming only columns are Date and Value, this will set the data types for any number of columns
Types = List.Transform(List.Alternate(Table.ColumnNames(Source),1,1,1), each {_, type date}) &
List.Transform(List.Alternate(Table.ColumnNames(Source),1,1,0), each {_, type number}),
#"Changed Type" = Table.TransformColumnTypes(Source,Types),
//Unpivot all columns to create a two column table
//The Value.1 column will alternate the related Date/Value
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {}, "Attribute", "Value.1"),
//add a column to group the pairs of values
//below two lines => a column in sequence of 0,0,1,1,2,2,3,3, ...
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1, Int64.Type),
#"Inserted Integer-Division" = Table.AddColumn(#"Added Index", "Integer-Division", each Number.IntegerDivide([Index], 2), Int64.Type),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Integer-Division",{"Index"}),
// Group by the "pairing" sequence,
// Extract the Date and Value to new columns
// => a 2 column table
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Integer-Division"}, {
{"Date", each [Value.1]{0}, type date},
{"Value", each [Value.1]{1}, type number}}),
#"Removed Columns1" = Table.RemoveColumns(#"Grouped Rows",{"Integer-Division"}),
//Group by Date and aggregate by Sum
#"Grouped Rows1" = Table.Group(#"Removed Columns1", {"Date"}, {{"Sum Values", each List.Sum([Value]), type number}}),
//Sort into date order
#"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"Date", Order.Ascending}})
in
#"Sorted Rows"
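If the index / integer-divide pairing is hard to picture, this plain-Python sketch does the same thing on a tiny made-up grid: flatten row by row (dropping empty cells, as unpivot does), pair cells 0-1, 2-3, 4-5 and so on, then sum by date. The function name and sample rows are hypothetical, and dates are plain ISO strings to keep it short.

```python
from collections import defaultdict

def sum_by_date(rows):
    """rows: list of rows laid out as [date1, value1, date2, value2, ...];
    None marks an empty cell. Assumes a date and its value are either
    both present or both empty, otherwise the pairing shifts."""
    # "Unpivot": flatten row by row, dropping empty cells
    flat = [cell for row in rows for cell in row if cell is not None]
    totals = defaultdict(int)
    # stepping by 2 is the same grouping as index // 2 in the query
    for i in range(0, len(flat) - 1, 2):
        day, value = flat[i], flat[i + 1]
        totals[day] += value
    return dict(sorted(totals.items()))

rows = [
    ["2021-01-01", 10, "2021-01-01", 5],
    ["2021-01-02", 1, None, None],
    ["2021-01-03", 2, "2021-01-02", 4],
]
totals = sum_by_date(rows)
print(totals)  # {'2021-01-01': 15, '2021-01-02': 5, '2021-01-03': 2}
```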
Quick google shows "Number of columns per table 16,384" for powerquery and 16000 for powerBI, so I'm thinking you have to split your input data somehow first, or perhaps this is not the tool for you, maybe AWK
Assuming that works, an M version of what you are looking for. It stacks the columns in groups of 2, then groups and sums them
let Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
Combo = List.Split(Table.ColumnNames(Source),2),
#"Added Custom" =List.Accumulate(
Combo,
#table({"Column1"}, {}),
(state,current)=> state & Table.Skip(Table.DemoteHeaders(Table.SelectColumns(Source, current)),1)
),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Column1"}, {{"Sum", each List.Sum([Column2]), type number}})
in #"Grouped Rows"
186 rows * 1573 pairs of columns = 292,578 records.
Assuming not a very old version of Excel, 293k rows is fine, so it can be done with formulae:
Insert five columns to the left, so data starts in F3.
In A3 put zero, in A4 put 1, select the two and drag down to A188.
In A189 put =A3.
In B3 put 0, and drag down to B188.
In B189 put =B3+2, so that the column offset moves on to the next date/value pair for each new block of 186 rows.
"Drag"* down A189 and B189 to row 292580
In C3 put =OFFSET($F$3,A3,B3)
In D3 put =OFFSET($F$3,A3,B3+1)
Select those two cells and click on the cross at bottom right to copy them to the end of column B.
Then put Date and Value in A1 and B1, and use a Pivot Table to get totals, averages, or whatever you need.
Any blank cells in the original input do not matter.
* To "drag" down hundreds of thousands of cells:
Copy A189 and B189
Goto (F5) A292580
Paste
Pin (F8)
CTRL-up arrow
Enter
And rather than $F$3 I would name that cell Origin, and use "Origin" in the two Offset formulae, but many people seem to consider that too complicated.
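The helper columns are just index arithmetic, which may be easier to see in code. The sketch below assumes the row offset cycles through 0..185 and the column offset is meant to advance by 2 for every block of 186 rows (so each block reads the next date/value pair); the function names and the small grid are made up for illustration.

```python
def stacked_offsets(n_rows, n_pairs):
    """Yield (row_offset, col_offset) in the order helper columns A and B list them."""
    for k in range(n_rows * n_pairs):
        # row offset cycles 0..n_rows-1; column offset jumps by 2 per block
        yield k % n_rows, 2 * (k // n_rows)

def stack_pairs(grid):
    """grid: rows laid out as [date1, value1, date2, value2, ...].
    Returns one long list of (date, value), like columns C and D."""
    n_rows, n_pairs = len(grid), len(grid[0]) // 2
    return [(grid[r][c], grid[r][c + 1]) for r, c in stacked_offsets(n_rows, n_pairs)]

grid = [
    ["d1", 1, "d3", 3],
    ["d2", 2, "d4", 4],
]
stacked = stack_pairs(grid)
print(stacked)  # [('d1', 1), ('d2', 2), ('d3', 3), ('d4', 4)]
print(sum(1 for _ in stacked_offsets(186, 1573)))  # 292578
```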

I have 3 time periods in excel - I need to know the duration of the longest continuous period

Please help!
Ideally, I would really like to solve this using formulas only - not VBA or anything I consider 'fancy'.
I work for a program that awards bonuses for continuous engagement. We have three (sometimes more) engagement time periods that could overlap and/or could have spaces of no engagement. The magic figure is 84 days of continuous engagement. We have been manually reviewing each line (hundreds of lines) to see if the time periods add up to 84 days of continuous engagement, with no periods of inactivity.
In the link there is a pic of a summary of what we work with. Row 3 for example, doesn't have 84 days in any of the 3 time periods, but the first 2 time periods combined includes 120 consecutive days. The dates will not appear in date order - e.g. early engagements may be listed in period 3.
Really looking forward to your advice.
Annie
@TomSharpe has shown you a method of solving this with formulas. You would have to modify it if you had more than three time periods.
Not sure if you would consider a Power Query solution to be "too fancy", but it does allow for an unlimited number of time periods, laid out as you show in the sample.
With PQ, we
construct lists of all the consecutive dates for each pair of start/end
combine the lists for each row, removing the duplicates
apply a gap and island technique to the resulting date lists for each row
count the number of entries for each "island" and return the maximum
Please note: I counted both the start and the end date. In your days columns, you did not (except for one instance). If you want to count both, leave the code as is; if you don't, we can make a minor modification.
To use Power Query
Create a table which excludes that first row of merged cells
Rename the table columns in the format I show in the screenshot, since each column header in a table must have a different name.
Select some cell in that Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to better understand the algorithm
M Code
code edited to Sort the date lists to handle certain cases
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Start P1", type datetime}, {"Comment1", type text}, {"End P1", type datetime}, {"Days 1", Int64.Type}, {"Start P2", type datetime}, {"Comment2", type text}, {"End P2", type datetime}, {"Days 2", Int64.Type}, {"Start P3", type datetime}, {"Comment3", type text}, {"End P3", type datetime}, {"Days 3", Int64.Type}}),
//set data types for columns 1/5/9... and 3/7/11/... as date
dtTypes = List.Transform(List.Alternate(Table.ColumnNames(#"Changed Type"),1,1,1), each {_,Date.Type}),
typed = Table.TransformColumnTypes(#"Changed Type",dtTypes),
//add Index column to define row numbers
rowNums = Table.AddIndexColumn(typed,"rowNum",0,1),
//Unpivot except for rowNum column
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(rowNums, {"rowNum"}, "Attribute", "Value"),
//split the attribute column to filter on Start/End => just the dates
//then filter and remove the attributes columns
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Attribute.1", "Attribute.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Attribute.1", type text}, {"Attribute.2", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Attribute.2"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Attribute.1] = "End" or [Attribute.1] = "Start")),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Attribute.1"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Removed Columns1",{{"Value", type date}, {"rowNum", Int64.Type}}),
//group by row number
//generate date list from each pair of dates
//combine into a single list of dates with no overlapped date ranges for each row
#"Grouped Rows" = Table.Group(#"Changed Type2", {"rowNum"}, {
{"dateList", (t)=> List.Sort(
List.Distinct(
List.Combine(
List.Generate(
()=>[dtList=List.Dates(
t[Value]{0},
Duration.TotalDays(t[Value]{1}-t[Value]{0})+1 ,
#duration(1,0,0,0)),idx=0],
each [idx] < Table.RowCount(t),
each [dtList=List.Dates(
t[Value]{[idx]+2},
Duration.TotalDays(t[Value]{[idx]+3}-t[Value]{[idx]+2})+1,
#duration(1,0,0,0)),
idx=[idx]+2],
each [dtList]))))}
}),
//determine Islands and Gaps
#"Expanded dateList" = Table.ExpandListColumn(#"Grouped Rows", "dateList"),
//Duplicate the date column and turn it into integers
#"Duplicated Column" = Table.DuplicateColumn(#"Expanded dateList", "dateList", "dateList - Copy"),
#"Changed Type3" = Table.TransformColumnTypes(#"Duplicated Column",{{"dateList - Copy", Int64.Type}}),
//add an Index column
//Then subtract the index from the integer date
// if the dates are consecutive the resultant ID column will => the same value, else it will jump
#"Added Index" = Table.AddIndexColumn(#"Changed Type3", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "ID", each [#"dateList - Copy"]-[Index]),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom",{"dateList - Copy", "Index"}),
//Group by the date ID column and a Count will => the consecutive days
#"Grouped Rows1" = Table.Group(#"Removed Columns2", {"rowNum", "ID"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
#"Removed Columns3" = Table.RemoveColumns(#"Grouped Rows1",{"ID"}),
//Group by the Row number and return the Maximum Consecutive days
#"Grouped Rows2" = Table.Group(#"Removed Columns3", {"rowNum"}, {{"Max Consecutive Days", each List.Max([Count]), type number}}),
//combine the Consecutive Days column with original table
result = Table.Join(rowNums,"rowNum",#"Grouped Rows2","rowNum"),
#"Removed Columns4" = Table.RemoveColumns(result,{"rowNum"})
in
#"Removed Columns4"
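For readers who have not met the gap and island technique before, here is a minimal plain-Python sketch of the same idea for a single row: expand each period to its dates, de-duplicate and sort, then subtract each date's position from its day number. Inside a run of consecutive days that difference stays constant, so grouping on it gives the islands. The function name and sample periods are assumptions, not taken from the question's data.

```python
from datetime import date, timedelta
from itertools import groupby

def longest_continuous_days(periods):
    """periods: list of (start, end) dates, in any order, possibly overlapping.
    Counts both the start and the end date, as the query does."""
    days = set()
    for start, end in periods:
        for n in range((end - start).days + 1):
            days.add(start + timedelta(days=n))
    ordered = sorted(days)
    # day number minus position is constant within a run of consecutive days
    keyed = [(d.toordinal() - i, d) for i, d in enumerate(ordered)]
    runs = (len(list(group)) for _, group in groupby(keyed, key=lambda p: p[0]))
    return max(runs, default=0)

# Periods listed out of date order; together they cover 1 Jan to 31 Mar 2021
overlapping = [
    (date(2021, 3, 1), date(2021, 3, 31)),
    (date(2021, 1, 1), date(2021, 2, 10)),
    (date(2021, 2, 5), date(2021, 2, 28)),
]
with_gap = [
    (date(2021, 1, 1), date(2021, 1, 10)),
    (date(2021, 1, 12), date(2021, 1, 31)),
]
print(longest_continuous_days(overlapping))  # 90
print(longest_continuous_days(with_gap))     # 20
```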
Unfortunately Gap and Island seems to be a non-starter, because I don't think you can use it without either VBA or a lot of helper columns, plus the start dates need to be in order. It's a pity, because the longest continuous time on task (AKA largest island) drops out of the VBA version very easily, and arguably it's easier to understand than the array formula versions below (see this).
Moving on to option 2, if you have Excel 365, you can use SEQUENCE to generate a list of dates in a certain range, then check that each of them falls in one of the periods of engagement like this:
=LET(array,SEQUENCE(Z$2-Z$1+1,1,Z$1),
period1,(array>=A3)*(array<=C3),
period2,(array>=E3)*(array<=G3),
period3,(array>=I3)*(array<=K3),
SUM(--(period1+period2+period3>0)))
assuming that Z1 and Z2 contain the start and end of the range of dates that you're interested in (I've used 1/1/21 and 31/7/21).
If you don't have Excel 365, you can use the ROW function to generate the list of dates instead. I suggest using the Name Manager to create a named range Dates:
=INDEX(Sheet1!$A:$A,Sheet1!$Z$1):INDEX(Sheet1!$A:$A,Sheet1!$Z$2)
Then the formula is:
= SUM(--(((ROW(Dates)>=A3) * (ROW(Dates)<=C3) +( ROW(Dates)>=E3) * (ROW(Dates)<=G3) + (ROW(Dates)>=I3) * (ROW(Dates)<=K3))>0))
You will probably have to enter this using Ctrl+Shift+Enter, or use SUMPRODUCT instead of SUM.
EDIT
As @Qualia has perceptively noted, you want the longest time of continuous engagement. This can be found by applying FREQUENCY to the first formula:
=LET(array,SEQUENCE(Z$2-Z$1+1,1,Z$1),
period1,(array>=A3)*(array<=C3),
period2,(array>=E3)*(array<=G3),
period3,(array>=I3)*(array<=K3),
onDays,period1+period2+period3>0,
MAX(FREQUENCY(IF(onDays,array),IF(NOT(onDays),array)))
)
and the non-365 version becomes
=MAX(FREQUENCY(IF((ROW(Dates)>=A3)*(ROW(Dates)<=C3)+(ROW(Dates)>=E3)*(ROW(Dates)<=G3)+(ROW(Dates)>=I3)*(ROW(Dates)<=K3),ROW(Dates)),
IF( NOT( (ROW(Dates)>=A3)*(ROW(Dates)<=C3)+(ROW(Dates)>=E3)*(ROW(Dates)<=G3)+(ROW(Dates)>=I3)*(ROW(Dates)<=K3) ),ROW(Dates))))
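The FREQUENCY trick is easier to follow once you see what it counts: the "on" days are the data, the "off" days are the bin boundaries, so each bin holds the number of engaged days between two days of no engagement. Below is a rough plain-Python sketch of that idea using integer day serials. The `frequency` helper is my own approximation of how Excel's FREQUENCY buckets values, and the sample periods are invented.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Count values the way Excel's FREQUENCY buckets them:
    <= bins[0], then (bins[i-1], bins[i]], then > bins[-1]."""
    bins = sorted(bins)
    counts = [0] * (len(bins) + 1)
    for x in data:
        counts[bisect_left(bins, x)] += 1
    return counts

def longest_run(first_day, last_day, periods):
    """periods: list of (start, end) day serials, inclusive."""
    all_days = range(first_day, last_day + 1)
    on = [d for d in all_days if any(s <= d <= e for s, e in periods)]
    on_set = set(on)
    off = [d for d in all_days if d not in on_set]
    # each bin = engaged days sitting between two consecutive "off" days
    return max(frequency(on, off))

# Hypothetical serials: days 2-9 are covered by two overlapping periods, 12-14 by a third
longest = longest_run(1, 20, [(2, 5), (4, 9), (12, 14)])
print(longest)  # 8
```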

Excel calculate value for new category based on other group categories

I am getting data from a database that is provided in long format and I need to get ratios from values that are given different categories. E.g. I want the average price based on revenues and quantity sold.
Is there an easy way to calculate this in a pivot once I have the data?
My MWE would look like this
And I would like to calculate the new rows with the category price.
One way would probably be to do this in MS SQL beforehand, but I am not that skilled with that and I need my colleagues to be able to do this in Excel themselves.
In Power Query, you can
Group the Rows by Year
From the resultant tables, divide the 1st Value by the 2nd.
Paste the code below into the Advanced Editor; and change the table name in Line 2 to reflect the actual table name of your data. Then you can explore the "Applied Steps" in the UI to see how the code was generated.
Changing the data table will change the Query results, but you will need to "Refresh" the query. This can be done from the Ribbon; or you can create a Button on the worksheet.
M-Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows" = Table.Group(Source, {"Year"}, {{"Grouped", each _, type table [Year=number, Category=text, Value=number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Price",
each Table.Column([Grouped],"Value"){0} /
Table.Column([Grouped],"Value"){1})
in
#"Added Custom"
Edit: From your comments, it seems you might have more than just Revenue/Quantity pairs of categories for each year. And I suppose it is possible you might have more than a single Revenue/Quantity pair.
Below is code that will take that into account; breaking the Quantity and Revenue from each year into two columns, then dividing one by the other which would result in a weighted average price for each year:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//needed only if you have blank rows in the table
#"Filtered Rows" = Table.SelectRows(Source, each ([Year] <> null)),
//Group by Year
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Year"}, {{"Grouped", each _, type table [Year=number, Category=text, Value=number]}}),
//Extract Revenue and Quantity into two new columns of Lists
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Revenue", each Table.Column(Table.SelectRows([Grouped], each ([Category] = "Revenue")),"Value")),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Quantity", each Table.Column(Table.SelectRows([Grouped], each ([Category] = "Quantity")),"Value")),
//Sum the values in each List of Revenue and divide by the sum of the List of Quantity
//This will result in a weighted average if there is more than one Revenue/Quantity pair in a year
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Price", each List.Sum([Revenue]) / List.Sum([Quantity])),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Grouped", "Revenue", "Quantity"}),
//Some cleanup
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Year", Int64.Type}, {"Price", Currency.Type}})
in
#"Changed Type"
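To check the weighted-average logic by hand, here is a small plain-Python sketch of the same calculation on long-format rows: total Revenue per year divided by total Quantity per year, ignoring any other categories. The function name and the sample figures are made up; the category names Revenue and Quantity are the ones used in the query above.

```python
from collections import defaultdict

def price_per_year(rows):
    """rows: (year, category, value) tuples in long format.
    Returns {year: sum of Revenue / sum of Quantity}."""
    sums = defaultdict(lambda: defaultdict(float))
    for year, category, value in rows:
        sums[year][category] += value
    return {
        year: s["Revenue"] / s["Quantity"]
        for year, s in sorted(sums.items())
        if s["Quantity"]  # skip years with no quantity to avoid dividing by zero
    }

# Hypothetical data: 2019 has two Revenue/Quantity pairs, 2020 has an extra category
rows = [
    (2019, "Revenue", 1000), (2019, "Quantity", 50),
    (2019, "Revenue", 500), (2019, "Quantity", 25),
    (2020, "Revenue", 900), (2020, "Quantity", 30),
    (2020, "Cost", 400),
]
prices = price_per_year(rows)
print(prices)  # {2019: 20.0, 2020: 30.0}
```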

Charting average sales per weekday on data composed of hours

I'm using PowerBI desktop and I'm creating a chart to display average sales per weekday:
My data is in the format below:
(sampled in Excel to remove sensitive information, added colors to facilitate visualization)
My problem is: since each day is broken in 24 rows (hours), my average is wrong by a factor of 24.
For example, if I select January-2019 in the slicer, which has five Tuesdays (weekday code: 2), I want to see on the bar number 2:
(sum of amount where weekday = 2) / 5
Instead, I'm calculating:
(sum of amount where weekday = 2) / (24 * 5)
I can think of some ways to get this right, but they involve custom columns or auxiliary tables. I'm sure there is a simpler answer using DAX and measures, but I'm still learning it.
How can I correctly calculate this?
Let's assume your table name is "Data". Create 3 DAX measures (not calculated columns):
Measure 1:
Total Amount = SUM(Data[Amount])
Measure 2:
Number of Days = DISTINCTCOUNT(Data[Date])
Measure 3:
Average Amount per Day = DIVIDE( [Total Amount], [Number of Days])
Drop the last measure into a chart, it should give you the expected result.
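To see why dividing by the distinct date count fixes the factor of 24, here is a plain-Python sketch of the three measures applied to rows already filtered to one weekday (in the chart, the weekday on the axis does that filtering). The function name and the hourly amounts are hypothetical.

```python
from datetime import date

def average_amount_per_day(rows):
    """rows: (day, hour, amount) tuples for the weekday being charted."""
    total_amount = sum(amount for _, _, amount in rows)   # Total Amount
    number_of_days = len({day for day, _, _ in rows})     # Number of Days (distinct dates)
    # DIVIDE returns blank when the denominator is zero; None stands in for that here
    return total_amount / number_of_days if number_of_days else None

# Two Tuesdays in January 2019, 24 hourly rows each, 10 per hour (made-up amounts)
tuesdays = [date(2019, 1, 1), date(2019, 1, 8)]
rows = [(day, hour, 10) for day in tuesdays for hour in range(24)]

average = average_amount_per_day(rows)
naive = sum(amount for _, _, amount in rows) / len(rows)
print(average)  # 240.0
print(naive)    # 10.0, i.e. too small by a factor of 24
```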
As I understand from your Excel sample, you are working with 3 different columns. It is better to combine these into a single datetime and let Power BI handle it.
The M code below will do this for you:
let
Source = Excel.Workbook(File.Contents("C:\....\Test.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"date", type datetime}, {"hour", type time}, {"amount", type number}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Date", each [date]+ Duration.FromText(Time.ToText([hour]))),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"amount", "Date"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ([amount] <> 0))
in
#"Filtered Rows"
The trick is in the added column: #"Added Custom" = Table.AddColumn(#"Changed Type", "Date", each [date]+ Duration.FromText(Time.ToText([hour])))
Here I add the time to the date.
I also removed the empty (zero amount) rows, you do not need them.
I added the Date & weekday to the Axis so a user can now drill down from year, month, day to weekday.
Be aware you need to do the SUM of the amount, not the average.
