I am trying to come up with a solution to the following problem.
Problem:
In my dataset I have a certain quantity of each item in demand (Need), and purchase orders that resupply that item (Supply). For each demand, I need to determine the first date on which we will have enough supply to fill it.
For example, our first demand requires 5 units; according to the cumulative Sum column, 18/12/23 is the first date on which enough quantity will have been supplied to satisfy it. The problem appears when we have more than one demand for an item.
Staying with the same item, what I would like to do is update the cumulative Sum once a demand has been met (cumulative Sum = cumulative Sum - Qty(demand), i.e. 6 (cumulative supply) - 5 (demand) = 1), so that the cumulative Sum for the next demand will be 100 + 1 = 101 and not 100 + 6 = 106. That way we can simply rely on the updated cumulative Sum to retrieve the first date on which we will have enough supply to fill each demand.
I'm not sure if something like this is possible in Power Query, but any help is greatly appreciated.
Hopefully that all makes sense. Thanks.
Revised
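In Power Query, try this as the code for the Demand query (it refers to Supply, which is assumed to exist as a separate query holding the supply table):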
let Source = Excel.CurrentWorkbook(){[Name="DemandDataRange"]}[Content],
#"SupplyGrouped Rows" = Table.Group(Supply, {"item"}, {{"data", each
let a = Table.AddIndexColumn( _ , "Index", 0, 1),
b=Table.AddColumn(a, "CumTotal", each List.Sum(List.FirstN(a[Qty],[Index]+1)))
in b, type table }}),
#"SupplyExpanded data" = Table.ExpandTableColumn(#"SupplyGrouped Rows", "data", { "Supply date", "CumTotal"}, {"Supply date", "CumTotal"}),
#"Grouped Rows" = Table.Group(Source, {"item"}, {{"data", each
let a= Table.AddIndexColumn(_, "Index", 0, 1),
b=Table.AddColumn(a, "CumTotal", each List.Sum(List.FirstN(a[Qty],[Index]+1)))
in b, type table }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Qty", "Date", "Index", "CumTotal"}, {"Qty", "Date", "Index", "CumTotal"}),
x=Table.AddColumn(#"Expanded data","MaxDate",(i)=>try Table.SelectRows( #"SupplyExpanded data", each [item]=i[item] and [CumTotal]>=i[CumTotal] )[Supply date]{0} otherwise null, type date ),
#"Removed Columns" = Table.RemoveColumns(x,{"Index", "CumTotal"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Date", type date}})
in #"Changed Type"
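The comparison that the query above performs can also be sketched outside Power Query. The Python below is only an illustration under assumptions: the function name `first_fill_dates` and the sample rows are made up, quantities are assumed to be positive, and the rows for a single item are assumed to be in date order already. It matches each demand's cumulative quantity against the first supply date whose cumulative supply reaches it, which gives the same answer as subtracting each filled demand from the running supply total.

```python
from bisect import bisect_left
from datetime import date
from itertools import accumulate


def first_fill_dates(supply, demand):
    """Return, for each demand row, the first supply date that covers it.

    supply and demand are lists of (date, qty) for ONE item, in date order.
    A demand that can never be covered yields None.
    """
    supply_dates = [day for day, _ in supply]
    cum_supply = list(accumulate(qty for _, qty in supply))
    result = []
    for cum_need in accumulate(qty for _, qty in demand):
        # First position where cumulative supply >= cumulative demand.
        pos = bisect_left(cum_supply, cum_need)
        result.append(supply_dates[pos] if pos < len(cum_supply) else None)
    return result


# Hypothetical sample data (not taken from the question's workbook).
supply = [(date(2023, 12, 18), 6), (date(2024, 1, 10), 100)]
demand = [(date(2023, 12, 1), 5), (date(2023, 12, 5), 50), (date(2023, 12, 9), 60)]
print(first_fill_dates(supply, demand))
```

With these made-up rows the second demand needs a cumulative 55 units, which is first reached by the cumulative supply of 106; that is the same test as 101 >= 50 in the question's "updated cumulative Sum" wording.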
Given my understanding of what you want for results, the following Power Query M code should return that.
If you just want to compare the total supply vs total demand, then only check the final entries instead of the first non-negative.
Read the code comments, statement names and explore the Applied Steps to understand the algorithm.
let
//Read in the data tables
//could have them in separate queries
Source = Excel.CurrentWorkbook(){[Name="Demand"]}[Content],
Demand = Table.TransformColumnTypes(Source,{{"item", type text}, {"Qty", Int64.Type}, {"Date", type date}}),
//make demand values negative
#"Transform Demand" = Table.TransformColumns(Demand,{"Qty", each _ * -1}),
Source2 = Excel.CurrentWorkbook(){[Name="Supply"]}[Content],
Supply = Table.TransformColumnTypes(Source2,{{"item", type text},{"Qty", Int64.Type},{"Supply date", type date}}),
#"Rename Supply Date Column" = Table.RenameColumns(Supply,{"Supply date","Date"}),
//Merge the tables and sort by Item and Date
Merge = Table.Combine({#"Rename Supply Date Column", #"Transform Demand"}),
#"Sorted Rows" = Table.Sort(Merge,{{"item", Order.Ascending}, {"Date", Order.Ascending}}),
//Group by Item
//Grouped running total to find first positive value
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"item"}, {
{"First Date", (t)=> let
#"Running Total" = List.RemoveFirstN(List.Generate(
()=>[rt=t[Qty]{0}, idx=0],
each [idx]<Table.RowCount(t),
each [rt=[rt]+t[Qty]{[idx]+1}, idx=[idx]+1],
each [rt]),1),
#"First non-negative" = List.PositionOfAny(#"Running Total", List.Select(#"Running Total", each _ >=0), Occurrence.First)
in t[Date]{#"First non-negative"+1}, type date}})
in
#"Grouped Rows"
Supply
Demand
Results
I did this with Excel formulas rather than Power Query. There will be a Power Query equivalent, but I'm not very fluent in M (the Power Query language) yet.
You need a helper column - could do without it but everything's much more readable if you have it.
In sheet Supply (2), cell E2, enter the formula:
=SUMIFS(Supply!B:B; Supply!C:C;"<=" & C2;Supply!A:A;A2)-SUMIFS(Dem!B:B;Dem!C:C;"<=" & C2;Dem!A:A;A2)
and copy downwards. This can be described as Total supply up to that date minus total demand up to that date. In some cases this will be negative (where there's more demand than supply).
Now you need to find the date of the first non-negative value for that.
First create a unique list of the items - I put it on the same sheet in the range G2:G6. Then in H2, the formula:
=MINIFS(C:C;A:A;G2;E:E;">=" & 0)
and copy downwards.
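As a rough cross-check of the two formulas, the same logic can be written as a short Python sketch. The function name `first_covered_date` and the tuple layout are assumptions for illustration only; the comments map each line to the formula it mirrors.

```python
from datetime import date


def first_covered_date(supply, demand):
    """supply and demand are lists of (date, qty) for one item.

    Returns the earliest supply date on which total supply to date
    is at least total demand to date, or None if that never happens.
    """
    covered = []
    for day, _ in supply:
        supplied = sum(qty for d, qty in supply if d <= day)  # first SUMIFS
        demanded = sum(qty for d, qty in demand if d <= day)  # second SUMIFS
        if supplied - demanded >= 0:                          # helper column E
            covered.append(day)
    return min(covered) if covered else None                  # MINIFS


# Hypothetical rows for a single item.
supply = [(date(2023, 12, 18), 6), (date(2024, 1, 10), 100)]
demand = [(date(2023, 12, 1), 10)]
print(first_covered_date(supply, demand))
```

Like the worksheet version, this only evaluates the balance on supply dates, so it reports one date per item rather than one per demand row.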
Please help!
Ideally, I would really like to solve this using formulas only - not VBA or anything I consider 'fancy'.
I work for a program that awards bonuses for continuous engagement. We have three (sometimes more) engagement time periods that could overlap and/or could have spaces of no engagement. The magic figure is 84 days of continuous engagement. We have been manually reviewing each line (hundreds of lines) to see if the time periods add up to 84 days of continuous engagement, with no periods of inactivity.
In the link there is a pic of a summary of what we work with. Row 3 for example, doesn't have 84 days in any of the 3 time periods, but the first 2 time periods combined includes 120 consecutive days. The dates will not appear in date order - e.g. early engagements may be listed in period 3.
Really looking forward to your advice.
Annie
@TomSharpe has shown you a method of solving this with formulas. You would have to modify it if you had more than three time periods.
Not sure if you would consider a Power Query solution to be "too fancy", but it does allow for an unlimited number of time periods, laid out as you show in the sample.
With PQ, we
construct lists of all the consecutive dates for each pair of start/end
combine the lists for each row, removing the duplicates
apply a gap and island technique to the resulting date lists for each row
count the number of entries for each "island" and return the maximum
Please note: I counted both the start and the end date. In your days columns, you did not (except for one instance). If you want to count both, leave the code as is; if you don't, we can make a minor modification.
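The four steps listed above can be sketched in a few lines of Python. This is an illustration only, not the M code: the function name `longest_run` and the sample periods are made up, and both the start and the end date are counted, as noted.

```python
from datetime import date, timedelta
from itertools import groupby


def longest_run(periods):
    """periods: iterable of (start_date, end_date) pairs, in any order.

    Returns the largest number of consecutive calendar days covered.
    """
    # Steps 1 and 2: expand every period into dates and drop duplicates.
    days = set()
    for start, end in periods:
        span = (end - start).days + 1  # +1 counts both start and end
        days.update(start + timedelta(days=i) for i in range(span))
    ordered = sorted(days)
    # Step 3: gaps and islands. Consecutive dates share the same value of
    # (day number - position); the value jumps whenever there is a gap.
    island_ids = [d.toordinal() - i for i, d in enumerate(ordered)]
    # Step 4: size of the biggest island.
    return max((len(list(g)) for _, g in groupby(island_ids)), default=0)


# Hypothetical overlapping periods that together span 1 Jan to 30 Apr 2021.
print(longest_run([(date(2021, 1, 1), date(2021, 2, 28)),
                   (date(2021, 2, 15), date(2021, 4, 30))]))
```

Because the dates are put in a set and sorted first, the periods may overlap and may be listed in any order.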
To use Power Query
Create a table which excludes that first row of merged cells
Rename the table columns in the format I show in the screenshot, since each column header in a table must have a different name.
Select some cell in that Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to better understand the algorithm
M Code
code edited to Sort the date lists to handle certain cases
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Start P1", type datetime}, {"Comment1", type text}, {"End P1", type datetime}, {"Days 1", Int64.Type}, {"Start P2", type datetime}, {"Comment2", type text}, {"End P2", type datetime}, {"Days 2", Int64.Type}, {"Start P3", type datetime}, {"Comment3", type text}, {"End P3", type datetime}, {"Days 3", Int64.Type}}),
//set data types for columns 1/5/9... and 3/7/11/... as date
dtTypes = List.Transform(List.Alternate(Table.ColumnNames(#"Changed Type"),1,1,1), each {_,Date.Type}),
typed = Table.TransformColumnTypes(#"Changed Type",dtTypes),
//add Index column to define row numbers
rowNums = Table.AddIndexColumn(typed,"rowNum",0,1),
//Unpivot except for rowNum column
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(rowNums, {"rowNum"}, "Attribute", "Value"),
//split the attribute column to filter on Start/End => just the dates
//then filter and remove the attributes columns
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Attribute.1", "Attribute.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Attribute.1", type text}, {"Attribute.2", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Attribute.2"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Attribute.1] = "End" or [Attribute.1] = "Start")),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Attribute.1"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Removed Columns1",{{"Value", type date}, {"rowNum", Int64.Type}}),
//group by row number
//generate date list from each pair of dates
//combine into a single list of dates with no overlapped date ranges for each row
#"Grouped Rows" = Table.Group(#"Changed Type2", {"rowNum"}, {
{"dateList", (t)=> List.Sort(
List.Distinct(
List.Combine(
List.Generate(
()=>[dtList=List.Dates(
t[Value]{0},
Duration.TotalDays(t[Value]{1}-t[Value]{0})+1 ,
#duration(1,0,0,0)),idx=0],
each [idx] < Table.RowCount(t),
each [dtList=List.Dates(
t[Value]{[idx]+2},
Duration.TotalDays(t[Value]{[idx]+3}-t[Value]{[idx]+2})+1,
#duration(1,0,0,0)),
idx=[idx]+2],
each [dtList]))))}
}),
//determine Islands and Gaps
#"Expanded dateList" = Table.ExpandListColumn(#"Grouped Rows", "dateList"),
//Duplicate the date column and turn it into integers
#"Duplicated Column" = Table.DuplicateColumn(#"Expanded dateList", "dateList", "dateList - Copy"),
#"Changed Type3" = Table.TransformColumnTypes(#"Duplicated Column",{{"dateList - Copy", Int64.Type}}),
//add an Index column
//Then subtract the index from the integer date
// if the dates are consecutive the resultant ID column will => the same value, else it will jump
#"Added Index" = Table.AddIndexColumn(#"Changed Type3", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "ID", each [#"dateList - Copy"]-[Index]),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom",{"dateList - Copy", "Index"}),
//Group by the date ID column and a Count will => the consecutive days
#"Grouped Rows1" = Table.Group(#"Removed Columns2", {"rowNum", "ID"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
#"Removed Columns3" = Table.RemoveColumns(#"Grouped Rows1",{"ID"}),
//Group by the Row number and return the Maximum Consecutive days
#"Grouped Rows2" = Table.Group(#"Removed Columns3", {"rowNum"}, {{"Max Consecutive Days", each List.Max([Count]), type number}}),
//combine the Consecutive Days column with original table
result = Table.Join(rowNums,"rowNum",#"Grouped Rows2","rowNum"),
#"Removed Columns4" = Table.RemoveColumns(result,{"rowNum"})
in
#"Removed Columns4"
Unfortunately, Gap and Island seems to be a non-starter here, because I don't think you can use it without either VBA or a lot of helper columns, and the start dates would need to be in order. It's a pity, because the longest continuous time on task (AKA the largest island) drops out of the VBA version very easily, and arguably it's easier to understand than the array formula versions below.
Moving on to option 2: if you have Excel 365, you can use SEQUENCE to generate a list of dates in a certain range, then check whether each of them falls in one of the periods of engagement, like this:
=LET(array,SEQUENCE(Z$2-Z$1+1,1,Z$1),
period1,(array>=A3)*(array<=C3),
period2,(array>=E3)*(array<=G3),
period3,(array>=I3)*(array<=K3),
SUM(--(period1+period2+period3>0)))
assuming that Z1 and Z2 contain the start and end of the range of dates that you're interested in (I've used 1/1/21 and 31/7/21).
If you don't have Excel 365, you can use the ROW function to generate the list of dates instead. I suggest using the Name Manager to create a named range Dates:
=INDEX(Sheet1!$A:$A,Sheet1!$Z$1):INDEX(Sheet1!$A:$A,Sheet1!$Z$2)
Then the formula is:
= SUM(--(((ROW(Dates)>=A3) * (ROW(Dates)<=C3) +( ROW(Dates)>=E3) * (ROW(Dates)<=G3) + (ROW(Dates)>=I3) * (ROW(Dates)<=K3))>0))
You will probably have to enter this using Ctrl+Shift+Enter, or use SUMPRODUCT instead of SUM.
EDIT
As @Qualia has perceptively noted, you want the longest period of continuous engagement. This can be found by applying FREQUENCY to the first formula:
=LET(array,SEQUENCE(Z$2-Z$1+1,1,Z$1),
period1,(array>=A3)*(array<=C3),
period2,(array>=E3)*(array<=G3),
period3,(array>=I3)*(array<=K3),
onDays,period1+period2+period3>0,
MAX(FREQUENCY(IF(onDays,array),IF(NOT(onDays),array)))
)
and the non-365 version becomes
=MAX(FREQUENCY(IF((ROW(Dates)>=A3)*(ROW(Dates)<=C3)+(ROW(Dates)>=E3)*(ROW(Dates)<=G3)+(ROW(Dates)>=I3)*(ROW(Dates)<=K3),ROW(Dates)),
IF( NOT( (ROW(Dates)>=A3)*(ROW(Dates)<=C3)+(ROW(Dates)>=E3)*(ROW(Dates)<=G3)+(ROW(Dates)>=I3)*(ROW(Dates)<=K3) ),ROW(Dates))))
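What the FREQUENCY formulas compute can be checked with a small Python sketch. In the formula, the "on" days are the data and the "off" days are the bins, so each count is the length of one unbroken run of engagement. The loop below gets the same number by walking the date range and tracking the current streak; the function name `longest_streak` and the sample dates are assumptions for illustration.

```python
from datetime import date, timedelta


def longest_streak(range_start, range_end, periods):
    """Longest run of days in [range_start, range_end] covered by any period."""
    best = current = 0
    day = range_start
    while day <= range_end:
        if any(start <= day <= end for start, end in periods):
            current += 1            # still inside an "on" run
            best = max(best, current)
        else:
            current = 0             # an "off" day ends the run
        day += timedelta(days=1)
    return best


# Hypothetical periods, checked over the 1/1/21 to 31/7/21 range used above.
periods = [(date(2021, 1, 1), date(2021, 2, 28)),
           (date(2021, 2, 15), date(2021, 4, 30))]
print(longest_streak(date(2021, 1, 1), date(2021, 7, 31), periods))
```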
I have a system data set in which I want to compare two systems (Uptimum and scrubber) by utility time (%): what percentage of each 24 h they were operational, including cases where a run exceeds 24 h.
The data set is shown below. As you can see, there are gaps in the dates in Column A (date): some days are missing, and that will happen from time to time. There can also be several system changes within one day (the system in operation can change many times per day), which is why there is a time in Column B (time column), so that I can follow the exact timing of operation within a day.
There is no official "end time" here; it is an ongoing process in which the operating system changes, along with many other parameters.
What I did was extract the dates into Column F to avoid duplicates, and then sum them per system (columns G and H) using the function below (see the screenshot too):
=SUMIFS(Explog2021_04_28[T];Explog2021_04_28[D];$F2;Explog2021_04_28[System];"<>"&G$1)-SUMIFS(Explog2021_04_28[T];Explog2021_04_28[D];$F2;Explog2021_04_28[System];G$1)+(INDEX(Explog2021_04_28[System];MATCH($F2;Explog2021_04_28[D]))=G$1)-(INDEX(Explog2021_04_28[System];MATCH($F2;Explog2021_04_28[D];0))<>G$1)*$B2
With this function I summed Columns A and B using the extracted date and system values.
First, as you can see, I get negative percentages, which shouldn't be there. Is that because I have so many gaps in the dates? Is there a better way to fix this? As you can see on the chart, it looks bad.
If possible, the overall usage also shouldn't exceed 100%.
Any input would be appreciated.
If I understand you correctly, I believe the following Power Query should accomplish what you are looking for.
Please read the code comments and step through the applied steps window to understand the algorithm. Ask if you have questions, and complain if there are logic errors.
I assumed that the system was always in either scrubber or Uptimum
M Code
let
//Read in data. Change table name in next line to reflect actual table name
Source = Excel.CurrentWorkbook(){[Name="systemTable"]}[Content],
//Type the columns
#"Changed Type" = Table.TransformColumnTypes(Source,{{"D", type text}, {"T", type any}, {"System", type text}}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Changed Type", {{"D", type date}}, "en-150"),
#"Changed Type1" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"T", type time}}),
//Combine date and time => datetime
#"Added Custom" = Table.AddColumn(#"Changed Type1", "startTime",
each DateTime.From(Number.From([D]) + Number.From([T])), type datetime),
//create shifted column to be able to quickly refer to previous row
//this method much faster than using an Index column
Base = #"Added Custom",
ShiftedList = List.RemoveFirstN(Table.Column(Base, "startTime"),1) & {null},
Custom1 = Table.ToColumns(Base) & {ShiftedList},
Custom2 = Table.FromColumns(Custom1, Table.ColumnNames(Base) & {"endTime"}),
#"Changed Type2" = Table.TransformColumnTypes(Custom2,{{"endTime", type datetime}}),
//Create a list of dates for each time span
#"Added Custom1" = Table.AddColumn(#"Changed Type2", "datesList", each
let
st = DateTime.Date([startTime]),
et = DateTime.Date([endTime] ),
dur = Duration.TotalDays(et-st)
in
if et=null then {st} else List.Dates(st,dur+1,#duration(1,0,0,0))),
//Expand the list so we have sequential dates (fill in the gaps)
#"Expanded datesList" = Table.ExpandListColumn(#"Added Custom1", "datesList"),
//Remove unneeded columns
#"Removed Columns" = Table.RemoveColumns(#"Expanded datesList",{"D", "T"}),
//change date list datatype to datetime for simpler calculation formula
#"Changed Type3" = Table.TransformColumnTypes(#"Removed Columns",{{"datesList", type datetime}}),
//calculate hours in System each day
#"Added Custom2" = Table.AddColumn(#"Changed Type3", "Hrs in Day",
each List.Min({Date.EndOfDay([datesList]),[endTime]}) - List.Max({[startTime],[datesList]}),Duration.Type),
//Remove unneeded columns
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"startTime", "endTime"}),
//change date list to dates for report
#"Changed Type5" = Table.TransformColumnTypes(#"Removed Columns1",{{"datesList", type date}}),
//Group by Date and System to calculate percent time in system
#"Grouped Rows" = Table.Group(#"Changed Type5", {"datesList", "System"}, {
{"Sum", each List.Sum([Hrs in Day])/#duration(0,24,0,0), Percentage.Type}}),
//Pivot on System to generate final report
#"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[System]), "System", "Sum", List.Sum),
//Rename the datelist column
#"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"datesList", "D"}})
in
#"Renamed Columns"
Data
Results
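The core of the query above (each row lasts until the next row starts, and a span that crosses midnight is split between the days it touches) can be sketched in Python as follows. This is a simplified illustration, not a port: the function name `daily_share` and the sample events are made up, and the final, still-open event is simply ignored here, which is an assumption that differs from the query.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta


def daily_share(events):
    """events: list of (start_datetime, system_name).

    Returns {(date, system): fraction of that day spent in the system}.
    """
    events = sorted(events)
    hours = defaultdict(float)
    # Pair every event with the next one; the next start is this row's end.
    for (start, system), (end, _) in zip(events, events[1:]):
        cursor = start
        while cursor < end:
            next_midnight = datetime.combine(
                cursor.date() + timedelta(days=1), time.min)
            stop = min(end, next_midnight)  # never run past midnight
            hours[(cursor.date(), system)] += (stop - cursor).total_seconds() / 3600
            cursor = stop
    return {key: h / 24 for key, h in hours.items()}


# Hypothetical log with a missing day (29 April has no row of its own).
events = [(datetime(2021, 4, 28, 6, 0), "scrubber"),
          (datetime(2021, 4, 28, 18, 0), "Uptimum"),
          (datetime(2021, 4, 30, 0, 0), "scrubber")]
for key, share in sorted(daily_share(events).items()):
    print(key, share)
```

Since every hour of a day is assigned to exactly one system, the shares for a day cannot go negative or add up to more than 100%.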
The cells contain dates and the times of entering and leaving the factory. I want to calculate how many hours each person stayed on each day they came to the factory. For this, I wrote the macro below and identified each person by sicil_no, but since there are multiple entries and exits at different times on the same date, I need to determine the first and last times for each day and subtract them. I couldn't figure out how to do the last part.
Sub macro()
Dim sicil_no As String
Dim i As Integer
Dim end_row As Long
Dim dates As Range
Dim gecis_yonu As String
Dim entry As String
Dim Exits As String
end_row = Cells(Rows.Count, 3).End(xlUp).Row
For i = 3 To end_row
sicil_no = Cells(i, 3).Value
dates = Cells(i, 1).Value
If Range("J", i).Value = "Exit" Then
Range("J", i).Value = exist
End If
If Range("J", i).Value = "Entry" Then
Range("J", i).Value = entry
End If
Next
For Each dates In Range("A", end_row)
Range("M", i).Value = exist - entry
Next
End Sub
One possible way is to use the MAXIFS and MINIFS formulas to get this result:
It can probably be done better, but I believe that if you select A:H, choose Remove Duplicates and uncheck column A, you get the result you are looking for.
This assumes the date in column A is a true date and not just text. If it's not a date, you will need to convert it to one.
This can be done using DATEVALUE together with RIGHT, LEFT and MID to turn the string into an accepted date format.
Then in E column you add this formula
=TEXT(A2,"YYYY-MM-DD")
In F:
=MAXIFS(A:A,E:E,TEXT(A2,"YYYY-MM-DD"),B:B,B2)
In G:
=MINIFS(A:A,E:E,TEXT(A2,"YYYY-MM-DD"),B:B,B2)
And lastly in H:
=F2-G2
When all formulas are on the sheet, select everything and copy, paste as values, then use remove duplicates like this:
and the result is this:
EDIT:
For completeness, this is how you convert your date to an accepted date format.
In M2 (example):
=MID(A2,7,4)&"-"&MID(A2,4,2)&"-"&LEFT(A2,2)&" "&RIGHT(A2,8)
then we need to use DATEVALUE and TIMEVALUE on this cell
N2:
=DATEVALUE(M2)+TIMEVALUE(M2)
You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
You did not show what you want for output, but you can add to what I have shown which is the bare minimum Sicil, Date and Time between earliest and latest times. (Assuming each pair of times is entry/exit, you could also sum the differences between each pair of times per day)
In the Query, you can sort the results depending on whether you want to show by date or by employee.
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Add custom column with just the Date part for grouping
#"Added Custom" = Table.AddColumn(Source, "Date", each Date.From([Dates])),
//Group by Sicil No and Date
//Then extract the time in Factory as the last time less the first time
#"Grouped Rows" = Table.Group(#"Added Custom", {"Sicil No", "Date"}, {
{"Hrs in Factory", each List.Max([Dates]) - List.Min([Dates]), type duration}
}),
#"Changed Type" = Table.TransformColumnTypes(#"Grouped Rows",{{"Date", type date}})
in
#"Changed Type"
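For readers who want to check the grouping logic independently, here is a minimal Python sketch of the same idea: group the timestamps by person and calendar date, then subtract the earliest from the latest. The function name `span_per_day` and the sample records are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def span_per_day(records):
    """records: iterable of (sicil_no, timestamp).

    Returns {(sicil_no, date): latest timestamp - earliest timestamp}.
    """
    stamps = defaultdict(list)
    for sicil_no, stamp in records:
        stamps[(sicil_no, stamp.date())].append(stamp)
    return {key: max(times) - min(times) for key, times in stamps.items()}


# Hypothetical badge readings.
records = [(101, datetime(2021, 5, 3, 8, 0)),
           (101, datetime(2021, 5, 3, 12, 0)),
           (101, datetime(2021, 5, 3, 17, 30)),
           (102, datetime(2021, 5, 3, 9, 0))]
print(span_per_day(records))
```

A person with a single reading on a day gets a zero-length span, which is also what the grouped query returns in that case.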
Edit
If you want to add up the actual time in the factory per day, taking into account the entry/exit times:
Assuming times are entered as pairs, where the first time is entry and the second is exit
Merely subtract one from the other to get each duration
Then group as above and add up the total durations per Sicil and Date
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Add custom column with just the Date part for grouping
#"Added Custom" = Table.AddColumn(Source, "Date", each Date.From([Dates])),
//Add Index column to access previous row
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 0, 1, Int64.Type),
//if the Index number is an Odd number,
// then subtract the previous row from the current row to get the Duration
#"Added Custom1" = Table.AddColumn(#"Added Index", "Duration", each
if Number.Mod([Index],2)=0
then null
else [Dates]- Table.Column(#"Added Index","Dates"){[Index]-1}),
//Group by Sicil and Date
// SUM the durations
#"Grouped Rows" = Table.Group(#"Added Custom1", {"Sicil No", "Date"}, {
{"Time in Factory", each List.Sum([Duration]), type nullable duration}}),
#"Changed Type" = Table.TransformColumnTypes(#"Grouped Rows",{{"Sicil No", Int64.Type}, {"Date", type date}})
in
#"Changed Type"
Further modification to account for the "real" list not being sorted as needed, and also for data errors with mismatched entries/exits. Also uses a different routine to refer to the previous row, for speed improvements.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Change type especially datetime to Turkish culture (since I am in US)
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"GECIS TARIHI", type datetime}, {"KART NUMARASI", type any}, {"SICIL NUMARASI", Int64.Type}, {"SOYADI", type text},
{"ADI", type text}, {"FİRMASI", type text}, {"GEÇİÇİ TAŞERON", type any}, {"BÖLÜM KODU", type any},
{"TERMINAL", type any}, {"GEÇİŞ YÖNÜ", type text}, {"GEÇİŞ DURUMU", type any}, {"ZONE", type any}}, "tr-TR"),
//Remove columns that will not appear in final report
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"KART NUMARASI", "SOYADI", "ADI", "FİRMASI", "GEÇİÇİ TAŞERON",
"BÖLÜM KODU", "TERMINAL", "GEÇİŞ DURUMU", "ZONE"}),
//Sort for proper processing
#"Sorted Rows" = Table.Sort(#"Removed Columns",{{"SICIL NUMARASI", Order.Ascending}, {"GECIS TARIHI", Order.Ascending}}),
//add shifted columns to reference previous rows for entry/exit and time
//much faster than using the Index column method
ShiftedList = {null} & List.RemoveLastN(Table.Column(#"Sorted Rows", "GEÇİŞ YÖNÜ"),1),
Custom1 = Table.ToColumns(#"Sorted Rows") & {ShiftedList},
Custom2 = Table.FromColumns(Custom1, Table.ColumnNames(#"Sorted Rows") & {"GEÇİŞ YÖNÜ" & " Prev Row"}),
ShiftedList1 = {null} & List.RemoveLastN(Table.Column(Custom2, "GECIS TARIHI"),1),
Custom3 = Table.ToColumns(Custom2) & {ShiftedList1},
Custom4 = Table.FromColumns(Custom3, Table.ColumnNames(Custom2) & {"GECIS TARIHI" & " Prev Row"}),
//Calculate duration on the appropriate rows
#"Added Custom" = Table.AddColumn(Custom4, "Time in Factory", each
if [GEÇİŞ YÖNÜ] = "Exit" and [GEÇİŞ YÖNÜ Prev Row] = "Entry"
then [GECIS TARIHI] - [GECIS TARIHI Prev Row]
else null),
//Filter out the unneeded rows
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Time in Factory] <> null)),
//Remove the offset columns
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"GEÇİŞ YÖNÜ Prev Row", "GECIS TARIHI Prev Row"}),
//add Date column for grouping
#"Added Custom1" = Table.AddColumn(#"Removed Columns1", "Date", each DateTime.Date([GECIS TARIHI]),Date.Type),
//Group by Date and Sicil and SUM the Time in Factory
#"Grouped Rows" = Table.Group(#"Added Custom1", {"SICIL NUMARASI", "Date"}, {
{"Time in Factory", each List.Sum([Time in Factory]), type duration}
})
in
#"Grouped Rows"
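The pairing rule in this last version (a duration is counted only when an "Exit" row directly follows an "Entry" row after sorting) can be sketched in Python like this. The function name `time_in_factory` and the sample records are made up, and the check that both rows belong to the same person is an extra safeguard added in this sketch, not something the query does.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def time_in_factory(records):
    """records: iterable of (sicil_no, timestamp, direction),
    where direction is "Entry" or "Exit".

    Returns {(sicil_no, date of exit): total time between paired rows}.
    """
    totals = defaultdict(timedelta)
    previous = None
    # Sort by person, then by time, as the query does.
    for record in sorted(records, key=lambda r: (r[0], r[1])):
        sicil_no, stamp, direction = record
        if (direction == "Exit" and previous is not None
                and previous[2] == "Entry"
                and previous[0] == sicil_no):  # extra check, see lead-in
            totals[(sicil_no, stamp.date())] += stamp - previous[1]
        previous = record
    return dict(totals)


# Hypothetical readings, deliberately unsorted and with one stray Exit.
records = [(101, datetime(2021, 5, 3, 17, 0), "Exit"),
           (101, datetime(2021, 5, 3, 8, 0), "Entry"),
           (101, datetime(2021, 5, 3, 12, 0), "Exit"),
           (101, datetime(2021, 5, 3, 13, 0), "Entry"),
           (101, datetime(2021, 5, 3, 18, 0), "Exit"),
           (102, datetime(2021, 5, 3, 9, 0), "Entry")]
print(time_in_factory(records))
```

Rows that do not form an Entry/Exit pair, such as the stray 18:00 exit or the unmatched entry for 102, contribute nothing, which mirrors the null rows that the query filters out.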
I have a scenario where I have to calculate the average price of shares from a set of data. Consider the following data.
Now I want to represent the data in the following format:
The table above will store the average price whenever a new scrip is added to the first table.
I have tried AVERAGEIFS(), but it calculates averages only over a single column range, whereas I need to calculate the average price using price * quantity across the range for the given scrip.
Please suggest.
Not sure I understand the question.
If you're trying to get the total amount based on the average price without a helper column, you could use this:
=AVERAGEIF($B$3:$E$8,B12,$E$3:$E$8)*SUMIF($B$3:$E$8,B12,$C$3:$C$8)
You can use Power Query (available in Excel 2010+) for this.
In Excel 2016+ (may be different in earlier versions):
select some cell within the data table
Data / Get & Transform / From Table/Range
In the UI, open the Advanced Editor
Paste the M-Code below into the window that opens
Change the Table Name in Line 2 to reflect the actual table name in your worksheet.
NOTE: In the UI, in the Applied Steps window, float your cursor over the information icons to read the comments for explanations. Also you can double click on the gear icons for more information as to how those steps were set up
M Code
let
//Change Table name to correct name
Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Stocks", type text}, {"Quantity", Int64.Type}, {"Date", type date}, {"Price", type number}}),
//Group by Stock
#"Grouped Rows" = Table.Group(#"Changed Type", {"Stocks"}, {{"Grouped", each _, type table [Stocks=nullable text, Quantity=nullable number, Date=nullable date, Price=nullable number]}}),
//Sum quantity for each stock
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Quantity", each List.Sum(Table.Column([Grouped],"Quantity"))),
//Compute weighted average price for each group of stocks
#"Added Custom" = Table.AddColumn(#"Added Custom1", "Price", each List.Accumulate(
List.Positions(Table.Column([Grouped],"Quantity")),
0,
(state, current) =>state + Table.Column([Grouped],"Price"){current} *
Table.Column([Grouped],"Quantity"){current})
/ List.Sum(Table.Column([Grouped],"Quantity"))),
//Compute Total Amount for each stock
#"Added Custom2" = Table.AddColumn(#"Added Custom", "Amount", each [Quantity]*[Price]),
//Remove extraneous Columns
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Grouped"})
in
#"Removed Columns"
Are you allowed to add a column to your data for calculating the total_price? For example, column E = Quantity * Price.
Then your calculations table would be quite simple. Formulas for row 3:
Quantity: =SUMIFS(B:B,A:A,G3)
Average_Price: =SUMIFS(E:E,A:A,G3) / SUMIFS(B:B,A:A,G3)
Amount: =H3*I3
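The quantity-weighted average that the Power Query and helper-column answers compute is sum(Quantity * Price) / sum(Quantity) per scrip. A minimal Python sketch of that arithmetic, with a made-up function name `weighted_average_prices` and made-up trades, looks like this:

```python
from collections import defaultdict


def weighted_average_prices(trades):
    """trades: iterable of (scrip, quantity, price).

    Returns {scrip: (total quantity, weighted average price, total amount)}.
    Assumes each scrip's total quantity is non-zero.
    """
    quantity = defaultdict(float)
    amount = defaultdict(float)
    for scrip, qty, price in trades:
        quantity[scrip] += qty
        amount[scrip] += qty * price  # the helper column: Quantity * Price
    return {s: (quantity[s], amount[s] / quantity[s], amount[s])
            for s in quantity}


# Hypothetical trades.
trades = [("ABC", 10, 100.0), ("ABC", 30, 120.0), ("XYZ", 5, 50.0)]
print(weighted_average_prices(trades))
```

For ABC this gives (10 * 100 + 30 * 120) / 40 = 115, whereas a plain average of the two prices would give 110.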
I have a few columns of data, and I need to convert the Excel version of "PERCENTILE" into Power Query.
I have some code which I added as a function, but it doesn't apply accurately because it doesn't allow for grouping the data by CATEGORY and YEAR. So anything that is in Full Discretionary 1.5-2.5 AND 2014 needs to be added to one percentile array; equally, anything that falls in Full Discretionary 2.5-3.5 AND 2014 needs to go into a different percentile array.
let
Source = (list as any, k as number) => let
Source = list,
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Sorted Rows" = Table.Sort(#"Converted to Table",{{"Column1", Order.Ascending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "TheIndex", each Table.RowCount(#"Converted to Table")*k/100),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each [Index] >= [TheIndex] and [Index] <= [TheIndex]+1),
Custom1 = List.Average(#"Filtered Rows"[Column1])
in
Custom1
in
Source
So the expected result is that anything that matches on the two columns (Year, Category) should be evaluated within the same array. Currently, invoking the above function just gives me errors.
I have also tried using grouping and outputting the "Min, Median, and Max" outputs but I also require 10% and 90% Percentiles.
Thank you in advance
Based on some findings on other websites and a lot of googling (most folk just want to use DAX, but if you're only using Power Query you can't!), I found that someone had posted an answer which is very helpful:
https://social.technet.microsoft.com/Forums/en-US/a57bfbea-52d1-4231-b2de-fa993d9bb4c9/can-the-quotpercentilequot-be-calculated-in-power-query?forum=powerquery
Basically:
//PercentileInclusive Function
(inputSeries as list, percentile as number) =>
let
SeriesCount = List.Count(inputSeries),
PercentileRank = percentile*(SeriesCount-1)+1, //percentile value between 0 and 1
PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
Percentile1 = List.Max(List.MinN(inputSeries,PercentileRankRoundedDown)),
Percentile2 = List.Max(List.MinN(inputSeries,PercentileRankRoundedUp)),
Percentile = Percentile1+(Percentile2-Percentile1)*(PercentileRank-PercentileRankRoundedDown)
in
Percentile
The above replicates the PERCENTILE function found in Excel. Add it as a query using "New Query" and the Advanced Editor, then call it after grouping your data:
Table.Group(RenamedColumns, {"Country"}, {{"Sales Total", each
List.Sum([Amount Sales]), type number}, {"95 Percentile Sales", each
List.Average([Amount Sales]), type number}})
In the above formula, RenamedColumns is the name of the previous step
in the script. Change the name to match your actual case. I've assumed
that the pre-grouping sales amount column is "Amount Sales." Names of
grouped columns are "Sales Total" and "95 Percentile Sales."
Next modify the group formula, substituting List.Average with
PercentileInclusive:
Table.Group(RenamedColumns, {"Country"}, {{"Sales Total", each
List.Sum([Amount Sales]), type number}, {"95 Percentile Sales", each
PercentileInclusive([Amount Sales],0.95), type number}})
This worked for my data set and matches similar
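To sanity-check results against Excel, the same inclusive-percentile interpolation, applied per (Year, Category) group as the question asks, can be sketched in Python. The function names and sample rows below are assumptions for illustration; the rank formula mirrors the PercentileRank line of the M function.

```python
import math
from collections import defaultdict


def percentile_inclusive(values, p):
    """Inclusive percentile with linear interpolation; p is between 0 and 1.

    values must be non-empty.
    """
    ordered = sorted(values)
    rank = p * (len(ordered) - 1) + 1        # 1-based, may be fractional
    low, high = math.floor(rank), math.ceil(rank)
    low_value, high_value = ordered[low - 1], ordered[high - 1]
    return low_value + (high_value - low_value) * (rank - low)


def grouped_percentiles(rows, p):
    """rows: iterable of (year, category, value).

    Returns {(year, category): percentile of that group's values}.
    """
    groups = defaultdict(list)
    for year, category, value in rows:
        groups[(year, category)].append(value)
    return {key: percentile_inclusive(vals, p) for key, vals in groups.items()}


# Hypothetical rows; each (year, category) pair gets its own array.
rows = [(2014, "Full Discretionary 1.5-2.5", 1),
        (2014, "Full Discretionary 1.5-2.5", 3),
        (2014, "Full Discretionary 2.5-3.5", 10)]
print(grouped_percentiles(rows, 0.5))
```

Calling `grouped_percentiles` once with 0.1 and once with 0.9 would give the 10% and 90% figures per group.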