I'm using Power BI Desktop and I'm creating a chart to display average sales per weekday.
My data is in the format below:
(sampled in Excel to remove sensitive information, added colors to facilitate visualization)
My problem is: since each day is split into 24 rows (one per hour), my average is wrong by a factor of 24.
For example, if I select January-2019 in the slicer, which has five Tuesdays (weekday code: 2), I want to see on the bar number 2:
(sum of amount where weekday = 2) / 5
Instead, I'm calculating:
(sum of amount where weekday = 2) / (24 * 5)
I can think of some ways to get this right, but they involve custom columns or auxiliary tables. I'm sure there is a simpler answer using DAX and measures, but I'm still learning it.
How can I correctly calculate this?
Let's assume your table name is "Data". Create 3 DAX measures (not calculated columns):
Measure 1:
Total Amount = SUM(Data[Amount])
Measure 2:
Number of Days = DISTINCTCOUNT(Data[Date])
Measure 3:
Average Amount per Day = DIVIDE( [Total Amount], [Number of Days])
Drop the last measure into a chart and it should give you the expected result.
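To see why dividing by the distinct day count removes the factor of 24, here is a minimal Python sketch of the same arithmetic outside Power BI. The row layout (date, hour, amount), the function name and the sample values are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date

def average_per_weekday(rows):
    """rows: iterable of (day, hour, amount) tuples, one row per hour."""
    totals = defaultdict(float)   # like SUM(Data[Amount]) per weekday
    days = defaultdict(set)       # like DISTINCTCOUNT(Data[Date]) per weekday
    for day, _hour, amount in rows:
        code = day.isoweekday()   # Monday=1 ... Sunday=7, so Tuesday=2
        totals[code] += amount
        days[code].add(day)
    return {code: totals[code] / len(days[code]) for code in totals}

# Hypothetical data: the five Tuesdays of January 2019, 24 hourly rows each.
tuesdays = [date(2019, 1, d) for d in (1, 8, 15, 22, 29)]
rows = [(d, h, 1.0) for d in tuesdays for h in range(24)]
result = average_per_weekday(rows)
```

With one unit of sales per hour, a plain row average would be 1.0, while the total divided by the number of distinct days is 24.0 for weekday 2.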
As I understand from your Excel sample, you are working with 3 different columns. It is better to combine the date and hour into a single datetime and let Power BI handle it.
The M code below will do this for you:
let
Source = Excel.Workbook(File.Contents("C:\....\Test.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"date", type datetime}, {"hour", type time}, {"amount", type number}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Date", each [date]+ Duration.FromText(Time.ToText([hour]))),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"amount", "Date"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ([amount] <> 0))
in
#"Filtered Rows"
The trick is in the added column: #"Added Custom" = Table.AddColumn(#"Changed Type", "Date", each [date]+ Duration.FromText(Time.ToText([hour])))
Here I add the time to the date.
I also removed the empty (zero amount) rows, since you do not need them.
I added the Date and weekday to the Axis, so a user can now drill down from year, month, day to weekday.
Be aware that you need to use the SUM of the amount, not the average.
Please help!
Ideally, I would really like to solve this using formulas only - not VBA or anything I consider 'fancy'.
I work for a program that awards bonuses for continuous engagement. We have three (sometimes more) engagement time periods that could overlap and/or could have spaces of no engagement. The magic figure is 84 days of continuous engagement. We have been manually reviewing each line (hundreds of lines) to see if the time periods add up to 84 days of continuous engagement, with no periods of inactivity.
In the link there is a pic of a summary of what we work with. Row 3 for example, doesn't have 84 days in any of the 3 time periods, but the first 2 time periods combined includes 120 consecutive days. The dates will not appear in date order - e.g. early engagements may be listed in period 3.
Really looking forward to your advice.
Annie
@TomSharpe has shown you a method of solving this with formulas. You would have to modify it if you had more than three time periods.
Not sure if you would consider a Power Query solution to be "too fancy", but it does allow for an unlimited number of time periods, laid out as you show in the sample.
With PQ, we
construct lists of all the consecutive dates for each pair of start/end
combine the lists for each row, removing the duplicates
apply a gap and island technique to the resulting date lists for each row
count the number of entries for each "island" and return the maximum
Please note: I counted both the start and the end date. In your days columns, you did not (except for one instance). If you want to count both, leave the code as is; if you don't, we can make a minor modification.
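The four steps above can be sketched in a few lines of Python, just to show the idea outside Power Query. This is an illustration under assumptions: the function name and the sample periods are made up, and both start and end dates are counted, as in the M code.

```python
from datetime import date, timedelta

def longest_continuous_days(periods):
    """periods: list of (start, end) date pairs, inclusive, in any order."""
    # Steps 1 and 2: expand every period into dates and drop duplicates.
    covered = set()
    for start, end in periods:
        for offset in range((end - start).days + 1):
            covered.add(start + timedelta(days=offset))
    # Step 3: gaps and islands. For sorted dates, (day number - position)
    # stays constant inside a run of consecutive days and jumps at a gap.
    islands = {}
    for index, day in enumerate(sorted(covered)):
        key = day.toordinal() - index
        islands[key] = islands.get(key, 0) + 1
    # Step 4: the largest island is the longest continuous engagement.
    return max(islands.values(), default=0)

# Hypothetical row: the first two periods overlap, the third is isolated.
periods = [
    (date(2021, 2, 15), date(2021, 4, 30)),
    (date(2021, 1, 1), date(2021, 2, 28)),
    (date(2021, 6, 1), date(2021, 6, 10)),
]
longest = longest_continuous_days(periods)
```

Here neither of the first two periods reaches 84 days on its own, but together they cover 1 Jan to 30 Apr 2021, which is 120 consecutive days.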
To use Power Query
Create a table which excludes that first row of merged cells
Rename the table columns in the format I show in the screenshot, since each column header in a table must have a different name.
Select some cell in that Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to better understand the algorithm
M Code
code edited to Sort the date lists to handle certain cases
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Start P1", type datetime}, {"Comment1", type text}, {"End P1", type datetime}, {"Days 1", Int64.Type}, {"Start P2", type datetime}, {"Comment2", type text}, {"End P2", type datetime}, {"Days 2", Int64.Type}, {"Start P3", type datetime}, {"Comment3", type text}, {"End P3", type datetime}, {"Days 3", Int64.Type}}),
//set data types for columns 1/5/9... and 3/7/11/... as date
dtTypes = List.Transform(List.Alternate(Table.ColumnNames(#"Changed Type"),1,1,1), each {_,Date.Type}),
typed = Table.TransformColumnTypes(#"Changed Type",dtTypes),
//add Index column to define row numbers
rowNums = Table.AddIndexColumn(typed,"rowNum",0,1),
//Unpivot except for rowNum column
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(rowNums, {"rowNum"}, "Attribute", "Value"),
//split the attribute column to filter on Start/End => just the dates
//then filter and remove the attributes columns
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Attribute.1", "Attribute.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Attribute.1", type text}, {"Attribute.2", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Attribute.2"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Attribute.1] = "End" or [Attribute.1] = "Start")),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Attribute.1"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Removed Columns1",{{"Value", type date}, {"rowNum", Int64.Type}}),
//group by row number
//generate date list from each pair of dates
//combine into a single list of dates with no overlapped date ranges for each row
#"Grouped Rows" = Table.Group(#"Changed Type2", {"rowNum"}, {
{"dateList", (t)=> List.Sort(
List.Distinct(
List.Combine(
List.Generate(
()=>[dtList=List.Dates(
t[Value]{0},
Duration.TotalDays(t[Value]{1}-t[Value]{0})+1 ,
#duration(1,0,0,0)),idx=0],
each [idx] < Table.RowCount(t),
each [dtList=List.Dates(
t[Value]{[idx]+2},
Duration.TotalDays(t[Value]{[idx]+3}-t[Value]{[idx]+2})+1,
#duration(1,0,0,0)),
idx=[idx]+2],
each [dtList]))))}
}),
//determine Islands and Gaps
#"Expanded dateList" = Table.ExpandListColumn(#"Grouped Rows", "dateList"),
//Duplicate the date column and turn it into integers
#"Duplicated Column" = Table.DuplicateColumn(#"Expanded dateList", "dateList", "dateList - Copy"),
#"Changed Type3" = Table.TransformColumnTypes(#"Duplicated Column",{{"dateList - Copy", Int64.Type}}),
//add an Index column
//Then subtract the index from the integer date
// if the dates are consecutive the resultant ID column will => the same value, else it will jump
#"Added Index" = Table.AddIndexColumn(#"Changed Type3", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "ID", each [#"dateList - Copy"]-[Index]),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom",{"dateList - Copy", "Index"}),
//Group by the date ID column and a Count will => the consecutive days
#"Grouped Rows1" = Table.Group(#"Removed Columns2", {"rowNum", "ID"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
#"Removed Columns3" = Table.RemoveColumns(#"Grouped Rows1",{"ID"}),
//Group by the Row number and return the Maximum Consecutive days
#"Grouped Rows2" = Table.Group(#"Removed Columns3", {"rowNum"}, {{"Max Consecutive Days", each List.Max([Count]), type number}}),
//combine the Consecutive Days column with original table
result = Table.Join(rowNums,"rowNum",#"Grouped Rows2","rowNum"),
#"Removed Columns4" = Table.RemoveColumns(result,{"rowNum"})
in
#"Removed Columns4"
Unfortunately, Gap and Island seems to be a non-starter, because I don't think you can use it without either VBA or a lot of helper columns, and the start dates need to be in order. It's a pity, because the longest continuous time on task (AKA the largest island) drops out of the VBA version very easily, and arguably it's easier to understand than the array formula versions below (see this).
Moving on to option 2: if you have Excel 365, you can use SEQUENCE to generate a list of dates in a certain range, then check whether each of them falls in one of the periods of engagement, like this:
=LET(array,SEQUENCE(Z$2-Z$1+1,1,Z$1),
period1,(array>=A3)*(array<=C3),
period2,(array>=E3)*(array<=G3),
period3,(array>=I3)*(array<=K3),
SUM(--(period1+period2+period3>0)))
assuming that Z1 and Z2 contain the start and end of the range of dates that you're interested in (I've used 1/1/21 and 31/7/21).
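For readers who find the array logic hard to follow, the formula does roughly what this Python sketch does: walk every date in the window and count it if it lies inside at least one period. The function name and the sample periods are assumptions for illustration.

```python
from datetime import date, timedelta

def engaged_days(window_start, window_end, periods):
    """Count days in the window covered by at least one (start, end) period."""
    count = 0
    day = window_start
    while day <= window_end:
        # Same test as (array>=start)*(array<=end), summed over periods, > 0.
        if any(start <= day <= end for start, end in periods):
            count += 1
        day += timedelta(days=1)
    return count

# Hypothetical periods; the first two overlap from 15 Feb to 28 Feb.
periods = [
    (date(2021, 1, 1), date(2021, 2, 28)),
    (date(2021, 2, 15), date(2021, 4, 30)),
    (date(2021, 6, 1), date(2021, 6, 10)),
]
total = engaged_days(date(2021, 1, 1), date(2021, 7, 31), periods)
```

Overlapping days are counted once, so this gives the total number of engaged days in the window, not the longest continuous stretch.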
If you don't have Excel 365, you can use the ROW function to generate the list of dates instead. I suggest using the Name Manager to create a named range Dates:
=INDEX(Sheet1!$A:$A,Sheet1!$Z$1):INDEX(Sheet1!$A:$A,Sheet1!$Z$2)
Then the formula is:
= SUM(--(((ROW(Dates)>=A3) * (ROW(Dates)<=C3) +( ROW(Dates)>=E3) * (ROW(Dates)<=G3) + (ROW(Dates)>=I3) * (ROW(Dates)<=K3))>0))
You will probably have to enter this using Ctrl+Shift+Enter, or use SUMPRODUCT instead of SUM.
EDIT
As @Qualia has perceptively noted, you want the longest time of continuous engagement. This can be found by applying FREQUENCY to the first formula:
=LET(array,SEQUENCE(Z$2-Z$1+1,1,Z$1),
period1,(array>=A3)*(array<=C3),
period2,(array>=E3)*(array<=G3),
period3,(array>=I3)*(array<=K3),
onDays,period1+period2+period3>0,
MAX(FREQUENCY(IF(onDays,array),IF(NOT(onDays),array)))
)
and the non-365 version becomes
=MAX(FREQUENCY(IF((ROW(Dates)>=A3)*(ROW(Dates)<=C3)+(ROW(Dates)>=E3)*(ROW(Dates)<=G3)+(ROW(Dates)>=I3)*(ROW(Dates)<=K3),ROW(Dates)),
IF( NOT( (ROW(Dates)>=A3)*(ROW(Dates)<=C3)+(ROW(Dates)>=E3)*(ROW(Dates)<=G3)+(ROW(Dates)>=I3)*(ROW(Dates)<=K3) ),ROW(Dates))))
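Why FREQUENCY gives run lengths here: the "off" days act as bin boundaries, so each bin counts the "on" days that sit between two consecutive off days. The Python sketch below imitates that counting for ascending bins; the helper name and the day numbers are assumptions for illustration, and it is not a full reimplementation of the Excel function.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Count data values per bin for ascending bins.

    A value lands in bin i when it is <= bins[i] and greater than the
    previous bin; values above the last bin go into one extra slot.
    """
    edges = sorted(bins)
    counts = [0] * (len(edges) + 1)
    for value in data:
        counts[bisect_left(edges, value)] += 1
    return counts

# Hypothetical window of day numbers 1..10 with days 4 and 9 not engaged.
on_days = [1, 2, 3, 5, 6, 7, 8, 10]
off_days = [4, 9]
runs = frequency(on_days, off_days)
longest_run = max(runs)
```

The counts come out as one entry per stretch of engaged days, so taking the maximum gives the longest continuous engagement inside the window.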
I have a scenario where I have to calculate the average price of shares from a set of data. Consider the following data.
Now I want to represent the data in the following format:
The above table will store the average price whenever a new scrip is added to the first table.
I have tried AVERAGEIFS(), but it calculates averages only over a single column range. I need to calculate the average price using price * quantity across the range for the given scrip.
Please suggest.
Not sure I understand the question.
If you're trying to get the total amount based on the average price without a helper column, you could use this:
=AVERAGEIF($B$3:$E$8,B12,$E$3:$E$8)*SUMIF($B$3:$E$8,B12,$C$3:$C$8)
You can use Power Query (available in Excel 2010+) for this.
In Excel 2016+ (may be different in earlier versions):
select some cell within the data table
Data / Get & Transform / From Table/Range
In the UI, open the Advanced Editor
Paste the M-Code below into the window that opens
Change the Table Name in Line 2 to reflect the actual table name in your worksheet.
NOTE: In the UI, in the Applied Steps window, float your cursor over the information icons to read the comments for explanations. Also you can double click on the gear icons for more information as to how those steps were set up
M Code
let
//Change Table name to correct name
Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Stocks", type text}, {"Quantity", Int64.Type}, {"Date", type date}, {"Price", type number}}),
//Group by Stock
#"Grouped Rows" = Table.Group(#"Changed Type", {"Stocks"}, {{"Grouped", each _, type table [Stocks=nullable text, Quantity=nullable number, Date=nullable date, Price=nullable number]}}),
//Sum quantity for each stock
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Quantity", each List.Sum(Table.Column([Grouped],"Quantity"))),
//Compute weighted average price for each group of stocks
#"Added Custom" = Table.AddColumn(#"Added Custom1", "Price", each List.Accumulate(
List.Positions(Table.Column([Grouped],"Quantity")),
0,
(state, current) =>state + Table.Column([Grouped],"Price"){current} *
Table.Column([Grouped],"Quantity"){current})
/ List.Sum(Table.Column([Grouped],"Quantity"))),
//Compute Total Amount for each stock
#"Added Custom2" = Table.AddColumn(#"Added Custom", "Amount", each [Quantity]*[Price]),
//Remove extraneous Columns
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Grouped"})
in
#"Removed Columns"
Are you allowed to add a column to your data for calculating the total_price? For example, column E = Quantity * Price.
Then your calculations table would be quite simple. Formulas for row 3:
Quantity: =SUMIFS(B:B,A:A,G3)
Average_Price: =SUMIFS(E:E,A:A,G3) / SUMIFS(B:B,A:A,G3)
Amount: =H3*I3
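For comparison, the same quantity-weighted average (sum of quantity * price divided by sum of quantity) can be sketched in Python. The scrip names, quantities and prices below are made-up sample values, and the function name is hypothetical.

```python
from collections import defaultdict

def summarize(trades):
    """trades: iterable of (scrip, quantity, price) tuples.

    Returns {scrip: (total_quantity, weighted_average_price, total_amount)}.
    """
    quantity = defaultdict(float)
    amount = defaultdict(float)
    for scrip, qty, price in trades:
        quantity[scrip] += qty
        amount[scrip] += qty * price   # the helper column: Quantity * Price
    return {
        scrip: (quantity[scrip], amount[scrip] / quantity[scrip], amount[scrip])
        for scrip in quantity
    }

# Hypothetical trades for two scrips.
trades = [("ABC", 10, 100.0), ("ABC", 30, 120.0), ("XYZ", 5, 50.0)]
summary = summarize(trades)
```

For ABC the weighted average is 4600 / 40 = 115, whereas a plain average of the two prices would give 110, which is why an unweighted AVERAGEIFS does not match.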
I want to get the location of each RefID based on the smallest positive number. The answer for the example below is:
Table
RefID     Location     Number
vT70SAAS  Sixth Floor  311.39
wmhXXAAY  Sixth Floor   35.57
wm2xcAAA  Rooftop        7.55
I tried =MIN(IF(F27:F30>0,F27:F30)) as an array formula, but it returns the overall minimum positive value. I need one for each RefID.
Sample data:
Number   Location      RefID
  -3.50  Basement      wmhXXAAY
 -32.39  First Floor   wm2xcAAA
 524.71  Second Floor  vT70SAAS
  -7.19  Second Floor  wm2xcAAA
  61.81  Third Floor   wm2xcAAA
 150.63  Third Floor   wmhXXAAY
 467.76  Fifth Floor   wm2xcAAA
 102.30  Fifth Floor   wmhXXAAY
 311.39  Sixth Floor   vT70SAAS
  35.57  Sixth Floor   wmhXXAAY
 521.51  Rooftop       vT70SAAS
   7.55  Rooftop       wm2xcAAA
 244.54  Rooftop       wm2xcAAA
One more option:
=AGGREGATE(15,6,(1/(($A$2:$A$14>0)*($C$2:$C$14=F2)))*$A$2:$A$14,1)
A pivot table is the best way. Especially if you are going to deal with larger datasets.
Make one from your data. Add RefID as the ROW and Min(Number) as DATA
Then - If you want to add a value back to your original table (in another, adjacent column) - you could do a lookup on the pivot table.
=VLOOKUP(A1,F1:G5,2,FALSE)
Where F1:G5 is the range of the pivot table.
If, for some reason, you need the VLOOKUP to work without being able to manually select the pivottable, then see this: https://answers.microsoft.com/en-us/msoffice/forum/msoffice_excel-mso_other/use-pivot-table-name-in-a-formula/17aa5ad3-eee4-4a5c-a70c-a9296853066d
Just add another condition to the IF():
With data starting in the second row, array enter:
=MIN(IF((F2:F14>0)*(H2:H14="vT70SAAS"),F2:F14))
Note we use multiply rather than AND()
(do the same for the other IDs)
You can also do this using Power Query, available in Excel 2010+
Most steps can be done from the UI.
The Table renaming can be done manually, but I customized the code a bit in case you wind up adding extra columns.
Algorithm
Filter the table for values > 0
Group Rows by RefID, with Operation = All Rows
Add a Custom Column with formula Table.Min([Grouped],"Number") to extract, for each RefID, the row with the lowest Number
Expand the Resultant column
ReOrder the columns into the desired final order
Remove the Extra columns
Rename the resultant columns.
M-Code
You may need to change the Table Name in Line 2
let
Source = Excel.CurrentWorkbook(){[Name="refIDtbl"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Number", type number}, {"Location", type text}, {"RefID", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Number] > 0),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"RefID"}, {{"Grouped", each _, type table [Number=number, Location=text, RefID=text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Min([Grouped],"Number")),
#"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom", "Custom", {"Number", "Location", "RefID"}, {"Custom.Number", "Custom.Location", "Custom.RefID"}),
#"Reordered Columns" = Table.ReorderColumns(#"Expanded Custom",{"RefID", "Grouped", "Custom.Location", "Custom.RefID", "Custom.Number"}),
#"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Grouped", "Custom.RefID"}),
oldColNames = Table.ColumnNames(#"Removed Columns"),
newColNames = List.ReplaceValue(oldColNames,"Custom.","",Replacer.ReplaceText),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",List.Zip({oldColNames,newColNames}))
in
#"Renamed Columns"
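The same filter, group and take-the-minimum-row logic can be checked quickly in Python against the sample data from the question. This is only a sketch for verifying the expected result; the function name is hypothetical.

```python
def smallest_positive_by_ref(rows):
    """rows: iterable of (number, location, ref_id).

    Returns {ref_id: (number, location)} for the smallest positive number.
    """
    best = {}
    for number, location, ref_id in rows:
        if number <= 0:
            continue  # keep strictly positive values only
        if ref_id not in best or number < best[ref_id][0]:
            best[ref_id] = (number, location)
    return best

# Sample data from the question.
rows = [
    (-3.50, "Basement", "wmhXXAAY"),
    (-32.39, "First Floor", "wm2xcAAA"),
    (524.71, "Second Floor", "vT70SAAS"),
    (-7.19, "Second Floor", "wm2xcAAA"),
    (61.81, "Third Floor", "wm2xcAAA"),
    (150.63, "Third Floor", "wmhXXAAY"),
    (467.76, "Fifth Floor", "wm2xcAAA"),
    (102.30, "Fifth Floor", "wmhXXAAY"),
    (311.39, "Sixth Floor", "vT70SAAS"),
    (35.57, "Sixth Floor", "wmhXXAAY"),
    (521.51, "Rooftop", "vT70SAAS"),
    (7.55, "Rooftop", "wm2xcAAA"),
    (244.54, "Rooftop", "wm2xcAAA"),
]
result = smallest_positive_by_ref(rows)
```

The result matches the expected answer in the question: Sixth Floor for vT70SAAS and wmhXXAAY, Rooftop for wm2xcAAA.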
I have a few columns of data and I need to convert the Excel "PERCENTILE" function into Power Query.
I have some code which I add as a function, but it doesn't apply accurately because it doesn't allow for grouping the data by CATEGORY and YEAR. Anything that is in Full Discretionary 1.5-2.5 AND 2014 needs to be added to one percentile array; equally, anything that falls in Full Discretionary 2.5-3.5 AND 2014 needs to go into a different percentile array.
let
Source = (list as any, k as number) => let
Source = list,
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Sorted Rows" = Table.Sort(#"Converted to Table",{{"Column1", Order.Ascending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "TheIndex", each Table.RowCount(#"Converted to Table")*k/100),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each [Index] >= [TheIndex] and [Index] <= [TheIndex]+1),
Custom1 = List.Average(#"Filtered Rows"[Column1])
in
Custom1
in
Source
So Expected results would be that anything that matches off on the 2 columns (Year,Category) should be applied within the same array. Currently invoking the above function just gives me errors.
I have also tried using grouping and outputting the "Min, Median, and Max" outputs but I also require 10% and 90% Percentiles.
Thank you in advance
Based on some findings on other websites and a lot of googling (most folks just want to use DAX, but if you're only using Power Query you can't!), I found that someone had posted an answer which is very helpful:
https://social.technet.microsoft.com/Forums/en-US/a57bfbea-52d1-4231-b2de-fa993d9bb4c9/can-the-quotpercentilequot-be-calculated-in-power-query?forum=powerquery
Basically:
//PercentileInclusive Function
(inputSeries as list, percentile as number) =>
let
SeriesCount = List.Count(inputSeries),
PercentileRank = percentile*(SeriesCount-1)+1, //percentile value between 0 and 1
PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
Percentile1 = List.Max(List.MinN(inputSeries,PercentileRankRoundedDown)),
Percentile2 = List.Max(List.MinN(inputSeries,PercentileRankRoundedUp)),
Percentile = Percentile1+(Percentile2-Percentile1)*(PercentileRank-PercentileRankRoundedDown)
in
Percentile
The above will replicate the PERCENTILE function found within Excel. You add it as a query using "New Query" and the Advanced Editor, then call it after grouping your data:
Table.Group(RenamedColumns, {"Country"}, {{"Sales Total", each
List.Sum([Amount Sales]), type number}, {"95 Percentile Sales", each
List.Average([Amount Sales]), type number}})
In the above formula, RenamedColumns is the name of the previous step
in the script. Change the name to match your actual case. I've assumed
that the pre-grouping sales amount column is "Amount Sales." Names of
grouped columns are "Sales Total" and "95 Percentile Sales."
Next modify the group formula, substituting List.Average with
PercentileInclusive:
Table.Group(RenamedColumns, {"Country"}, {{"Sales Total", each
List.Sum([Amount Sales]), type number}, {"95 Percentile Sales", each
PercentileInclusive([Amount Sales],0.95), type number}})
This worked for my data set and matches similar
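As a cross-check of the interpolation, here is a minimal Python sketch of the same rank formula, percentile*(count-1)+1, applied per (Year, Category) group. The sample values and function names are assumptions for illustration; only the category labels come from the question.

```python
import math
from collections import defaultdict

def percentile_inclusive(values, fraction):
    """Interpolated percentile of a non-empty list; fraction is between 0 and 1."""
    ordered = sorted(values)
    rank = fraction * (len(ordered) - 1) + 1   # 1-based fractional rank
    lower = math.floor(rank)
    upper = math.ceil(rank)
    low_value = ordered[lower - 1]
    high_value = ordered[upper - 1]
    return low_value + (high_value - low_value) * (rank - lower)

def percentile_by_group(rows, fraction):
    """rows: (year, category, value) tuples; one percentile per (year, category)."""
    groups = defaultdict(list)
    for year, category, value in rows:
        groups[(year, category)].append(value)
    return {key: percentile_inclusive(vals, fraction) for key, vals in groups.items()}

# Hypothetical values for the two groups named in the question.
low_band = (2014, "Full Discretionary 1.5-2.5")
high_band = (2014, "Full Discretionary 2.5-3.5")
rows = [low_band + (v,) for v in (10.0, 20.0, 30.0, 40.0, 50.0)]
rows += [high_band + (v,) for v in (1.0, 2.0, 3.0, 4.0)]
p90 = percentile_by_group(rows, 0.9)
p50 = percentile_by_group(rows, 0.5)
```

Each group gets its own array, so the 90th percentile of one band is not affected by values from the other.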
I'm trying to create a query that sums up a column of values and puts the sum as a new row in the same table. I know I can do this using the group function but it doesn't do it exactly as I need it to do. I'm trying to create an accounting Journal Entry and I need to calculate the offsetting for a long list of debits. I know this is accountant talk. Here's a sample of the table I am using.
Date  GL Num  GL Name  Location  Amount
1/31  8000    Payroll  Office    7000.00
1/31  8000    Payroll  Remote    1750.00
1/31  8000    Payroll  City      1800.00
1/31  8010    Taxes    Office     600.00
1/31  8010    Taxes    Remote     225.00
1/31  8010    Taxes    City       240.00
1/31  3000    Accrual  All       (This needs to be the negative sum of all other rows)
I have been using the Group By functions and grouping by Date with the result being the sum of Amount but that eliminates the previous rows and the four columns except Date. I need to keep all rows and columns, putting the sum in the same Amount column if possible. If the sum has to be in a new column, I can work with that as long as the other columns and rows remain. I also need to enter the GL Num, GL Name, and Location values for this sum row. These three values will not change. They will always be 3000, Accrual, All. The date will change based upon the date used in the actual data. I would prefer to do this all in Power Query (Get & Transform) if possible. I can do it via VBA but I'm trying to make this effortless for others to use.
What you can do is calculate the accrual rows in a separate query and then append them.
Duplicate your query.
Group by Date and sum over Amount. This should return the following:
Date Amount
1/31 11615
Multiply your Amount column by -1. (Transform > Standard > Multiply)
Add custom columns for GL Num, GL Name and Location with the fixed values you choose.
Date Amount GL Num GL Name Location
1/31 -11615 3000 Accrual All
Append this table to your original. (Home > Append Queries.)
You can also roll this all up into a single query like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
OriginalTable = Table.TransformColumnTypes(Source,{{"Date", type date}, {"GL Num", Int64.Type}, {"GL Name", type text}, {"Location", type text}, {"Amount", type number}}),
#"Grouped Rows" = Table.Group(OriginalTable, {"Date"}, {{"Amount", each List.Sum([Amount]), type number}}),
#"Multiplied Column" = Table.TransformColumns(#"Grouped Rows", {{"Amount", each _ * -1, type number}}),
#"Added Custom" = Table.AddColumn(#"Multiplied Column", "GL Num", each 3000),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "GL Name", each "Accrual"),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Location", each "All"),
#"Appended Query" = Table.Combine({OriginalTable, #"Added Custom2"})
in
#"Appended Query"
Note that we are appending the last step with an earlier step in the query instead of referencing a different query.
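If it helps to see the logic without the Power Query step names, the approach boils down to this Python sketch: total the amounts per date, negate the total, attach the fixed GL values and append the new row. The dictionary keys mirror the column names in the question; the function name is hypothetical.

```python
def with_accrual_rows(rows):
    """Return rows plus one offsetting accrual row per date."""
    totals = {}
    for row in rows:
        totals[row["Date"]] = totals.get(row["Date"], 0.0) + row["Amount"]
    accruals = [
        {"Date": day, "GL Num": 3000, "GL Name": "Accrual",
         "Location": "All", "Amount": -total}   # negative sum of the debits
        for day, total in totals.items()
    ]
    return list(rows) + accruals

# Sample rows from the question.
debits = [
    ("1/31", 8000, "Payroll", "Office", 7000.00),
    ("1/31", 8000, "Payroll", "Remote", 1750.00),
    ("1/31", 8000, "Payroll", "City", 1800.00),
    ("1/31", 8010, "Taxes", "Office", 600.00),
    ("1/31", 8010, "Taxes", "Remote", 225.00),
    ("1/31", 8010, "Taxes", "City", 240.00),
]
keys = ("Date", "GL Num", "GL Name", "Location", "Amount")
rows = [dict(zip(keys, debit)) for debit in debits]
journal = with_accrual_rows(rows)
```

All original rows and columns are kept, and the amounts in the resulting journal entry sum to zero for each date.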