Count active members between 2 dates PowerPivot DAX

Apologies if I make any errors, first time posting here!
I have a dataset that I've read into the Excel data model using PowerQuery. I've split it into 3 tables linked through a unique ID field (one main table with just the unique IDs and general info, then two tables linked from it).
What I want to do is take one of the linked tables that looks like this:
ID | Start Date | End Date | Category
123456 | 01/01/2000 | 01/01/2001 | A
I've created a separate date table, and what I want is a count of every active ID for each month of the date table. I managed this using CALCULATE and FILTER in a calculated column on the date table, but when I load that into the Pivot it ignores the categories.
I tried relating the date table using the start date field of the other table but it didn't make any difference.
I've found tonnes of PowerBI solutions that involve calculated tables but being Excel based is a requirement.
Thanks in advance!

I'm afraid that to expand the date interval in Power Query we need to write a line of M code.
This is a small example that creates a sample table with the columns of the table in your question. I used different values to keep the example simple.
The idea is to expand each date interval into an M list containing every date in the interval, then use this list to create the new rows with the new column "Date".
The last step removes the "Start Date" and "End Date" columns.
This code can be pasted directly into the Advanced Editor for a new blank query.
let
    Source = #table(
        type table
        [
            #"ID"=number,
            #"Start Date"=date,
            #"End Date"=date,
            #"Category"=text
        ],
        {
            {1,#date(2020,1,1),#date(2020,1,2), "A"},
            {2,#date(2020,1,10),#date(2020,1,12), "A"}
        }
    ),
    SourceWithList = Table.AddColumn(Source, "Date",
        each List.Dates([Start Date], Duration.Days([End Date] - [Start Date]) + 1, #duration(1, 0, 0, 0))),
    #"Expanded DateList" = Table.ExpandListColumn(SourceWithList, "Date"),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded DateList",{"Start Date", "End Date"})
in
    #"Removed Columns"
The Source step is needed only for the example, to create the starting table.
The SourceWithList step is the M code that has to be written by hand: it adds a column with Table.AddColumn() and fills it using List.Dates().
List.Dates() requires the start date, the number of dates to generate, and the step interval.
The number of dates is computed with Duration.Days(), which returns the difference between the two dates as a number of days; adding 1 makes the interval include the end date.
The #"Expanded DateList" step can be created from the Power Query interface by clicking "Expand to New Rows" in the column menu. The screenshots I took are from Power BI, but the Power Query interface in Excel is very similar.
Then remove the "Start Date" and "End Date" columns by selecting them and clicking "Remove Columns".
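To make the logic of the expansion concrete, here is a minimal Python sketch of the same idea, not the Power Query implementation itself. The sample rows and the names expand and counts are made up for illustration. Each interval is expanded into one row per day, and the distinct IDs are then counted per month and category, which is the figure the question is after.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample rows: (id, start_date, end_date, category)
rows = [
    (1, date(2020, 1, 1), date(2020, 1, 2), "A"),
    (2, date(2020, 1, 10), date(2020, 2, 12), "A"),
    (3, date(2020, 2, 1), date(2020, 2, 3), "B"),
]

def expand(row):
    """Yield one (id, day, category) tuple per day in the inclusive interval."""
    member_id, start, end, category = row
    # +1 so that the end date itself is included, as in the M code
    for offset in range((end - start).days + 1):
        yield member_id, start + timedelta(days=offset), category

# Collect the distinct IDs active in each (year, month, category) bucket.
active = defaultdict(set)
for row in rows:
    for member_id, day, category in expand(row):
        active[(day.year, day.month, category)].add(member_id)

counts = {key: len(ids) for key, ids in sorted(active.items())}
print(counts)
```

Counting distinct IDs rather than rows matters here, because after the expansion each member appears once per active day.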

Related

Translate Excel Formula to Power Query

In my Power Query I have a column that shows different durations on certain items, but it displays an error when I attempt to convert it to time or duration.
As a workaround, next to my Excel table I created a formula that converts the duration to the format I wish to use, but I have not been able to translate the formula into something Power Query can understand (I am pretty new to Power Query).
This is how the data is pulled from source:
But I would like it to show like this:
The Excel Formula I am using to accomplish this is:
=IF(LEN([@Age])=7,"0"&[@Age],IF(LEN([@Age])=5,"00:"&[@Age],IF(LEN([@Age])=4,"00:0"&[@Age],IF(LEN([@Age])=3,"00:00"&[@Age],[@Age]))))
It would be nice to have it in Power Query instead of the Excel sheet, as it serves as a learning opportunity.
I am self-learning Power Query in Excel, so any help is welcome.
EDIT: If the duration is more than 24:00:00, how should I approach it?
Here is the error code it returns
You can add a custom column with the formula:
Duration.FromText(
Text.Combine(
List.LastN(
{"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),
3),
":"))
The formula:
Splits the text string on the colon into a list.
Replaces blank elements with "00" and prepends a "00" element to the list.
Retrieves the last three elements and combines them into a colon-separated text string.
Uses the Duration.FromText function to convert that string to a duration.
Finally, set the data type of the column to duration.
In the PQ Editor, a duration will have the format of d.hh:mm:ss, but when you load it back into Excel, you can change that to [hh]:mm:ss
You can do all of the above in the PQ user interface.
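As an illustration of the padding logic only, here is a small Python sketch of the same steps. The function name parse_age is made up, and the sketch prepends two "00" elements rather than one so that a bare seconds value also parses, which is a small departure from the formula above.

```python
from datetime import timedelta

def parse_age(age: str) -> timedelta:
    """Pad a partial h:mm:ss string on the left and convert it to a duration."""
    # Split on the colon and replace empty pieces (e.g. from ":45") with "00"
    parts = ["00" if p == "" else p for p in age.split(":")]
    # Prepend padding, then keep only the last three elements: h, m, s
    hours, minutes, seconds = (["00", "00"] + parts)[-3:]
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))

print(parse_age("1:23:45"))
print(parse_age("3:45"))
print(parse_age(":45"))
```

Taking the last three elements after padding is what lets one expression handle every input length, instead of one branch per length as in the nested IF formula.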
Here is M-Code that does the same thing:
let
    Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Duration", each Duration.FromText(
        Text.Combine(
            List.LastN(
                {"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),
            3),
        ":"))),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Age"})
in
    #"Removed Columns"
You can even do it (using M-Code in the Advanced Editor) without adding a column by using the Table.TransformColumns function:
let
    Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
    #"Change to Duration" = Table.TransformColumns(#"Changed Type",
        {"Age", each Duration.FromText(
            Text.Combine(
                List.LastN(
                    {"00"} & List.ReplaceValue(Text.Split(_,":"),"","00",Replacer.ReplaceValue),
                3),
            ":")), type duration})
in
    #"Change to Duration"
All result in:
Edit
With your modified data, now showing duration values of more than 23 hours (not allowed in a duration literal in PQ), the transformation will be different. We have to check the hours and, if there are more than 23, break them into days and hours.
Note: the edit below also assumes there will never be anything entered in the day position, and that entries for minutes and seconds will always be within range. If there might be day values, you will need to add them to the "overflow" from the hours entry.
So we change the Custom Column formula to check for that:
let
    split = List.LastN({"00","00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),4),
    s = Number.From(List.Last(split)),
    m = Number.From(List.LastN(split,2){0}),
    hTotal = Number.From(List.LastN(split,3){0}),
    h = Number.Mod(hTotal,24),
    d = Number.IntegerDivide(hTotal,24)
in
    #duration(d,h,m,s)
If you might have illegal values for minutes or seconds, you can add logic to check for that as well.
Also, if you will be loading this into Excel and you might have totals of more than 31 days, you will need to format the cells (in Excel) as [hh]:mm:ss, because with the format d.hh:mm:ss Excel cannot display more than 31 days (although the proper value will be stored in the cell).
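The hours overflow can be sketched in Python as follows. This is only an illustration of the arithmetic, under the same assumptions as the note above (no day value in the input, minutes and seconds in range); the name split_age is hypothetical.

```python
def split_age(age: str) -> tuple:
    """Return (days, hours, minutes, seconds) for an h:mm:ss string
    whose hours part may exceed 23."""
    # Replace empty pieces with "00" and pad on the left, as before
    parts = ["00" if p == "" else p for p in age.split(":")]
    h_total, m, s = (int(p) for p in (["00", "00"] + parts)[-3:])
    # Whole days are the integer quotient, remaining hours the remainder
    d, h = divmod(h_total, 24)
    return d, h, m, s

print(split_age("49:05:10"))
print(split_age("23:59:59"))
```

divmod plays the role of the Number.IntegerDivide and Number.Mod pair in the custom column formula.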

How to convert categorical values into columns in Excel?

I am working with a dataset that is structured like the one below. As you can see, the indicator column contains binary categorical data.
country_code indicator cumulative_count
AFG cases 52909
AFG deaths 2230
... ... ...
I would like to turn the indicator column into two separate columns (corresponding with the values of indicator: cases and deaths). I.e. I'm expecting the final result to be like this:
country_code cases deaths
AFG 52909 2230
... ... ...
Notes:
The original dataset is publicly accessible from the ECDC website.
I am only interested in the cumulative_count of one specific year_week (2020-53).
Here is a screenshot of the dataset:
This can also be accomplished using Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac)
To use Power Query
Load your data table into Excel
Select some cell in your Data Table
Data => Get&Transform => from Table/Range or from within sheet
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
let
    //Read in the table
    //Change table name in next line to your actual table name
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    //Remove the unneeded columns
    #"Removed Other Columns" = Table.SelectColumns(Source,{"country_code", "indicator", "year_week", "cumulative_count"}),
    //Set the data types for those columns
    #"Set Data Type" = Table.TransformColumnTypes(#"Removed Other Columns",{
        {"country_code", type text}, {"indicator", type text},{"year_week", type text},{"cumulative_count", Int64.Type}
        }),
    //Pivot the Indicator column and aggregate by Sum
    #"Pivoted Column" = Table.Pivot(#"Set Data Type",
        List.Distinct(#"Removed Other Columns"[indicator]), "indicator", "cumulative_count", List.Sum),
    //Filter to show only the relevant year-week for rows where there is a country_code
    // (the others refer to continents)
    #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([country_code] <> null) and ([year_week] = "2020-53"))
in
    #"Filtered Rows"
filtered to show just 2020-53
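For readers who want to see the pivot logic outside Power Query, here is a rough Python sketch. The two AFG rows for 2020-53 come from the question; the other rows and the name pivot_week are invented for illustration. The sketch filters before pivoting, whereas the query above pivots first, but the result for a single week is the same.

```python
# Long-format rows. Only the first two come from the question; the rest are made up.
rows = [
    {"country_code": "AFG", "indicator": "cases", "year_week": "2020-53", "cumulative_count": 52909},
    {"country_code": "AFG", "indicator": "deaths", "year_week": "2020-53", "cumulative_count": 2230},
    {"country_code": "AFG", "indicator": "cases", "year_week": "2020-52", "cumulative_count": 51000},
    {"country_code": None, "indicator": "cases", "year_week": "2020-53", "cumulative_count": 1},
]

def pivot_week(rows, week):
    """Turn the indicator values into columns for one year_week, summing counts."""
    wide = {}
    for r in rows:
        # Skip continent-level rows (no country_code) and other weeks
        if r["country_code"] is None or r["year_week"] != week:
            continue
        record = wide.setdefault(r["country_code"], {})
        record[r["indicator"]] = record.get(r["indicator"], 0) + r["cumulative_count"]
    return wide

result = pivot_week(rows, "2020-53")
print(result)
```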
If I'm understanding your question correctly, here is one way:
Add new column F
Formula in $F$2: =SUMIFS($D2:$D$9999, $B2:$B$9999, $B2, $E2:$E$9999, "deaths")
Copy the formula down through the last record
Filter column E for "cases"
If you then insert rows above the header row, you can use SUBTOTAL(109, ...) to view cumulative counts for a specific year, or alternatively add another column with SUMIF as shown above

How to combine multiple columns from a table

My issue is the following: I have a table where I have multiple columns that have date and values but represent different things. Here is an example for my headers:
I Customer name I Type of Service I Payment 1 date I Payment 1 amount I Payment 2 date I Payment 2 amount I Payment 3 date I Payment 3 amount I Payment 4 date I Payment 4 amount I
What I want to do is sumifs the table based on multiple criteria. For example:
I Type of Service I Month 1 I Month 2 I Month 3 I Month 4
Service 1
Service 2
Service 3
The thing is that I do not want to write 4 sumifs (in this case; in fact I have more than 4 sets of date:value columns).
I was thinking of creating a new table where I could put all the columns below each other (in one table with 4 columns - Customer name, Type of Service, Date and Payment) but the table should be dynamically created, meaning that it should be expanded dynamically with the new entries in the original table (i.e. if the original table has 200 entries, this would make the new table with 4x200=800 entries, if the original table has one more record then the new table should have 4x201=804 records).
I also checked the PowerQuery option but could not get my head around it.
So any help on the matter will be highly appreciated.
Thank you.
You can certainly create your four column table using Power Query. However, I suspect you may be able to also generate your final report using PQ, so you could add that to this code, if you wish.
It will also pick up new entries, but only when a "Refresh" is run.
The "Refresh" could be triggered by
User selecting the Data/Refresh option
A button on the worksheet which user would have to press.
A VBA event-triggered macro
In any event, making the query adaptable to different numbers of columns requires more M code than can be generated from the UI, as well as a custom function.
The algorithm below depends on the data being in this format:
Columns 1 and 2 would be Customer | Type of Service
Remaining columns would alternate between Date | Amount and be Labelled: Payment N Date | Payment N Amount where N is some number
If the real data is not in that format, some changes to the code may be necessary.
To use Power Query:
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
To enter the Custom Function, while in the PQ Editor
Right click in the Queries Pane
Add New Query from Blank Query
Paste the custom function code into the Advanced Editor
Rename the query to fnPivotAll
M Code
let
    //Change Table name in next line to be the Actual table name in your workbook
    Source = Excel.CurrentWorkbook(){[Name="Table8"]}[Content],
    /*set datatypes dynamically with
      first two columns as Text
      and subsequent columns alternating as Date and Currency*/
    textType = List.Transform(List.FirstN(Table.ColumnNames(Source),2), each {_,Text.Type}),
    otherType = List.RemoveFirstN(Table.ColumnNames(Source),2),
    dateType = List.Transform(
        List.Alternate(otherType,1,1,1), each {_, Date.Type}),
    currType = List.Transform(
        List.Alternate(otherType,1,1,0), each {_, Currency.Type}),
    colTypes = List.Combine({textType, dateType, currType}),
    typeIt = Table.TransformColumnTypes(Source,colTypes),
    //Unpivot all except first two columns
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(typeIt, List.FirstN(Table.ColumnNames(Source),2), "Attribute", "Value"),
    //Remove "Payment n " from attribute column
    remPmtN = Table.TransformColumns(#"Unpivoted Other Columns",{{"Attribute", each Text.Split(_," "){2}, Text.Type}}),
    //Pivot on the Attribute column without aggregation using Custom Function
    pivotAll = fnPivotAll(remPmtN,"Attribute","Value"),
    typeIt2 = Table.TransformColumnTypes(pivotAll,{{"date", Date.Type},{"amount", Currency.Type}})
in
    typeIt2
Custom Function: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
 ColToPivot as text,
 ColForValues as text)=>
let
    PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
    #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
        let
            PartialRecord = Record.SelectFields(rec,fieldnames),
            RecordToList = Record.ToList(PartialRecord),
            Table = Table.FromColumns(RecordToList,fieldnames)
        in
            Table,
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"
Sample Data
Output
If this does not give you what you require, or if you have issues going further with it to generate your desired reports, post back.
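As a rough illustration of what the query produces, here is a Python sketch of the same reshaping. The sample data and the name unpivot_payments are hypothetical, and the sketch assumes the layout described above: two key columns followed by alternating date and amount columns.

```python
from datetime import date

# Hypothetical rows: customer, service, then (date, amount) pairs
data = [
    ["Alice", "Service 1", date(2021, 1, 5), 100.0, date(2021, 2, 5), 150.0],
    ["Bob", "Service 2", date(2021, 1, 20), 80.0, None, None],
]

def unpivot_payments(data, key_count=2):
    """Return one (customer, service, date, amount) tuple per payment."""
    out = []
    for row in data:
        keys, rest = row[:key_count], row[key_count:]
        # Pair each date column with the amount column that follows it
        for pay_date, amount in zip(rest[0::2], rest[1::2]):
            if pay_date is None and amount is None:
                continue  # empty payment slot, dropped as unpivot drops nulls
            out.append((*keys, pay_date, amount))
    return out

long_rows = unpivot_payments(data)
for r in long_rows:
    print(r)
```

Because the pairs are taken by position, the sketch adapts to any number of payment columns, which is the property the dynamic typing steps in the M code are there to provide.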

Excel: how to count combinations of cells over multiple columns?

Example of my data
If I have the data as shown in the picture (real data has same form but is much larger), how would I count how many times a certain combination, for example the combination Dinner - Pasta, occurs per ID? Ideally I would like to make a table in another tab showing per ID the count for all possible combinations.
Thanks in advance!
Try SUMPRODUCT:
=SUMPRODUCT((I2=$A$2:$A$7)*(J2=$B$2:$F$7)*(K2=$C$2:$G$7))
Highlight your entire data range and choose Insert - Table
In the table ribbon, change your table name to "InputTable"
In the Get & Transform section of the Data ribbon, click From Table. This will bring up a PowerQuery window. In the PowerQuery window:
Create a new query (Click either Home - Manage - Reference... or Home - New Sources - Other Sources - Blank Query... it doesn't really matter, we just want to create a new query and we're going to replace its contents in the next steps anyway)
Change the query name (in the right sidebar) to "ffTableForDay"
Click Home - Advanced Editor
Insert the following code:
// Called "ffTable*" because it's a Function that returns a Function that returns a Table.
// Returns a function specific to a table that takes a day.
// Returned function takes a day and returns a table of meals for that day.
(table as table) as function => (day as text) as table =>
let
    #"Type Column Name" = day & "_type",
    #"Food Column Name" = day & "_Food",
    #"Removed Other Columns" = Table.SelectColumns(table,{"ID", #"Type Column Name", #"Food Column Name"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{#"Type Column Name", "Type"}, {#"Food Column Name", "Food"}}),
    #"Removed Blank Rows" = Table.SelectRows(#"Renamed Columns", each [Type] <> null and [Type] <> "" and [Food] <> null and [Food] <> ""),
    #"Add Day" = Table.AddColumn(#"Removed Blank Rows", "Day", each day, type text)
in
    #"Add Day"
Create a new query
Change the query name to "Meals"
Click Home - Advanced Editor
Insert the following code:
let
    Source = InputTable,
    Days = {"Monday", "Tuesday", "Wednesday"},
    #"Function Per Day" = ffTableForDay(Source),
    // get list of tables per call to ffTableForDay(InputTable)(day)
    #"Table Per Day" = List.Transform(Days, #"Function Per Day"),
    Result = Table.Combine(#"Table Per Day")
in
    Result
Create a new query
Change the query name to "ComboCount"
Click Home - Advanced Editor
Insert the following code:
let
    Source = Meals,
    // Created by clicking **Transform - Group By** and then, in the dialog box, clicking advanced and grouping by Food and Type
    #"Grouped Rows" = Table.Group(Source, {"Type", "Food"}, {{"Count", each Table.RowCount(_), type number}})
in
    #"Grouped Rows"
Click Home - Close & Load
If your query options were set to load queries to the workbook (default), then delete the "Meals" tab, if you wish. If your query options were to NOT load queries to the workbook by default then right-click on the "ComboCount" query in the side-bar and click "Load To..."
Alternatively
Once we have the "Meals" query working, instead of creating a "ComboCount" query, we could have
loaded Meals to the workbook and done a pivot table, or
loaded Meals to the data model and done a Power Pivot.
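The overall approach (stack the per-day column pairs into one long table, then count) can be sketched in Python as below. The column names follow the day & "_type" and day & "_Food" pattern used in the function above; the sample rows and the name meals are made up. Note that this sketch also keys the count by ID, which is what the question asks for, whereas the ComboCount query groups by Type and Food only.

```python
from collections import Counter

# Hypothetical wide rows, one pair of columns per day
rows = [
    {"ID": 1, "Monday_type": "Dinner", "Monday_Food": "Pasta",
     "Tuesday_type": "Lunch", "Tuesday_Food": "Salad"},
    {"ID": 1, "Monday_type": "Dinner", "Monday_Food": "Pasta",
     "Tuesday_type": "", "Tuesday_Food": ""},
    {"ID": 2, "Monday_type": "Lunch", "Monday_Food": "Pasta",
     "Tuesday_type": "Dinner", "Tuesday_Food": "Pasta"},
]
days = ["Monday", "Tuesday"]

def meals(rows, days):
    """Yield (id, day, type, food) for every non-blank meal entry."""
    for row in rows:
        for day in days:
            meal_type = row.get(day + "_type")
            food = row.get(day + "_Food")
            if meal_type and food:  # skips both None and ""
                yield row["ID"], day, meal_type, food

# Count each (ID, Type, Food) combination across all days
combo_counts = Counter((i, t, f) for i, _day, t, f in meals(rows, days))
print(combo_counts)
```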

PowerQuery COUNTIF Previous Dates

I'm a little rusty on PowerQuery.
I need to count "previous" entries in the same table.
For example, let's say we have a table of car sales.
For the purposes of PowerQuery, this table will be named tblCarSales
I need to add two aggregate columns.
The first aggregate column is the count of previous sales.
The Excel formula would be =COUNTIF([Sale Date],"<"&[@[Sale Date]])
The second aggregate column is the count of previous sales by make.
The Excel formula would be =COUNTIFS([Sale Date],"<"&[@[Sale Date]],[Make],[@Make])
How can this behavior be accomplished in PowerQuery, instead of using Excel formulas?
For example, I'm starting with the source statement:
let
    Source = Excel.CurrentWorkbook(){[Name="tblCarSales"]}[Content]
in
    Source
... where the source table only provides the Make, Model, and Sale Date columns.
You can do this sort of thing using List and Table functions. I'll show both.
let
    Source = Excel.CurrentWorkbook(){[Name="tblCarSales"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Previous Sale Count",
        (C) => List.Count(List.Select(Source[Sale Date],
            each _ < C[Sale Date]))),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Previous Sale Count By Make",
        (C) => Table.RowCount(Table.SelectRows(Source,
            (S) => S[Sale Date] < C[Sale Date] and S[Make] = C[Make])))
in
    #"Added Custom1"
We have to use the functions so that Power Query knows what context we're looking at the columns in. For further reading, check out this Power Query M Primer.
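The two contexts, the current row (C) and each row being compared against it (S), may be easier to see in a plain loop. Here is a small Python sketch of the same counting logic; the sample sales and the name with_previous_counts are invented for illustration.

```python
from datetime import date

# Hypothetical sample of tblCarSales
sales = [
    {"Make": "Ford", "Model": "Focus", "Sale Date": date(2021, 1, 1)},
    {"Make": "Ford", "Model": "Fiesta", "Sale Date": date(2021, 1, 3)},
    {"Make": "Audi", "Model": "A4", "Sale Date": date(2021, 1, 2)},
    {"Make": "Ford", "Model": "Focus", "Sale Date": date(2021, 1, 3)},
]

def with_previous_counts(sales):
    """Add counts of strictly earlier sales, overall and for the same make."""
    out = []
    for current in sales:  # plays the role of C in the M code
        earlier = [s for s in sales if s["Sale Date"] < current["Sale Date"]]
        same_make = [s for s in earlier if s["Make"] == current["Make"]]
        out.append({
            **current,
            "Previous Sale Count": len(earlier),
            "Previous Sale Count By Make": len(same_make),
        })
    return out

result = with_previous_counts(sales)
for r in result:
    print(r["Make"], r["Sale Date"], r["Previous Sale Count"], r["Previous Sale Count By Make"])
```

Like the M version, this compares every row against the whole table, so the work grows with the square of the row count; that is usually acceptable for modest tables.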
