Aggregation/Summation of text and numeric fields - excel

Raw data
SiteName-----Agency------Staff Numbers
Site1-----------A1------------10
Site1-----------A1------------12
Site1-----------A1------------11
Site1-----------A2-------------5
Wondering how I can get the following in my pivot report;
Site1-------------A1/A2---------33/5
a summation/aggregation of both the "Agency" and "Staff Numbers";
Note that I have successfully aggregated the "Agency" field (text) using CONCATENATEX with a "/" delimiter, but for the "Staff Numbers" field (numeric) I get the individual values concatenated rather than a sum per agency.
I get;
Site1--------A1/A2----------10/12/11/5--------> this is undesirable
I want;
Site1--------A1/A2----------33/5---------------> this is desirable

CONCATENATEX is a DAX function, isn't it?
In any event, you can do this in Power Query with a nested Table.Group function.
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"SiteName", type text}, {"Agency", type text}, {"Staff Numbers", Int64.Type}}),
//Group by SiteName
#"Grouped Rows" = Table.Group(#"Changed Type", {"SiteName"}, {
//then subGroup by Agency
{"Agency", each Table.Group(_,"Agency", {
{"Staff Numbers", each List.Sum(_[Staff Numbers])}
})}
}),
//Combine the agencies/staff
agencies = Table.AddColumn(#"Grouped Rows", "Agencies", each Text.Combine([Agency][Agency],"/")),
staff = Table.AddColumn(agencies, "Staff Numbers", each
Text.Combine(
List.Transform([Agency][Staff Numbers], each Text.From(_)),
"/")),
//remove unneeded column
#"Removed Columns" = Table.RemoveColumns(staff,{"Agency"})
in
#"Removed Columns"
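For anyone who wants to try the idea without a workbook table, here is a minimal self-contained sketch. It is not the query above: it swaps the nested Table.Group for two flat groupings in sequence, and it builds the question's four sample rows inline with #table. The step names (PerAgency, PerSite) and the intermediate column name "Staff" are my own, chosen for illustration.

```powerquery
let
    // Sample rows from the question, built inline (no workbook table needed)
    Source = #table(
        type table [SiteName = text, Agency = text, #"Staff Numbers" = Int64.Type],
        {
            {"Site1", "A1", 10},
            {"Site1", "A1", 12},
            {"Site1", "A1", 11},
            {"Site1", "A2", 5}
        }),
    // First pass: sum staff per SiteName/Agency pair
    PerAgency = Table.Group(Source, {"SiteName", "Agency"},
        {{"Staff", each List.Sum([Staff Numbers]), Int64.Type}}),
    // Second pass: one row per site, joining agencies and their sums with "/"
    PerSite = Table.Group(PerAgency, {"SiteName"}, {
        {"Agencies", each Text.Combine([Agency], "/"), type text},
        {"Staff Numbers", each Text.Combine(List.Transform([Staff], Text.From), "/"), type text}
    })
in
    PerSite
```

With the sample rows this should give a single row: Site1, A1/A2, 33/5. Paste it into a blank query (Advanced Editor) to check it against your own data layout.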

Related

Merging subsequent cell values only for same ids in excel

I have a requirement here in excel where I have to populate "Facility" column with Country and City names in a single cell for the distributed data in C and D columns but only for the same id.
I have attached the image for reference
Thanks for your time in advance
I tried CONCAT function and TEXTJOIN function but that didn't help
This can be accomplished using Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac)
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
//Change next line to reflect actual data source
Source = Excel.CurrentWorkbook(){[Name="Table31"]}[Content],
//set data types for each column
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Id", Int64.Type}, {"Country", type text}, {"City", type text}}),
//Group by ID
#"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {
//for each subgroup, group by Country
{"Facility", (t)=> let
grp=Table.Group(t,{"Country"},{
//Then combine all the cities in one text string
"Cities", (tc)=> "(" & Text.Combine(tc[City],",") & ")"}),
//Add index column to "number" the different country/cities combinations
#"Add Index" = Table.AddIndexColumn(grp, "Index",1),
#"Index to Text" = Table.TransformColumns(#"Add Index",{"Index", each Number.ToText(_,"0\. ")}),
//combine the separate subtable columns into one string
#"Combine Columns" = Table.CombineColumns(
#"Index to Text",{"Index","Country","Cities"},Combiner.CombineTextByDelimiter(""),"Facility"),
//combine the separate rows into a single row
#"Combine to One Row" = Text.Combine(#"Combine Columns"[Facility]," ")
in
#"Combine to One Row", type text},
{"All", each _, type table [Id=nullable number, Country=nullable text, City=nullable text]}}),
//Expand the Country and City columns
#"Expanded All" = Table.ExpandTableColumn(#"Grouped Rows", "All", {"Country", "City"})
in
#"Expanded All"
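The core of the query above is the function that turns one Id's rows into a single numbered string. Here is a simplified sketch of just that part. The sample rows are hypothetical (the question's data was only shown in an image), and the helper name facilityText is my own. Unlike the full query, it returns one row per Id and does not expand back to the original rows.

```powerquery
let
    // Hypothetical sample rows, for illustration only
    Source = #table(
        type table [Id = Int64.Type, Country = text, City = text],
        {
            {1, "India", "Delhi"},
            {1, "India", "Mumbai"},
            {1, "USA", "Boston"},
            {2, "France", "Paris"}
        }),
    // Build the Facility string for the rows of a single Id
    facilityText = (t as table) as text =>
        let
            // One row per country, with its cities joined as "(City1,City2)"
            byCountry = Table.Group(t, {"Country"},
                {{"Cities", each "(" & Text.Combine([City], ",") & ")", type text}}),
            // Number each country entry, e.g. "1. India(Delhi,Mumbai)"
            numbered = List.Transform(
                List.Positions(byCountry[Country]),
                each Text.From(_ + 1) & ". " & byCountry[Country]{_} & byCountry[Cities]{_})
        in
            Text.Combine(numbered, " "),
    // Apply the function to each Id's subtable
    Grouped = Table.Group(Source, {"Id"},
        {{"Facility", facilityText, type text}})
in
    Grouped
```

For the hypothetical rows, Id 1 should come out as "1. India(Delhi,Mumbai) 2. USA(Boston)" and Id 2 as "1. France(Paris)".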

PowerQuery - Forecast from table

I am trying to create a forecast (single table) for departments to input their assumptions on spending in a single table. Instead of entering amounts for every single month, I would like the user to enter the amount, frequency, start date, and end date for each category. To illustrate, see below the table with some sample data.
This is the result I am trying to get in Power Query (or Power BI). As I understand it, this layout is what allows date slicers and filters to work in a Power BI model when comparing against actuals.
If this can't be done with DAX and must instead be done in Excel (through lookup formulas), how would you structure the formula?
Here is a PQ example that creates what you show as your desired table given what you show as your input:
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to better understand the algorithm
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table9"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"G/L", Int64.Type}, {"Dimension", type text}, {"Description", type text},
{"Amount", Int64.Type}, {"Repeat Every", type text}, {"Start Date", type date}, {"End Date", type date}}),
//Last possible date as Today + 5 years (to end of month)
lastDt = Date.EndOfMonth(Date.AddYears(Date.From(DateTime.FixedLocalNow()),5)),
//Generate list of all possible dates for a given row using List.Generate function
allDates = Table.AddColumn(#"Changed Type", "allDates", each let
lastDate = List.Min({lastDt,[End Date]}),
intvl = {1,3,6}{List.PositionOf({"Monthly","Quarterly","Semi Annual"},[Repeat Every])}
in
List.Generate(
()=> [Start Date],
each _ <= lastDate,
each Date.EndOfMonth(Date.AddMonths(_,intvl)))),
//Remove unneeded columns and expand the list of dates
#"Removed Columns" = Table.RemoveColumns(allDates,{"Repeat Every", "Start Date", "End Date"}),
#"Expanded allDates" = Table.ExpandListColumn(#"Removed Columns", "allDates"),
//Sort to get desired output
// Date column MUST be sorted to ensure correct order when pivoted
// Other columns sorted alphanumerically, but could change the sort to reflect original order if preferred.
#"Sorted Rows" = Table.Sort(#"Expanded allDates",{
{"allDates", Order.Ascending},
{"G/L", Order.Ascending},
{"Dimension", Order.Ascending}}),
//Pivot the date column with no aggregation
#"Pivoted Column" = Table.Pivot(
Table.TransformColumnTypes(#"Sorted Rows", {
{"allDates", type text}}, "en-US"),
List.Distinct(Table.TransformColumnTypes(#"Sorted Rows", {{"allDates", type text}}, "en-US")[allDates]),
"allDates", "Amount")
in
#"Pivoted Column"
Original Data
Results
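To see what the List.Generate step produces for a single forecast line, here is a small standalone sketch. The start date, end date and frequency are hypothetical values of my own, and the Record.Field lookup is a substitute for the List.PositionOf lookup used in the query above.

```powerquery
let
    // Hypothetical inputs for one forecast line
    startDate = #date(2021, 1, 31),
    endDate = #date(2021, 12, 31),
    repeatEvery = "Quarterly",
    // Map the frequency text to a step in months
    stepMonths = Record.Field(
        [Monthly = 1, Quarterly = 3, #"Semi Annual" = 6],
        repeatEvery),
    // Start at startDate, step forward by stepMonths, snap to month end,
    // and stop once the date passes endDate
    dates = List.Generate(
        () => startDate,
        each _ <= endDate,
        each Date.EndOfMonth(Date.AddMonths(_, stepMonths)))
in
    dates
```

With these inputs the list should contain 31 Jan, 30 Apr, 31 Jul and 31 Oct 2021. Note that only the generated dates are snapped to month end; the first date is used exactly as entered, which is also how the full query behaves.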

Group by column A value, transpose column B, column C row values for each grouped column A value

This is in Excel 2016. I have a spreadsheet where each row represents a response to two questions "Qa" and "Qb" from a unique student. The spreadsheet columns are: "Section" (class section student is in), "Qa", and "Qb".
Thus, if three students answered from the same class section, that section will be listed three times under "Section", with each unique students answers in the other columns.
I want to group by section and spread the answers to each question across a single row in separate columns. The number of columns to create will default to the section with the most unique responses
In this case, 10003 has the greatest number of responses, so I want to get the following end result.
I am at a loss with how to get this going. Something like grouping by the section but transposing the rows within that group?
As @ScottCraner pointed out, you can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
M Code
let
//Change table name in next row to actual table name in workbook
Source = Excel.CurrentWorkbook(){[Name="Table20"]}[Content],
//set data type
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Section", Int64.Type}, {"Qa", type text}, {"Qb", type text}}),
//Group by Section
//Add a 1-based Index column to each Group
#"Grouped Rows" = Table.Group(#"Changed Type", {"Section"}, {
{"Row", each Table.AddIndexColumn(_,"Row",1,1)}}),
//Expand the grouped tables
#"Expanded Row" = Table.ExpandTableColumn(#"Grouped Rows", "Row", {"Qa", "Qb", "Row"}, {"Qa", "Qb", "Row"}),
//Unpivot
//Merge Row and Attribute columns to create the q-number headers
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Expanded Row", {"Section", "Row"}, "Attribute", "Value"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Unpivoted Other Columns",
{{"Row", type text}}, "en-US"),{"Attribute", "Row"},
Combiner.CombineTextByDelimiter("-", QuoteStyle.None),"Merged"),
//Pivot on the Sorted Merged column with no aggregation
#"Pivoted Column" = Table.Pivot(#"Merged Columns", List.Sort(List.Distinct(#"Merged Columns"[Merged])), "Merged", "Value")
in
#"Pivoted Column"
Note that there are no empty columns (in other words, there is no Qa-4), because the unpivot step drops null values.
If you really need an empty column, insert a step at the beginning that replaces nulls with an empty string:
let
//Change table name in next row to actual table name in workbook
Source = Excel.CurrentWorkbook(){[Name="Table20"]}[Content],
//set data type
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Section", Int64.Type}, {"Qa", type text}, {"Qb", type text}}),
//if you really need a blank Qa column since you have four distinct Qb rows but only 3 Qa rows,
// then we insert the next line
#"Replaced Value" = Table.ReplaceValue(#"Changed Type",null,"",Replacer.ReplaceValue,{"Qa", "Qb"}),
//Group by Section
//Add a 1-based Index column to each Group
#"Grouped Rows" = Table.Group(#"Replaced Value", {"Section"}, {
{"Row", each Table.AddIndexColumn(_,"Row",1,1)}}),
//Expand the grouped tables
#"Expanded Row" = Table.ExpandTableColumn(#"Grouped Rows", "Row", {"Qa", "Qb", "Row"}, {"Qa", "Qb", "Row"}),
//Unpivot
//Merge Row and Attribute columns to create the q-number headers
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Expanded Row", {"Section", "Row"}, "Attribute", "Value"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Unpivoted Other Columns",
{{"Row", type text}}, "en-US"),{"Attribute", "Row"},
Combiner.CombineTextByDelimiter("-", QuoteStyle.None),"Merged"),
//Pivot on the Sorted Merged column with no aggregation
#"Pivoted Column" = Table.Pivot(#"Merged Columns", List.Sort(List.Distinct(#"Merged Columns"[Merged])), "Merged", "Value")
in
#"Pivoted Column"
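The following is a compact sketch of the same group, index, unpivot, pivot sequence on inline data, so the effect of a null answer is easy to see. The sample rows are hypothetical, and the step names are mine. It builds the header with Table.AddColumn instead of Table.CombineColumns, which is a stylistic substitution rather than the method used above.

```powerquery
let
    // Hypothetical rows; the second student in 10001 left Qa blank
    Source = #table(
        type table [Section = Int64.Type, Qa = nullable text, Qb = nullable text],
        {
            {10001, "a1", "b1"},
            {10001, null, "b2"},
            {10002, "a3", "b3"}
        }),
    // Number the rows within each section, starting at 1
    Grouped = Table.Group(Source, {"Section"},
        {{"Rows", each Table.AddIndexColumn(_, "Row", 1, 1)}}),
    Expanded = Table.ExpandTableColumn(Grouped, "Rows", {"Qa", "Qb", "Row"}),
    // Unpivot Qa/Qb into Attribute/Value pairs; null values are dropped here
    Unpivoted = Table.UnpivotOtherColumns(Expanded, {"Section", "Row"}, "Attribute", "Value"),
    // Build headers such as "Qa-1", "Qb-2"
    WithHeader = Table.AddColumn(Unpivoted, "Header",
        each [Attribute] & "-" & Text.From([Row]), type text),
    Trimmed = Table.SelectColumns(WithHeader, {"Section", "Header", "Value"}),
    // Pivot with no aggregation: each Section/Header pair holds one value
    Pivoted = Table.Pivot(Trimmed,
        List.Sort(List.Distinct(Trimmed[Header])), "Header", "Value")
in
    Pivoted
```

With these rows the result should have columns Section, Qa-1, Qb-1 and Qb-2 but no Qa-2, since the only Qa value in row 2 was null. Replacing nulls with "" before grouping, as in the second query above, keeps that column.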

Transpose repeating data from rows into columns Excel

I have data set of basic housing data in the following format:
Existing data format:
That format is the same and repeats for hundreds of properties. I would like to transform it into a table like the following example:
Property Type | Price | Location | Region | Additional info | Area
House | 252000 | London | Kensington | | 4500 square meters
... | ... | ... | ... | | ...
etc
In other words, I want the text before the ":" symbol to become the column name and the text after it to become the data in the corresponding cell, repeated for hundreds of sites. Usually there is no data in Additional info, but sometimes there is.
I am not sure which is the best program to do this. So far Excel comes to mind, but if there is an easier way I will be glad to use it.
As shown in my screenshot below (Excel 365), I have used the following formulas.
C2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,1,4)),": ","</s><s>")&"</s></t>","//s[last()]")
D2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,2,4)),": ","</s><s>")&"</s></t>","//s[last()]")
E2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[2]")
F2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[last()-1]")
H2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,4,4)),": ","</s><s>")&"</s></t>","//s[last()]")
If you are not using Excel 365, you can try this instead and fill it down:
=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3),": ","</s><s>")&"</s></t>","//s[last()]")
Basically, =ROW(A1)+(ROW(A1)-1)*3 generates the row numbers 1, 5, 9, ... as it is filled down, and INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3) returns the value from column A at each of those rows. FILTERXML() then returns the node selected by the XPath parameter.
To learn how FILTERXML() works, you can read this article from JvdV. It is a fantastic article for FILTERXML() lovers.
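The "every fourth row starts a new record" idea behind the SEQUENCE and ROW formulas can also be sketched in Power Query M with List.Split. This is a different technique from the formulas above, not a translation of them. The input lines are hypothetical and simplified to one "name: value" pair per line, which may not match the real layout (only shown in an image).

```powerquery
let
    // Hypothetical input: four "name: value" lines per property
    lines = {
        "Property type: House", "Price: 252000", "Location: London", "Area: 4500 square meters",
        "Property type: Flat", "Price: 180000", "Location: Leeds", "Area: 80 square meters"
    },
    // Keep only the text after the first ": "
    values = List.Transform(lines, each Text.AfterDelimiter(_, ": ")),
    // Every block of four consecutive values becomes one row
    rows = List.Split(values, 4),
    result = Table.FromRows(rows, {"Property type", "Price", "Location", "Area"})
in
    result
```

For the hypothetical lines this should give two rows, one per property. It relies on every property having exactly four lines, the same assumption the COUNTA($A:$A)/4 part of the formulas makes.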
You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
Note: The fnPivotAll function is a custom function that enables a method of creating a non-aggregated Pivot Table where there are multiple values per Pivot Column. From the UI, you add this as a New Query from Blank, and just paste that M-code in place of what's there
M-Code (for main query)
let
//Read in data
//Change table name in next line to your actual table name
Source = Excel.CurrentWorkbook(){[Name="Table1_2"]}[Content],
//Split by comma into new rows
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1",
Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),
let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
//Remove the blank rows
#"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([Column1] <> "" and [Column1] <> " ")),
//Split by the rightmost colon only into new columns
#"Split Column by Delimiter1" = Table.SplitColumn(#"Filtered Rows", "Column1",
Splitter.SplitTextByEachDelimiter({":"}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
//Split by the remaining colon into new rows
// So as to have empty rows under "Additional data"
//Then Trim the columns to remove leading/trailing spaces
#"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter1", {{"Column1.1", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1.1"),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Column1.1", type text}, {"Column1.2", type text}}),
#"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
//Create new column processing "Additional Data" to show a blank
// and Price to just show the numeric value, splitting from "EUR"
#"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each if [Column1.1] = "Additional data" then " "
else if [Column1.1] = "Price" then Text.Split([Column1.2]," "){1} else [Column1.2]),
//Remove unneeded column
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1.2"}),
//non-aggregated pivot
pivot = fnPivotAll(#"Removed Columns","Column1.1","Custom"),
//set data types (frequently a good idea in PQ)
#"Changed Type1" = Table.TransformColumnTypes(pivot,{
{"Property type", type text},
{"Location", type text},
{"region", type text},
{"Additional data", type text},
{"Area", type text},
{"Price", Currency.Type}})
in
#"Changed Type1"
M-Code (for custom function)
be sure to rename this query: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"

Scan worksheet for missing items per ID

I have a worksheet with IDs of people visiting on certain days.
Simple example.
I want to scan all IDs to check whether they have missed a visit day. Visit days 1, 2, 3, 4 and 5 are mandatory.
I can't add code to this database, because it is locked (it is a worksheet with confidential info).
I don't know where to start.
The following solution is using Power Query which is available in Excel 2010 Professional Plus and all later versions of Excel. My demonstration is using Excel 365.
Suppose you have two tables:
Table1 is called Tbl_Visitday which is the 2-Column table in your example;
Table2 is called Rng_Obligated which is a 1-Column table containing all obligated days.
Go to the Data tab in the Excel ribbon and use the From Table function to add both tables to the Power Query editor one by one. The first time you open the editor, set up the Query Options as shown below to avoid loading every query to a new worksheet:
Once you have added both tables to the editor, make a duplicate of Tbl_Visitday in the Queries section on the left-hand side as shown below:
Let's work on Rng_Obligated first: highlight the column, use the Transpose function under the Transform tab to transpose the data from rows to columns, then use the Merge Columns function to merge all columns with a semicolon (;) delimiter. You should then have something like the following:
Let's move to Tbl_Visitday(2): remove the Visitday column, remove duplicates within the ID column and sort it ascending. You should then have:
Use the Append Queries function under the Home tab to append the Rng_Obligated table to the current table, then right-click the Merged column header and choose Fill -> Up to fill the Merged column with the same string. You should then have something like the following:
Filter the ID column to hide null, then use the Split Column function under the Transform tab to split the Merged column by the semicolon (;) delimiter, choosing in the advanced options to put the results into Rows as shown below:
Use the Merge Queries function under the Home tab to merge the Tbl_Visitday table with the current table, holding the Ctrl key to select the first and then the second column in each table as shown below:
Expand the newly merged column to show the Visitday column only, add a custom column using the formula =[Merged]=[Visitday], then filter the Custom column to show FALSE results only. You should then have:
Change the data type of the Merged column to Text, then use the Group By function under the Transform tab to group by ID with a Sum of the Merged column as shown below. The result will be an error, which is expected:
Go back to the last step in the APPLIED STEPS section on the right-hand side, go to the formula bar and replace List.Sum([Merged]) with Text.Combine([Merged],","). Press Enter and the error becomes a text string as shown below:
You can close and load the query which will be created as a connection if you have amended the query setting in the first step. You can click Queries & Connections under the Data tab and right click the query and choose to load it to a specific location in your workbook.
In your case, you will need to ask the owner of the shared workbook to unlock the workbook so you can use the power query editor and load the output. Alternatively you can copy and paste the data to a new workbook where you can execute the power query to obtain the result.
Power Query allows you to update your source tables and recalculate the output (once you choose to refresh the data) in the back-end normally in a few seconds. If you do not want the output to be refreshed, you can copy and paste the output to a new table so the results stay unchanged.
Here is the Power Query M code for the two queries for your reference. Let me know if you have any questions. Cheers :)
Rng_Obligated
let
Source = Excel.CurrentWorkbook(){[Name="Rng_Obligated"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"obligated", Int64.Type}}),
#"Transposed Table" = Table.Transpose(#"Changed Type"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Transposed Table", {{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}}, "en-AU"),{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7"},Combiner.CombineTextByDelimiter(";", QuoteStyle.None),"Merged")
in
#"Merged Columns"
Tbl_Visitday(2)
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_Visitday"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Visitday", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Visitday"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
#"Sorted Rows" = Table.Sort(#"Removed Duplicates",{{"ID", Order.Ascending}}),
#"Appended Query" = Table.Combine({#"Sorted Rows", Rng_Obligated}),
#"Filled Up" = Table.FillUp(#"Appended Query",{"Merged"}),
#"Filtered Rows" = Table.SelectRows(#"Filled Up", each ([ID] <> null)),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Filtered Rows", {{"Merged", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged", Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type1",{"ID", "Merged"},Tbl_Visitday,{"ID", "Visitday"},"Table6",JoinKind.LeftOuter),
#"Expanded Table6" = Table.ExpandTableColumn(#"Merged Queries", "Table6", {"Visitday"}, {"Visitday"}),
#"Added Custom" = Table.AddColumn(#"Expanded Table6", "Custom", each [Merged]=[Visitday]),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([Custom] = false)),
#"Changed Type2" = Table.TransformColumnTypes(#"Filtered Rows1",{{"Merged", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type2", {"ID"}, {{"MissedDay", each Text.Combine([Merged],","), type text}})
in
#"Grouped Rows"
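If you are comfortable editing M directly, the same result can be sketched more compactly with List.Difference. This is a different technique from the append/split/merge steps above, not a rewrite of them. The visit log is hypothetical, the obligated days are written as an inline list rather than read from Rng_Obligated, and Visitday is assumed to be numeric.

```powerquery
let
    // Hypothetical visit log: ID "A" missed days 3 and 5, ID "B" missed none
    Visits = #table(
        type table [ID = text, Visitday = Int64.Type],
        {
            {"A", 1}, {"A", 2}, {"A", 4},
            {"B", 1}, {"B", 2}, {"B", 3}, {"B", 4}, {"B", 5}
        }),
    // Days every ID is expected to have
    Obligated = {1, 2, 3, 4, 5},
    // For each ID, list the obligated days that do not appear in its visits
    Grouped = Table.Group(Visits, {"ID"}, {
        {"MissedDay", each Text.Combine(
            List.Transform(List.Difference(Obligated, [Visitday]), Text.From), ","),
            type text}
    }),
    // Keep only IDs that missed at least one day
    OnlyMissing = Table.SelectRows(Grouped, each [MissedDay] <> "")
in
    OnlyMissing
```

For the hypothetical log this should return one row: A with MissedDay "3,5". As with the step-by-step version, an ID that never appears in the visit table cannot be reported, because the list of IDs is taken from the visits themselves.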
