Pivoting multiple columns using PowerQuery in Excel

I'm trying to pivot multiple columns of the following table:
The result I would like to get is the following:
I'm using PowerQuery in Excel, but I can only manage to pivot a single column (for example, "Number"), not several at once. Does anyone have any insight into the correct usage of PowerQuery for this?

Here is an answer to the first version of your question:
let
src = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
lettersABC=List.Distinct(src[Attribute1]),
count=List.Count(lettersABC),
lettersNUM=List.Transform({1..count}, each "Letter"&Number.ToText(_)),
numbersNUM=List.Transform({1..count}, each "Number"&Number.ToText(_)),
group = Table.Group(src, {"ID"}, {{"attr", each Record.FromList(lettersABC&[Attribute2], lettersNUM&[Attribute1])}}),
exp = Table.ExpandRecordColumn(group, "attr", lettersNUM&lettersABC, lettersNUM&numbersNUM)
in
exp

For example, if the country header is in cell A1, enter this formula in D2:
= "tax rate" & CountIf( $A$2:$A2, $A2 )
Then copy cell D2 and paste it into the cells below it. That should give you something like:
country | tax rate | Income thresholds | count
UK      | 20%      | 35k               | tax rate1
UK      | 30%      | 50k               | tax rate2
.....
Now you can pivot on that extra count column with a PivotTable or PowerQuery. You can use the same formula to build the Income th1, Income th2, etc. column labels.

Here's a solution using the PQ ribbon. Note that the last step (Group By) is not dynamic: you would have to change it if you wanted, say, 4+4 columns per country.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"country", type text}, {"tax rate", type number}, {"Income thresholds", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
#"Grouped Rows" = Table.Group(#"Added Index", {"country"}, {{"Min Index", each List.Min([Index]), type number}, {"All Rows", each _, type table}}),
#"Expanded All Rows" = Table.ExpandTableColumn(#"Grouped Rows", "All Rows", {"Income thresholds", "Index", "tax rate"}, {"Income thresholds", "Index", "tax rate"}),
#"Added Custom" = Table.AddColumn(#"Expanded All Rows", "Column Index", each [Index] - [Min Index] + 1),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Min Index", "Index"}),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Columns", "Column Index", "Column Index - Copy"),
#"Added Prefix" = Table.TransformColumns(#"Duplicated Column", {{"Column Index", each "tax rate" & Text.From(_, "en-AU"), type text}}),
#"Pivoted Column" = Table.Pivot(#"Added Prefix", List.Distinct(#"Added Prefix"[#"Column Index"]), "Column Index", "tax rate", List.Max),
#"Added Prefix1" = Table.TransformColumns(#"Pivoted Column", {{"Column Index - Copy", each "Income thresholds" & Text.From(_, "en-AU"), type text}}),
#"Pivoted Column1" = Table.Pivot(#"Added Prefix1", List.Distinct(#"Added Prefix1"[#"Column Index - Copy"]), "Column Index - Copy", "Income thresholds", List.Max),
#"Grouped Rows1" = Table.Group(#"Pivoted Column1", {"country"}, {{"tax rate1", each List.Max([tax rate1]), type number}, {"tax rate2", each List.Max([tax rate2]), type number}, {"tax rate3", each List.Max([tax rate3]), type number}, {"Income th1", each List.Max([Income thresholds1]), type text}, {"Income th2", each List.Max([Income thresholds2]), type text}, {"Income th3", each List.Max([Income thresholds3]), type text}})
in
#"Grouped Rows1"
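If the hard-coded Group By step is a problem, one possible way to make the number of columns dynamic is to build one record per country and expand it. This is only a sketch under assumptions: the inline sample table and the step names (Grouped, MaxN, Names) are made up for illustration, and it has only been reasoned through against this tiny sample.

```powerquery
let
    // Hypothetical sample data standing in for Table1
    Source = #table(
        {"country", "tax rate", "Income thresholds"},
        {{"UK", 0.2, "35k"}, {"UK", 0.3, "50k"}, {"FR", 0.1, "20k"}}),
    // One record per country: fields tax rate1..n and Income th1..n, numbered in row order
    Grouped = Table.Group(Source, {"country"}, {
        {"rec", (t) => Record.FromList(
            t[tax rate] & t[Income thresholds],
            List.Transform({1..Table.RowCount(t)}, each "tax rate" & Text.From(_))
              & List.Transform({1..Table.RowCount(t)}, each "Income th" & Text.From(_)))},
        {"n", each Table.RowCount(_), Int64.Type}}),
    // The country with the most rows decides how many column pairs are needed
    MaxN = List.Max(Grouped[n]),
    Names = List.Transform({1..MaxN}, each "tax rate" & Text.From(_))
          & List.Transform({1..MaxN}, each "Income th" & Text.From(_)),
    // Countries with fewer rows get null in the columns they do not have
    Expanded = Table.ExpandRecordColumn(Table.RemoveColumns(Grouped, {"n"}), "rec", Names)
in
    Expanded
```

Because the column names are generated from the largest group, a country with four rows would produce 4+4 columns without editing the query.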

Related

Column rows to column headers in Power Query

I have a data set from which I would like to build a table with the unique IDs in column A and, from column B onward, the rows of the "Input" table above turned into column headers. In the input, column A holds the IDs and column B holds the different field names that have to become columns, each matched with its value from the remaining columns (see the screenshot).
The third column, C, is purely informational: it says what data type should be there, and it can be ignored in this case.
I thought I was going to solve it "easily" with the Pivot/Unpivot + Transpose method in Power Query, but I can't get it into one row per ID as in the "Output" table.
The dummy data is at the link below.
https://docs.google.com/spreadsheets/d/1qKeVj9nJF1usBk-OUZPJfpRqSQnOTCvr/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true
Merge the value columns into a single column before the pivot, e.g.:
let
Source = Excel.Workbook(File.Contents("C:\Users\david\Downloads\Test1.xlsx"), null, false),
Sheet1_sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
FilterNullAndWhitespace = each List.Select(_, each _ <> null and (not (_ is text) or Text.Trim(_) <> "")),
#"Added Custom" = Table.AddColumn(Sheet1_sheet, "IsEmptyRow", each try List.IsEmpty(FilterNullAndWhitespace(Record.FieldValues(_))) otherwise false),
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", -1),
#"Added Custom1" = Table.AddColumn(#"Added Index", "Section", each if [IsEmptyRow] then -1 else if try #"Added Index"[IsEmptyRow]{[Index]} otherwise true then [Index] else null),
#"Removed Blank Rows" = Table.SelectRows(#"Added Custom1", each not [IsEmptyRow]),
#"Filled Down" = Table.FillDown(#"Removed Blank Rows", {"Section"}),
#"Grouped Rows" = Table.Group(#"Filled Down", {"Section"}, {{"Rows", each _}}, GroupKind.Local),
#"Selected Group" = #"Grouped Rows"[Rows]{1},
#"Removed Columns" = Table.RemoveColumns(#"Selected Group", {"IsEmptyRow", "Index", "Section"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Columns", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ObjectID", Int64.Type}, {"Feld", type text}, {"Datentyp", type text}, {"boolValue", type text}, {"dateValue", type date}, {"intValue", Int64.Type}, {"stringValue", type text}, {"longStringValue", type text}, {"referencedObjectId", Int64.Type}}),
#"Inserted Merged Column" = Table.AddColumn(#"Changed Type", "Merged", each Text.Combine({[boolValue], Text.From([dateValue], "en-US"), Text.From([intValue], "en-US"), [stringValue], [longStringValue], Text.From([referencedObjectId], "en-US")}, ""), type text),
#"Removed Columns1" = Table.RemoveColumns(#"Inserted Merged Column",{"Datentyp", "boolValue", "dateValue", "intValue", "stringValue", "longStringValue", "referencedObjectId"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns1", List.Distinct(#"Removed Columns1"[Feld]), "Feld", "Merged")
in
#"Pivoted Column"
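The core idea (merge, then pivot) can be tried in isolation on a tiny table. This is a minimal sketch: the sample rows are invented, and only two of the value columns from the query above are included to keep it short.

```powerquery
let
    // Hypothetical sample: each row has exactly one populated value column
    Source = #table(
        {"ObjectID", "Feld", "intValue", "stringValue"},
        {{1, "Age", 42, null}, {1, "Name", null, "Ann"},
         {2, "Age", 7, null}, {2, "Name", null, "Bob"}}),
    // Text.From(null) is null and Text.Combine skips nulls,
    // so the result is whichever value column is filled, as text
    Merged = Table.AddColumn(Source, "Merged",
        each Text.Combine({Text.From([intValue]), [stringValue]}, ""), type text),
    Slim = Table.SelectColumns(Merged, {"ObjectID", "Feld", "Merged"}),
    // With only ObjectID left as a row key, each ID collapses into a single row
    Pivoted = Table.Pivot(Slim, List.Distinct(Slim[Feld]), "Feld", "Merged")
in
    Pivoted
```

The pivot only collapses to one row per ID once the sparse value columns are gone; leaving them in keeps the rows apart, which is the likely reason the plain Pivot attempt did not work.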

Countifs & Sumifs in excel power query gives wrong output for large dataset

I have a table containing shop, shelf and product, and a result table with COUNTIFS and SUMIFS formulas in it. A screenshot of the table and the result table is attached below.
countifs formula
=COUNTIFS(F6:F17,J6:J17,G6:G17,"p1")
sumifs formula
=SUMIFS(K:K,I:I,I6)
I'm trying to reproduce these formulas in Excel Power Query. Below is a screenshot of Power Query and the custom column formula for countifs.
countifs formula in powerquery
= List.Count(
Table.SelectRows(
Table3,
(Var) => (Var)[shelf]=[result_shelf]
and
(Var)[product]="p1"
)
[product]
)
The countifs formula returns the right result in powerquery for a small dataset, but on my real data of around 5k rows it doesn't work and only shows 0.
For sumifs, I have not been able to work out a powerquery formula at all.
There are probably about five ways to do this, but how about:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"shop", type text}, {"shelf", type text}, {"product", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "countifs", each if [product]="p1" then 1 else 0),
#"Grouped Rows" = Table.Group(#"Added Custom", {"shop"}, {{"sumifs", each List.Sum([countifs]), type number}, {"data", each _, type table [shop=nullable text, shelf=nullable text, product=nullable text, countifs=number]}}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"shelf", "product", "countifs"}, {"shelf", "product", "countifs"})
in #"Expanded data"
or
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"shop", type text}, {"shelf", type text}, {"product", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"shop"}, {
{"data", each Table.AddColumn(_, "countifs", each if [product]="p1" then 1 else 0), type table },
{"sumifs", each Table.RowCount(Table.SelectRows(_, each [product] = "p1")),type number }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"shelf", "product", "countifs"}, {"shelf", "product", "countifs"})
in #"Expanded data"
or
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"shop", type text}, {"shelf", type text}, {"product", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "countifs", each if [product]="p1" then 1 else 0),
#"Added Custom2" = Table.AddColumn(#"Added Custom","sumifs",(i)=>Table.RowCount(Table.SelectRows(#"Added Custom", each [shop]=i[shop] and [product]="p1")), type number)
in #"Added Custom2"
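If the counts need to land in a separate result table keyed by shelf, as in the question, another option is to group once and join, rather than filtering the whole table for every row. The following is a sketch only: both inline tables are invented, and result_shelf is taken from the question's custom column formula.

```powerquery
let
    // Hypothetical data standing in for the shop/shelf/product table
    Data = #table({"shop", "shelf", "product"},
        {{"s1", "a", "p1"}, {"s1", "a", "p2"}, {"s1", "b", "p1"}, {"s2", "a", "p1"}}),
    // COUNTIFS equivalent: number of rows per shelf where product = "p1"
    P1Only = Table.SelectRows(Data, each [product] = "p1"),
    Counts = Table.Group(P1Only, {"shelf"}, {{"countifs", each Table.RowCount(_), Int64.Type}}),
    // Hypothetical result table listing the shelves to report on
    Result = #table({"result_shelf"}, {{"a"}, {"b"}, {"c"}}),
    Joined = Table.NestedJoin(Result, {"result_shelf"}, Counts, {"shelf"}, "c", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Joined, "c", {"countifs"}),
    // Shelves without a match come back as null; show 0 as COUNTIFS would
    Final = Table.ReplaceValue(Expanded, null, 0, Replacer.ReplaceValue, {"countifs"})
in
    Final
```

With this sample, shelf a gets 2, shelf b gets 1 and shelf c gets 0. If real data still shows 0 everywhere, it may be worth checking for trailing spaces or mismatched types in the key columns, since the join compares values exactly.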

excel convert data into multi rows from single row

Below is sample insurance data that holds the family details for each ID in a single row.
ID | Enrollment date | Area | Full Name | Gender | DOB | Sum Insured | Spouse Name | Gender | DOB | Kid1_Name | Gender | DOB | Kid2_Name | Gender | DOB
29348 | 24-01-2021 | 17 | NAINAR M | Male | 17-Mar-1982 | 500000 | SUBBULAKSHMI | FEMALE | 21-Jun-1988 | GOKULSRIRAM | MALE | 31-Oct-2007 | SRIDHAR | MALE | 19-Feb-2009
23434 | 19-04-2020 | 17 | Kishore | Male | 12-Jun-1986 | 200000 | A Savitha | Female | 10-Jun-1991 | Sathvik | Male | 4-Mar-2014 | A Saketh | male | 13-Feb-2015
46565 | 01-05-2020 | 5 | Ragu | Male | 6-Aug-1996 | 300000
I'm trying to convert the data into the format below, so that each family member is shown on its own row.
I tried the PivotTable option and the Power Query option in Excel, but with no luck.
Is this possible in Excel?
Thanks
Here's one kludgy way to do it in powerquery
(a) Merge groups of columns together (b) unpivot (c) split those columns again
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Self", each "Self"),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Spouse", each "Spouse"),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Child1", each "Child1"),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "Child2", each "Child2"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Added Custom3", {{"DOB", type text}, {"Sum Insured", type text}}, "en-US"),{"Full Name", "Gender", "DOB", "Self","Sum Insured"},Combiner.CombineTextByDelimiter("::", QuoteStyle.None),"m1"),
#"Merged Columns1" = Table.CombineColumns(Table.TransformColumnTypes(#"Merged Columns", {{"DOB3", type text}}, "en-US"),{"Spouse Name", "Gender2", "DOB3", "Spouse"},Combiner.CombineTextByDelimiter("::", QuoteStyle.None),"m2"),
#"Merged Columns2" = Table.CombineColumns(Table.TransformColumnTypes(#"Merged Columns1", {{"DOB5", type text}}, "en-US"),{"Kid1_Name", "Gender4", "DOB5", "Child1"},Combiner.CombineTextByDelimiter("::", QuoteStyle.None),"m3"),
#"Merged Columns3" = Table.CombineColumns(Table.TransformColumnTypes(#"Merged Columns2", {{"DOB7", type text}}, "en-US"),{"Kid2_Name", "Gender6", "DOB7", "Child2"},Combiner.CombineTextByDelimiter("::", QuoteStyle.None),"m4"),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Merged Columns3", { "ID", "Enrollment date", "Area"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Value", Splitter.SplitTextByDelimiter("::", QuoteStyle.Csv), {"Name", "Gender", "Date of Birth", "Relation", "Insured amount"}),
#"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([Name] <> "")),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows",{{"Date of Birth", type datetime}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Changed Type",{{"Date of Birth", type date}, {"Enrollment date", type date}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Attribute"})
in #"Removed Columns"
Note that if you load multiple columns with the same header into powerquery, the duplicate titles get numbers appended to them (here Gender2, DOB3, and so on). You will probably have to update the code to match the column names your data actually gets for DOB and Gender.
Here is another power query method.
Original Data
Read the code comments and explore the Applied Steps to get a better idea of the algorithm:
- Select the ID column and Unpivot other columns
- Group by ID
- Create a custom aggregation that creates a List of records for each family
- Expand the records into a table
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Unpivot all except ID column
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID"}, "Attribute", "Value"),
//Group by ID then custom aggregation
//Column Names for final report
colNames = {"Date of Enrollment", "Area", "Relation", "Name", "Gender", "Date of Birth", "Sum Insured"},
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"ID"}, {
{"Records", (t)=>
List.Generate(
()=>[ed=t[Value]{0}, a=t[Value]{1}, r="Self", n=t[Value]{2}, g=t[Value]{3}, dob=t[Value]{4}, si=t[Value]{5}, idx=5],
each [idx] < Table.RowCount(t),
each [ed=null, a=null, r=Text.SplitAny(t[Attribute]{[idx]+1}," _"){0},
n=t[Value]{[idx]+1}, g=t[Value]{[idx]+2}, dob=t[Value]{[idx]+3}, si=null, idx=[idx]+3],
each Record.FromList(
{[ed],[a],[r],[n],[g],[dob],[si]},
colNames)
)}}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"ID"}),
#"Expanded Records" = Table.ExpandListColumn(#"Removed Columns", "Records"),
#"Expanded Records1" = Table.ExpandRecordColumn(#"Expanded Records", "Records",
colNames,colNames),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Records1",{
{"Date of Enrollment", type date}, {"Area", Int64.Type}, {"Relation", type text}, {"Name", type text},
{"Gender", type text}, {"Date of Birth", type date}, {"Sum Insured", Currency.Type}})
in
#"Changed Type1"
Results
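A different route from the two answers above, offered only as a sketch: build one small table per family member by selecting and renaming that member's columns, then append the tables. The helper name Member, the output column names and the trimmed-down sample table are all made up for illustration; the numbered headers (Gender2, DOB3) follow the renaming that powerquery applies to duplicate titles.

```powerquery
let
    // Hypothetical, shortened sample: self and spouse columns only
    Source = #table(
        {"ID", "Full Name", "Gender", "DOB", "Spouse Name", "Gender2", "DOB3"},
        {{46565, "Ragu", "Male", #date(1996, 8, 6), null, null, null},
         {23434, "Kishore", "Male", #date(1986, 6, 12), "A Savitha", "Female", #date(1991, 6, 10)}}),
    // Hypothetical helper: keep ID plus one member's three columns,
    // rename them to common names and tag the relation
    Member = (relation as text, cols as list) as table =>
        Table.AddColumn(
            Table.RenameColumns(
                Table.SelectColumns(Source, {"ID"} & cols),
                List.Zip({cols, {"Member Name", "Member Gender", "Date of Birth"}})),
            "Relation", each relation, type text),
    Combined = Table.Combine({
        Member("Self", {"Full Name", "Gender", "DOB"}),
        Member("Spouse", {"Spouse Name", "Gender2", "DOB3"})}),
    // Drop members that do not exist for a given ID
    NoBlanks = Table.SelectRows(Combined, each [Member Name] <> null),
    Sorted = Table.Sort(NoBlanks, {{"ID", Order.Ascending}})
in
    Sorted
```

Adding Kid1 and Kid2 would mean two more Member(...) calls with their column names. The trade-off is that the column lists are spelled out by hand, whereas the List.Generate version above discovers the members from the unpivoted rows.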

powerquery filter TopN values for each year

I have a data set covering each country and each year.
Question: I want to keep only the top N rows by Gross Amount for each year with PowerQuery.
With the code below I can get a result for only one year, but I need the top 10 (N) of every year in one list.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Export Country", type text}, {"Gross Export", Int64.Type}, {"Share", type number}, {"Year", Int64.Type}, {"Imp/Exp", type text}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"Year", Order.Ascending}, {"Gross Export", Order.Descending}}),
#"Kept First Rows" = Table.FirstN(#"Sorted Rows",10)
in #"Kept First Rows"
Workaround: I created a list for each year separately and then merged them, but that is long-winded. The expected result is in the sheet "Export_Top10".
Thank you for your help.
Data File
Here's another method that also involves Grouping.
But instead of sorting the entire subtable and adding an Index column, I sort a List of the Gross Exports and select only those rows where Gross Export is greater than or equal to the tenth-highest value.
Note that this method will return all rows in the event of a tie. So if two countries are tied for tenth place in a given year, you might have 11 rows returned instead of 10.
let
Source = #"Table1 (3)",
//Group by year and extract top 10
#"Grouped Rows" = Table.Group(Source, {"Year"}, {
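//{9} is the tenth-highest value (zero-based position); >= keeps ties. Assumes at least ten rows per year
{"Top Ten", (t)=> Table.SelectRows(t, each [Gross Export] >= List.Sort(t[Gross Export],Order.Descending){9})}}),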
//remove year column since we will expand it in the correct order in next step
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Year"}),
//expand the top ten table
#"Expanded Top Ten" = Table.ExpandTableColumn(#"Removed Columns", "Top Ten",
{"Export Country", "Gross Export", "Share", "Year", "Imp/Exp"}, {"Export Country", "Gross Export", "Share", "Year", "Imp/Exp"})
in
#"Expanded Top Ten"
example results showing only top 3
Below is an example of grouping on [Year] and then, within each group, sorting on [Amount] and adding an index. You then expand the data and filter the index for the top X [Amount] values you are looking for. The code picks the top 10; the image shows the top 3.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Year", Int64.Type}, {"Amount", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Year"}, {{"All", each Table.AddIndexColumn(Table.Sort(_,{{"Amount", Order.Descending}}),"Index",1,1), type table}}),
#"Expanded All" = Table.ExpandTableColumn(#"Grouped Rows", "All", {"Amount", "Index"}, {"Amount", "Index"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded All", each ([Index] < 11))
in #"Filtered Rows"
Sample before/after picking top 3:
With the help of #horseyride, I figured out the solution:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Export Country", type text}, {"Gross Export", Int64.Type}, {"Share", type number}, {"Year", Int64.Type}, {"Imp/Exp", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Year"}, {{"All", each Table.AddIndexColumn(Table.Sort(_,{{"Gross Export", Order.Descending}}),"Index",1,1), type table}}),
#"Expanded All" = Table.ExpandTableColumn(#"Grouped Rows", "All", {"Export Country", "Gross Export", "Share", "Imp/Exp", "Index"}, {"Export Country", "Gross Export", "Share", "Imp/Exp", "Index"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded All",{{"Gross Export", type number}, {"Share", Percentage.Type}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type1", each [Index] <= 10),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
in
#"Removed Columns"
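For comparison, the sort, index and filter steps inside each group can probably be replaced by Table.MaxN, which returns the N largest rows of a table for a given column. This is a minimal sketch: the sample rows are invented and N is set to 2 so the effect is visible on six rows.

```powerquery
let
    // Hypothetical sample data with the column names used in the question
    Source = #table({"Year", "Export Country", "Gross Export"},
        {{2019, "A", 50}, {2019, "B", 70}, {2019, "C", 60},
         {2020, "A", 10}, {2020, "B", 30}, {2020, "C", 20}}),
    // Number of rows to keep per year
    N = 2,
    // For each year, keep the N rows with the largest Gross Export
    Grouped = Table.Group(Source, {"Year"},
        {{"Top", each Table.MaxN(_, "Gross Export", N), type table}}),
    Expanded = Table.ExpandTableColumn(Grouped, "Top", {"Export Country", "Gross Export"})
in
    Expanded
```

Unlike the tie-preserving method above, this keeps exactly N rows per year (fewer if a year has fewer rows), so ties at the cut-off are broken arbitrarily.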

How to sum N columns in Power Query

My data gets updated every month so I'm trying to create a power query table that would show the sum of the pivoted (N) columns that I created but I can't seem to figure out how to do it in power query.
I have this code currently:
After pivoting:
- Create a list of the columns to sum
- Add an Index column to restrict to each row
- Add a column which sums the columns for just that row
- Remove the Index column
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Month Yr", Date.Type}, {"Attribute", type text}, {"Value", Currency.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "MonthYear", each Date.ToText([Month Yr],"MMMM yyyy")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Month Yr"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[MonthYear]), "MonthYear", "Value", List.Sum),
//NEW code added after your Pivoted Column line
//Get List of columns to sum
// Assumes this list is all columns **except the first** in the Pivot table
// There are other methods of generating this list if this assumption is incorrect
colToSum = List.RemoveFirstN(Table.ColumnNames(#"Pivoted Column"),1),
//Add Index Column
IDX = Table.AddIndexColumn(#"Pivoted Column","Index",0,1),
//Sum each row of "colToSum"
totals = Table.AddColumn(IDX, "Sum", each List.Sum(
Record.ToList(
Table.SelectColumns(IDX,colToSum){[Index]})
), Currency.Type),
#"Removed Columns1" = Table.RemoveColumns(totals,{"Index"})
in
#"Removed Columns1"
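The Index column is one way to reach the current row. A possible alternative, sketched here on an invented pivoted table, is to read the fields straight from the row record that Table.AddColumn passes to the function.

```powerquery
let
    // Hypothetical table standing in for the #"Pivoted Column" step
    Pivoted = #table({"Attribute", "January 2021", "February 2021"},
        {{"Rent", 100, 120}, {"Food", 40, null}}),
    // Same assumption as above: every column except the first is summed
    colToSum = List.RemoveFirstN(Table.ColumnNames(Pivoted), 1),
    // _ is the current row as a record; keep only the month fields and add them up
    totals = Table.AddColumn(Pivoted, "Sum",
        each List.Sum(Record.FieldValues(Record.SelectFields(_, colToSum))), type number)
in
    totals
```

List.Sum skips nulls, so a month with no value for an attribute does not turn the total into null. New month columns are picked up automatically because colToSum is rebuilt from the column names on every refresh.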
You can group and then merge into the table after pivoting (these steps continue from the #"Changed Type" step of your query):
#"Grouped Rows" = Table.Group(#"Changed Type", {"Atribute"}, {{"Sum", each List.Sum([Value]), type number}}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Changed Type", {{"Month Year", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Changed Type", {{"Month Year", type text}}, "en-US")[#"Month Year"]), "Month Year", "Value", List.Sum),
#"Merged Queries" = Table.NestedJoin(#"Pivoted Column",{"Atribute"}, #"Grouped Rows",{"Atribute"},"Table2",JoinKind.LeftOuter),
#"Expanded Table" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Sum"}, {"Sum"})
in #"Expanded Table"
Or you can group, append the totals to the table, and then pivot the combined set:
#"Grouped Rows" = Table.Group(#"Changed Type", {"Atribute"}, {{"Value", each List.Sum([Value]), type number}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Month Year", each "Sum"),
#"Reordered Columns" = Table.ReorderColumns(#"Added Custom",{"Month Year", "Atribute", "Value"}),
combined = #"Reordered Columns" & #"Changed Type",
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(combined, {{"Month Year", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(combined, {{"Month Year", type text}}, "en-US")[#"Month Year"]), "Month Year", "Value", List.Sum)
in #"Pivoted Column"

Resources