Power Query Transform Data with Unique Columns - excel-formula

Source Data Table.
Desired Output
We want to extract all the unique values from each Type column and pivot them into column headers.
This is somewhat similar to the question below, but here there is more than one column to search for unique values.
Power Query - Transpose unique values and get matching values in rows
The number of Type columns in the source table can increase or decrease over time.

The code below is created via standard menu options. This video takes you through the results of each step.
let
Source = SourceData,
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Key"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Type", each if Text.Start([Attribute],4) = "Type" then [Value] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"Type"}),
#"Filtered Rows" = Table.SelectRows(#"Filled Down", each not Text.StartsWith([Attribute], "Type")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Attribute"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Type]), "Type", "Value")
in
#"Pivoted Column"
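To check the logic of these steps outside Power Query, here is a minimal Python sketch. It assumes a hypothetical layout in which every "Type..." column is immediately followed by its value column. The column names ("Type 1", "Value 1") and the sample rows are invented for illustration, since the source table is only shown as an image. It also assumes each Key appears once and resets the remembered type for every row, which is a simplification of the table-wide Fill Down.

```python
def pivot_types(rows, key="Key"):
    """Mimic unpivot -> fill down -> filter -> pivot on a list of dict rows."""
    headers = []   # distinct type values that received a value, first-seen order
    partial = []
    for row in rows:
        result = {key: row[key]}
        current_type = None
        for column, value in row.items():
            if column == key or value is None:   # unpivoting drops nulls
                continue
            if column.startswith("Type"):
                current_type = value             # remembered, like Fill Down
            elif current_type is not None:
                result[current_type] = value     # becomes a pivoted cell
                if current_type not in headers:
                    headers.append(current_type)
        partial.append(result)
    # give every output row the same columns, None where a type is absent
    return [{key: r[key], **{h: r.get(h) for h in headers}} for r in partial]


# hypothetical sample data
sample = [
    {"Key": 1, "Type 1": "A", "Value 1": 10, "Type 2": "B", "Value 2": 20},
    {"Key": 2, "Type 1": "B", "Value 1": 30, "Type 2": "C", "Value 2": 40},
]
table = pivot_types(sample)
```

Because the loop never refers to a specific Type column by name, adding or removing Type/value pairs in the input needs no change, which is the same property the unpivot step gives the query.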

Related

Excel (or) Power BI, Rolling Sum

Is there any way in Excel Pivot or Power BI to compute a rolling sum of the given data (say, monthly)?
Let's say I have a list of cases, where each row holds a case count and an amount. The project start and end dates vary as follows.
For simplicity, shown graphically, the data looks as follows.
What I'm trying to do is aggregate the total case count and amount for each month.
My goal is to produce the list below directly using a Pivot (or Power Query, if a Pivot is not possible).
I could produce the monthly aggregates using the Filter function and Sum, then pivot that data to produce the above result.
If there is a direct way of producing those aggregates in one step, that would be better. Please suggest one.
Please see the sample data at the link below.
https://docs.google.com/spreadsheets/d/1vAKElb2-V_If-MMlPwHk_VGhYr8pkOg_gQfRYRrkbtc/edit?usp=share_link
Excel file in Zip
https://drive.google.com/file/d/1QqgNUrJlBuvin7iecsxsvexrGZXFIt-g/view?usp=share_link
Thank you in advance
LuZ
You can load the data into Power Query and transform the table on the left into the data table on the right.
The code for that is:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom1" = Table.AddColumn(Source, "Date", each List.Generate(()=>[x=[Start Date],i=0], each [i]<12, each [i=[i]+1,x=Date.AddMonths([x],1)], each [x])),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom1", "Date"),
#"Added Custom" = Table.AddColumn(#"Expanded Custom", "Year", each Date.Year([Date])),
#"Added Custom2" = Table.AddColumn(#"Added Custom", "Month", each Date.Month([Date])),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Start Date", "End Date", "Date"})
in #"Removed Columns"
Afterwards, load the Power Query result back into Excel as a pivot report and generate your table.
Alternatively, just use
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom1" = Table.AddColumn(Source, "Date", each List.Generate(()=>[x=[Start Date],i=0], each [i]<12, each [i=[i]+1,x=Date.AddMonths([x],1)], each [x])),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom1", "Date"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Start Date", "End Date"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Date"}, {{"Amount", each List.Sum([Amount]), type number}, {"Case Count", each List.Sum([Case Count]), type number}}),
#"Changed Type" = Table.TransformColumnTypes(#"Grouped Rows",{{"Date", type date}, {"Amount", type number}, {"Case Count", type number}})
in #"Changed Type"
to generate this table, then graph it
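The expand-then-group idea can be traced with a minimal Python sketch. It mirrors the fixed 12-month window of the `[i]<12` condition, so End Date is not used here either. The sample cases and the helper names `add_months` and `monthly_totals` are assumptions made for illustration.

```python
import calendar
from collections import defaultdict
from datetime import date


def add_months(d, n):
    """Shift a date by n months, clamping the day to the month's length."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def monthly_totals(cases, months=12):
    """cases: (start_date, amount, case_count) tuples.

    Each case is repeated on `months` consecutive monthly dates,
    then amounts and counts are summed per date.
    """
    totals = defaultdict(lambda: [0, 0])
    for start, amount, count in cases:
        d = start
        for _ in range(months):
            totals[d][0] += amount
            totals[d][1] += count
            d = add_months(d, 1)
    return dict(sorted(totals.items()))


# hypothetical sample data: two cases starting two months apart
totals = monthly_totals([
    (date(2023, 1, 1), 100, 2),
    (date(2023, 3, 1), 50, 1),
])
```

Grouping is by exact date, as in the query, so cases that start on different days of the month land in different groups unless the dates are first normalised to the start of the month.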

Pivot columns with multiple instance (rows) of attribute

I've searched far and wide and haven't found an answer to this specific case, and wasn't able to adapt some of these solutions.
First of all, my data is a long list of attributes and their values for every product, structured like this:
Structured Initial Data
Note that some products have a single value per attribute, but (and here's my problem) some products have several values for the same attribute.
When I pivot the table in Power Query, I get errors where a product has multiple instances of the same attribute.
The resulting table that I'm looking for would be structured like this:
Structured Final Data
Thank you for your help!
See if this works for you
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Sorted Rows" = Table.Sort(Source,{{"Products", Order.Ascending}, {"Attributes", Order.Ascending}}),
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"Products"}, {{"data", each _, type table}}),
#"Added Index1" = Table.AddIndexColumn(#"Grouped Rows", "Index", 0, 1),
#"Expanded data" = Table.ExpandTableColumn(#"Added Index1", "data", {"Attributes", "Values"}, {"Attributes", "Values"}),
mGroup = Table.Group(#"Expanded data" , {"Attributes","Products"}, {{"GRP", each Table.AddIndexColumn(_, "Index2", 1, 1), type table}}),
#"Expanded GRP" = Table.ExpandTableColumn(mGroup, "GRP", {"Values", "Index", "Index2"}, {"Values", "Index", "Index2"}),
#"Added Custom" = Table.AddColumn(#"Expanded GRP", "Row#", each [Index]+[Index2]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index", "Index2"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attributes]), "Attributes", "Values"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Row#"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns1",{"Products", "Each", "Pack"})
in #"Reordered Columns"
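It groups on Products and adds an index, then groups on Products and Attributes and adds a second index that counts repeats of the same attribute. The sum of the two is a row number that, together with the product, keeps repeated values on separate rows, so the pivot no longer finds duplicates.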
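The row-number trick can be illustrated with a short Python sketch. It is not a translation of the M code: it simply counts how often each (product, attribute) pair has been seen and uses that count as the extra row key. The sample products and values are invented; only the attribute names "Each" and "Pack" come from the query above.

```python
from collections import defaultdict


def pivot_with_repeats(rows):
    """rows: (product, attribute, value) tuples -> list of dict rows."""
    seen = defaultdict(int)   # occurrences so far of each (product, attribute)
    table = {}                # (product, occurrence) -> output row
    for product, attribute, value in rows:
        seen[(product, attribute)] += 1
        key = (product, seen[(product, attribute)])
        row = table.setdefault(key, {"Products": product})
        row[attribute] = value
    return [table[k] for k in sorted(table)]


# hypothetical sample data: P1 has two values for "Pack"
result = pivot_with_repeats([
    ("P1", "Each", "1"),
    ("P1", "Pack", "6"),
    ("P1", "Pack", "12"),
    ("P2", "Each", "1"),
])
```

The second "Pack" value of P1 gets occurrence 2 and therefore its own output row, instead of colliding with the first one.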

Automatically reorganizing large Excel table to separate each column into multiple columns based on grouping variables

I have an Excel table with data organized such that each row is a sample and each column has a different property of that sample. However, I need to reorganize it so that it works with GraphPad Prism.
Currently the data is organized like this:
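| Sample ID | Exposure Level | Drug | Score 1  | … | Score 22 |
|-----------|----------------|------|----------|---|----------|
| 101       | 1              | A    | 0.675815 | … | 0.17351  |
| 102       | 1              | B    | 0.276413 | … | 0.677079 |
| 103       | 2              | A    | 0.914725 | … | 0.387529 |
| 104       | 3              | A    | 0.504221 | … | 0.135295 |
| 105       | 3              | B    | 0.963684 | … | 0.710081 |
| 106       | 2              | B    | 0.964099 | … | 0.146872 |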
And I want to make a box and whisker plot showing the score of each exposure level, like this:
I need to do this including all the samples and then again for just drug A and just drug B.
However, to do that in Prism, to my knowledge, each combination of variables you want needs to have its own column, like this:
| Score 1 Exposure 1 | Score 1 Exposure 2 | Score 1 Exposure 3 | Score 1 Exposure 1 (Just Drug A) | Score 1 Exposure 2 (Just Drug A) | Score 1 Exposure 3 (Just Drug A) | etc. |
|----------|----------|----------|----------|----------|----------|---|
| 0.675815 | 0.914725 | 0.504221 | 0.675815 | 0.914725 | 0.504221 |   |
| 0.276413 | 0.964099 | 0.963684 |          |          |          |   |
This would be easy enough to do manually if there were just one score column, but there are twenty-two, so I'd rather not. Is there some automated way I can reorganize the data table like this?
To create a Box & Whiskers graph similar to what you show,
merely use the Exposure Level for the x-axis and the Score 1 column for the y-axis
To create a table similar to the results you show, you can use Power Query.
I created it as a single table, with each row representing a drug. You can then filter it by drug for your drug specific results.
The MCode is commented so by reading the comments, and also looking at the Applied Steps window, I hope I was clear in what was going on.
Most of the MCode is generated from the UI, but, especially, the colNames and ExpandTableColumns steps near the end are manually entered. Otherwise the number of columns in the expansion would not be flexible.
MCode
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Won't need ID column so get rid of it
#"Removed Columns2" = Table.RemoveColumns(Source,{"Sample ID"}),
//Unpivot the Score columns to put them in a single column
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Removed Columns2", {"Exposure Level", "Drug"}, "Attribute", "Value"),
//sort by Attribute (the score name), Exposure Level, Drug so the results will be properly ordered
#"Sorted Rows" = Table.Sort(#"Unpivoted Columns",{{"Attribute", Order.Ascending}, {"Exposure Level", Order.Ascending}, {"Drug", Order.Ascending}}),
//Create what will become a two line header column
// and remove the originals
#"Added Custom" = Table.AddColumn(#"Sorted Rows", "Headers", each "Exposure " & Text.From([Exposure Level]) & "#(lf)" & [Attribute]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Exposure Level", "Attribute"}),
//Move headers to first column
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Headers", "Drug", "Value"}),
//Group by Drug
#"Grouped Rows" = Table.Group(#"Reordered Columns", {"Drug"}, {{"Grouped", each _, type table [Headers=text, Drug=text, Value=number]}}),
//Add an Index column
#"Added Index" = Table.AddIndexColumn(#"Grouped Rows", "Index", 0, 1, Int64.Type),
/*From each grouped table, remove Drug Column
and remove Header column EXCEPT from the first table
then Transpose each grouped table*/
#"Added Custom1" = Table.AddColumn(#"Added Index", "Custom", each
Table.Transpose(
if [Index] = 0 then
Table.RemoveColumns([Grouped],"Drug")
else
Table.RemoveColumns([Grouped],{"Headers","Drug"}))),
//Remove no longer needed Grouped and Index columns
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Grouped", "Index"}),
//Expand the table columns, promote headers, and rename the drug column to get final results
colNames = Table.ColumnNames(#"Removed Columns1"[Custom]{0}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns1", "Custom", colNames),
#"Promoted Headers" = Table.PromoteHeaders(#"Expanded Custom", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"A", type text}, {"Exposure 1#(lf)Score 1", type number}, {"Exposure 2#(lf)Score 1", type number}, {"Exposure 3#(lf)Score 1", type number}, {"Exposure 1#(lf)Score 22", type number}, {"Exposure 2#(lf)Score 22", type number}, {"Exposure 3#(lf)Score 22", type number}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"A", "Drug"}})
in
#"Renamed Columns"
EDIT
@user3316549 commented below that he might have multiple entries for the same drug for the same Score/Exposure and wanted the results for each shown separately.
A Pivot table would be useful here, except a classic pivot table will only have a single entry for each intersection of Drug with Score/Exposure.
This problem is solved with a custom function for the pivot that adds an extra row when needed. The credits for that function are included and you can examine the link for a detailed explanation of the algorithm used for that part of the code.
The custom function is added as a blank query. You can name it what you choose and call it that way in your main code.
M Code
Main Query
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Unpivot the Score columns to put them in a single column
#"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {"Sample ID","Exposure Level", "Drug"}, "Attribute", "Value"),
//sort by multiple columns so the results will be properly ordered to our liking
#"Sorted Rows" = Table.Sort(#"Unpivoted Columns",{{"Attribute", Order.Ascending}, {"Exposure Level", Order.Ascending}, {"Drug", Order.Ascending},{"Sample ID", Order.Ascending}}),
//Create what will become a two line header column
// and remove the originals
#"Added Custom" = Table.AddColumn(#"Sorted Rows", "Headers", each [Attribute] & "#(lf)" & "Exposure " & Text.From([Exposure Level])),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Sample ID","Exposure Level", "Attribute"}),
//custom pivot function for non-aggregation
pivotAll = fnPivotAll(#"Removed Columns","Headers","Value")
in
pivotAll
M Code
Custom Function named fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"
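As a rough illustration of what a non-aggregating pivot does, here is a Python sketch of the same idea: collect every value for a header into a list, then lay the lists side by side and pad the short ones. It is a sketch of the technique under simplifying assumptions (non-empty input, dict rows), not a port of the function above. The name `pivot_all` and the sample rows are hypothetical.

```python
from itertools import zip_longest


def pivot_all(rows, col_to_pivot, col_for_values):
    """Pivot without aggregating: repeated values produce extra rows."""
    other_cols = [c for c in rows[0] if c not in (col_to_pivot, col_for_values)]
    headers = []   # distinct pivot headers in first-seen order
    groups = {}    # values of the remaining columns -> {header: [values]}
    for row in rows:
        header = row[col_to_pivot]
        if header not in headers:
            headers.append(header)
        key = tuple(row[c] for c in other_cols)
        groups.setdefault(key, {}).setdefault(header, []).append(row[col_for_values])
    out = []
    for key, lists in groups.items():
        columns = [lists.get(h, []) for h in headers]
        for values in zip_longest(*columns):   # pads short lists with None
            out.append({**dict(zip(other_cols, key)), **dict(zip(headers, values))})
    return out


# hypothetical sample data: drug A has two values under the same header
pivoted = pivot_all(
    [
        {"Drug": "A", "Headers": "Score 1 Exposure 1", "Value": 0.1},
        {"Drug": "A", "Headers": "Score 1 Exposure 1", "Value": 0.2},
        {"Drug": "A", "Headers": "Score 1 Exposure 2", "Value": 0.3},
        {"Drug": "B", "Headers": "Score 1 Exposure 1", "Value": 0.4},
    ],
    "Headers",
    "Value",
)
```

Drug A ends up with two rows because one header has two values, which is the behaviour a classic pivot table cannot give.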

Excel dynamically transpose every time an email address is found

I have a column in excel containing a long list similar to the following:
alfa.zulu#test.com
9v46by8
9016767312
TX961779
1DM90F4
bravo.zulu#test.com
B935536
24086942
9486388284
UAUG350583
0P47MB2
asd65f4
813asdg
357yvjy
jxvn97
iopu634
charlie.zulu#test.com
1DM90F4
0P47MB2
delta.zulu#test.com
9016767312
asd65f4
357yvjy
iopu634
echo.zulu#test.com
9v46by8
TX961779
B935536
I need to transpose the list, BUT every time I have an email address, I need to jump on down to the next row and start all over, such as the following:
alfa.zulu#test.com 9v46by8 9016767312 TX961779 1DM90F4
bravo.zulu#test.com B935536 24086942 9486388284 UAUG350583 0P47MB2 asd65f4 813asdg 357yvjy
charlie.zulu#test.com 1DM90F4 0P47MB2
delta.zulu#test.com 9016767312 asd65f4 357yvjy iopu634
echo.zulu#test.com 9v46by8 TX961779 B935536
Is there any way to achieve this without using vba?
Thanks in advance!
This can be done by combining the INDEX, AGGREGATE and SEARCH functions.
But there are some prerequisites:
The SEARCH function will search for cells with the # symbol - so it should be only in email addresses
At the end of the list, the # symbol must be entered in the first blank cell
Formula:
=IFERROR(INDEX(INDEX($A$1:$A$30,AGGREGATE(15,6,(1/ISNUMBER(SEARCH("#",$A$1:$A$30)))*ROW($A$1:$A$30),ROW())):INDEX($A$1:$A$30,AGGREGATE(15,6,(1/ISNUMBER(SEARCH("#",$A$1:$A$30)))*(ROW($A$1:$A$30)-1),ROW()+1)),COLUMN()-2),"")
If the list is very long, it may be better to follow Ron's advice.
With Power Query:
Make the column data type = text
Test if an entry is email -- using the # but could be more sophisticated
Add an Index column
Add another column which contains a unique number each time there is an email in column 1
Fill down with the unique numbers so each "group" will have the same number
Group the rows on the unique numbers column
Extract the data from each row into a delimited list
Add some logic to enable variations in the numbers of potential columns, else power query will not adapt.
Split the list of data into new columns based on the delimiter
Along the way, we delete extraneous columns
Paste the code below into the Power Query Editor
Change the Table in Line 2 to reflect the real table name in your worksheet.
Double click on the statements in the Applied Steps window to explore what is being done at each step
A refresh is all that should be required if your data table changes.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "isEmail", each Text.Contains([Column1],"#")),
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 0, 1, Int64.Type),
#"Added Custom1" = Table.AddColumn(#"Added Index", "Grouper", each if [isEmail] then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Grouper"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"isEmail", "Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Grouper"}, {{"Grouped", each _, type table [Column1=nullable text, Grouper=number]}}),
#"Added Custom2" = Table.AddColumn(#"Grouped Rows", "Value", each Table.Column([Grouped],"Column1")),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom2",{"Grouper", "Grouped"}),
#"Added Custom3" = Table.AddColumn(#"Removed Columns2", "numSplits", each List.Count([Value])),
//Make column splitting dynamic for each refresh, in case maximum number of columns changes
splits = List.Max(Table.Column(#"Added Custom3","numSplits")),
newColList = List.Zip({List.Repeat({"Value"},splits),List.Generate(() => 1, each _ <= splits, each _ +1)}),
#"Converted to Table" = Table.FromList(newColList, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
newColNamesTbl = Table.TransformColumns(#"Converted to Table", {"Column1", each Text.Combine(List.Transform(_, Text.From)), type text}),
newColNamesList = Table.Column(newColNamesTbl,"Column1"),
#"Extracted Values" = Table.TransformColumns(#"Added Custom3", {"Value", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Removed Columns1" = Table.RemoveColumns(#"Extracted Values",{"numSplits"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns1", "Value", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), newColNamesList)
in
#"Split Column by Delimiter"
Source Data
Results
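The grouping logic in the steps above (start a new group at each email, then pad every group to the widest one) can be sketched in a few lines of Python. The "#" marker follows the email test used in this thread. The function name `split_at_markers` and the shortened sample list are assumptions for illustration.

```python
def split_at_markers(items, marker="#"):
    """Start a new row whenever an item contains the marker."""
    groups = []
    for item in items:
        if marker in item or not groups:
            groups.append([item])       # an email opens a new row
        else:
            groups[-1].append(item)     # anything else joins the current row
    # pad so every row has as many columns as the longest one
    width = max((len(g) for g in groups), default=0)
    return [g + [None] * (width - len(g)) for g in groups]


# shortened sample taken from the list in the question
rows = split_at_markers([
    "alfa.zulu#test.com", "9v46by8", "9016767312",
    "charlie.zulu#test.com", "1DM90F4",
])
```

The width is computed from the data on every run, which corresponds to the dynamic column-splitting step in the query.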

Power Query transpose and pivot list

I have the following list in Excel Powerquery:
I would like to transform this list within the Power Query editor to make the following list:
Assuming a source table called Table1, you could use this (in the advanced editor):
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Buffered = Table.Buffer(Source),
#"Grouped Rows" = Table.Group(Buffered, {"value"}, {{"AllRows", each Table.AddIndexColumn(_, "Record", 1, 1), type table}}),
#"Expanded AllRows" = Table.ExpandTableColumn(#"Grouped Rows", "AllRows", {"value2", "Record"}, {"value2", "Record"}),
#"Pivoted Column" = Table.Pivot(#"Expanded AllRows", List.Distinct(#"Expanded AllRows"[value]), "value", "value2"),
#"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Record"})
in
#"Removed Columns"
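For comparison, a minimal Python sketch of the same reshaping: the values of the second column are collected per distinct value of the first, and the position inside each group plays the role of the "Record" index that the pivot uses to line up rows. The sample pairs are invented, since the original lists are only shown as images.

```python
from itertools import zip_longest


def columns_by_group(pairs):
    """pairs: (value, value2) tuples -> one output column per distinct value."""
    columns = {}
    for group, item in pairs:
        columns.setdefault(group, []).append(item)
    headers = list(columns)
    # the n-th item of every group lands on output row n; short groups get None
    return [dict(zip(headers, values)) for values in zip_longest(*columns.values())]


# hypothetical sample data
reshaped = columns_by_group([
    ("a", 1), ("b", 2), ("a", 3), ("a", 5), ("b", 4),
])
```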

Resources