I have this table in SQL, which will constantly gain new StoreIDs and ProductIDs when I refresh my data table.
StoreID ProductID Quantity
1 A 4
1 B 5
1 B 9
2 B 3
2 C 4
.
.
7 F 2
I want to group by StoreID and transpose the ProductIDs with the sums of Quantity, like the table below.
Each of my stores can have between 0 and 30 products. So if I have 30 products, I would like to have 30 "ProductID" columns and 30 "SumQuantity" columns.
StoreID ProductID1 ProductID2 ... SumQuantity1 SumQuantity2 ...
1 A B 4 14
2 B C 3 4
.
.
7 F 2
.
.
How can I do this in Power Query in Excel?
It's hard to do this without custom coding.
Basically:
Group on StoreID and ProductID, then combine (sum) the Quantity
Group again on StoreID, and then combine the rows for ProductID and Quantity onto a single row with a semicolon delimiter
Split out the columns on the delimiter. This requires 4 lines of custom code, since we don't know how many columns we need in advance.
Convert the numerical columns from text back to numbers (text was the format needed for the steps above)
Sample:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXIEYhOlWB0IzwmITVF4lnCeMxAbGsG5LiCuMZhrBFWL4DnDTTUHstyAGKgxFgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [StoreID = _t, ProductID = _t, Quantity = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Quantity", type number}}),
// group and sum quantities
#"Grouped Rows" = Table.Group(#"Changed Type", {"StoreID", "ProductID"}, {{"Quantity", each List.Sum([Quantity]), type number}}),
// group second time and combine ProductID and Quantity from different rows onto a single row with ; as delimiter
#"Grouped Rows1" = Table.Group(#"Grouped Rows", {"StoreID"}, {{"ProductID", each Text.Combine(List.Transform([ProductID], each Text.From(_)), ";"), type text},{"Quantity", each Text.Combine(List.Transform([Quantity], each Text.From(_)), ";"), type text}}),
//Dynamically split the two delimited columns into any number of columns
DynamicColumnList = List.Transform({1 ..List.Max(Table.AddColumn(#"Grouped Rows1","Custom", each List.Count(Text.PositionOfAny([ProductID],{";"},Occurrence.All)))[Custom])+1}, each "ProductID." & Text.From(_)),
DynamicColumnList2 = List.Transform({1 ..List.Max(Table.AddColumn(#"Grouped Rows1","Custom", each List.Count(Text.PositionOfAny([Quantity],{";"},Occurrence.All)))[Custom])+1}, each "Quantity." & Text.From(_)),
#"Split Column by Delimiter" = Table.SplitColumn( #"Grouped Rows1", "ProductID", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), DynamicColumnList),
#"Split Column by Delimiter2" = Table.SplitColumn( #"Split Column by Delimiter" , "Quantity", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), DynamicColumnList2),
// convert numerical to numbers
#"ConvertToNumbers" = Table.TransformColumnTypes (#"Split Column by Delimiter2", List.Transform ( List.Difference(Table.ColumnNames(#"Split Column by Delimiter2"),Table.ColumnNames(#"Split Column by Delimiter")),each {_,type number}))
in #"ConvertToNumbers"
I'm not sure why you'd want data in such a terrible format, but here's how I'd approach it.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXIEYhOlWB0IzwmITVF4lnCeMxAbGsG5LiCuMZhrBFWL4DnDTTUHstyAGKgxFgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [StoreID = _t, ProductID = _t, Quantity = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"StoreID", Int64.Type}, {"ProductID", type text}, {"Quantity", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"StoreID", "ProductID"}, {{"SumQuantity", each List.Sum([Quantity]), type nullable number}}),
#"Custom Grouping" = Table.Group(#"Grouped Rows", {"StoreID"},
{
{"Products", each Table.FromRows({[ProductID]},List.Transform({1..List.Count([ProductID])}, each "ProductID" & Number.ToText(_))), type table},
{"Quantities", each Table.FromRows({[SumQuantity]},List.Transform({1..List.Count([SumQuantity])}, each "SumQuantity" & Number.ToText(_))), type table},
{"Count", Table.RowCount, Int64.Type}
}
),
ColumnsToExpand = List.Max(#"Custom Grouping"[Count]),
ProductColumns = List.Transform({1..ColumnsToExpand}, each "ProductID" & Number.ToText(_)),
QuantityColumns = List.Transform({1..ColumnsToExpand}, each "SumQuantity" & Number.ToText(_)),
#"Expanded Products" = Table.ExpandTableColumn(#"Custom Grouping", "Products", ProductColumns, ProductColumns),
#"Expanded Quantities" = Table.ExpandTableColumn(#"Expanded Products", "Quantities", QuantityColumns, QuantityColumns)
in
#"Expanded Quantities"
The first grouping on StoreID and ProductID is the same as #horseyride suggests, but I go a different way from there.
The next step is to group by StoreID only and construct the desired table for each store. The result of this step looks something like this in the editor (the preview at the bottom shows what the selected cell contains):
Let's look at how this table is constructed for "Products".
Table.FromRows(
{[ProductID]},
List.Transform(
{1..List.Count([ProductID])},
each "ProductID" & Number.ToText(_)
)
)
This list [ProductID] is the list of IDs associated with the current store. For the cell selected, it's simply {"A","B"}. Since there are two values, the List.Count is 2. So the above simplifies to this:
Table.FromRows(
{{"A", "B"}},
List.Transform(
{1,2},
each "ProductID" & Number.ToText(_)
)
)
After the list transform this is simply
Table.FromRows({{"A", "B"}}, {"ProductID1", "ProductID2"})
which looks like the preview in the screenshot above.
The construction for the quantities data is analogous. After that column is defined, all that is left is to expand both these columns:
Edit:
As #horseyride points out, the expansion needs to be made dynamic, so I've added a column to the custom grouping to determine the number of columns to expand into. This number is the maximum product count for any one store, which is then used to generate a list of column names.
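For example, here is a quick sketch you can evaluate in a blank query (the column count of 3 is just an illustrative value; in the query above it comes from List.Max(#"Custom Grouping"[Count])):
let
    ColumnsToExpand = 3,
    ProductColumns = List.Transform({1..ColumnsToExpand}, each "ProductID" & Number.ToText(_))
in
    ProductColumns // returns {"ProductID1", "ProductID2", "ProductID3"}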
I have been learning Power Query and at the same time carrying out a project using it.
I am stuck on adding column values. Some of the column values contain text. I intend to sum, in each record of my table, all values with an integer type. However, there is a challenge.
When I add up the column values of integer type, I get a wrong answer. Secondly, these column headers are dynamic.
How do I effectively sum dynamic column headers in Power Query?
Example of my challenge: when I sum the columns of integer type like this, [MEC101]+[THER305], I get null values on some records and I don't know why.
When I wrapped the sum in the List.Sum function, it partially worked, but whenever one of the column headers is missing, it gives a wrong answer. I want a situation where, when a column header is missing, it ignores the missing column headers and sums the values from the available columns.
Thank you.
ID    MEC101  MEC-GRADE  THER305  THER305-GRADE  TOTAL
1002  70      A          40       D
1003  50      C          60       B
1004  60      B          30       F
EXPECTED RESULTS 1:
ID    MEC101  MEC-GRADE  THER305  THER305-GRADE  TOTAL
1002  70      A          40       D              110
1003  50      C          60       B              110
1004  60      B          30       F              90
EXPECTED RESULTS 2:
ID    MEC101  MEC-GRADE  TOTAL
1002  70      A          70
1003  50      C          50
1004  60      B          60
Try this, which will sum the numeric columns, excluding the first column:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
totals = Table.AddColumn(Source, "Sum", each List.Sum(List.Transform(List.RemoveNulls(List.RemoveFirstN(Record.FieldValues(_),1)), each if Value.Is(_,type number) then _ else 0)))
in totals
EDIT: try this, which will sum the numeric columns, excluding the first and second columns:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
totals = Table.AddColumn(Source, "Sum", each List.Sum(List.Transform(List.RemoveNulls(List.RemoveFirstN(Record.FieldValues(_),2)), each if Value.Is(_,type number) then _ else 0)))
in totals
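As an aside on why the plain [MEC101]+[THER305] formula can return null on some records (my assumption about the cause, based on the behaviour described): the + operator propagates null, whereas List.Sum skips nulls. You can see the difference in a blank query:
let
    plainAdd = 1 + null,            // null, because + propagates null
    listSum = List.Sum({1, null})   // 1, because List.Sum ignores nulls
in
    [plainAdd = plainAdd, listSum = listSum]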
or, using unpivot, grouping and merging:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Custom", each if Value.Is([Value],type number) then [Value] else null),
#"Grouped Rows" = Table.Group(#"Added Custom", {"ID"}, {{"Total", each List.Sum([Custom]), type nullable number}}),
#"Merged Queries" = Table.NestedJoin(Source, {"ID"}, #"Grouped Rows", {"ID"}, "Table1", JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Total"}, {"Total"})
in #"Expanded Table1"
Try this.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQwMFJQ0lEyNwCRjiDCBMx0UVCK1QHLG4O4pmBBZxBhBmY6weVNUAR1lIzBTDegfCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, MEC101 = _t, #"MEC-GRADE" = _t, THER305 = _t, #"THER305-GRADE" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"MEC101", Int64.Type}, {"MEC-GRADE", type text}, {"THER305", Int64.Type}, {"THER305-GRADE", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each let
listA = Record.ToList(Record.RemoveFields(_,"ID")),
result = List.Accumulate(listA, 0,(state, current) => if Value.Is(current, Number.Type) then state + current else state)
in result)
in
#"Added Custom"
Original data:
I want to transform it like this:
I tried to pivot it in Power Query, but the order is not correct. The columns with empty values get filled up:
Since your Measurement IDs are numeric and sequential within each series:
Add a 1-based index column.
Then add a custom column
Formula = [Index]-[Measurement ID]
If the ID sequence is broken, the formula will return a different result.
If the Measurement IDs in your actual data do not fit that pattern, it should be relatively easy to create an equivalent index that does, and then use the same algorithm.
Now, when you Pivot, you will get your desired outcome.
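To see why this works, suppose (just as an illustration; adjust to your data) that the Measurement IDs run 1, 2, 3 and then 1, 2, 3, 4 for a second series. With a 1-based index, the custom column is constant within each series and different between them:
Index  Measurement ID  Custom ([Index]-[Measurement ID])
1      1               0
2      2               0
3      3               0
4      1               3
5      2               3
6      3               3
7      4               3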
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Measurement ID", Int64.Type}, {"Measurement Result", type number}}),
#"Added Index" = Table.AddIndexColumn(
#"Changed Type", "Index", 1, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom",
each [Index]-[Measurement ID]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {
{"Measurement ID", type text}}, "en-US"),
List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {
{"Measurement ID", type text}}, "en-US")[#"Measurement ID"]), "Measurement ID", "Measurement Result"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Custom"})
in
#"Removed Columns1"
If your Measurement ID column is not in the designated pattern
I make the assumption that each Series starts with the first ID in the column.
To create our custom series column, we can then use (after inserting the Index column) a formula that returns the Index number if the value in the ID column is the same as the first value, otherwise returns null.
Then 'Fill Down'
#"Added Custom" = Table.AddColumn(#"Added Index", "sequence",
each if [Measurement ID] = #"Added Index"[Measurement ID]{0} then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"sequence"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
It looks like you expect Power Query to implicitly know that Measurement ID 4 belongs to a 2nd set of data?
It won't do that for you unless you specify whether each measurement belongs to a 1st, 2nd or 3rd set.
You could:
Write the set IDs manually into a new column
Calculate them programmatically, e.g. a new column whose value increments by 1 whenever the current Measurement ID is less than the previous one (see the sketch after this list)
Go back to the source data and check if you can have Measurement ID 4 = null in the 1st and 3rd sets.
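Here is a minimal sketch of the second option, assuming the source table is called Source, the ID column is named "Measurement ID", and the table/query names are hypothetical (adjust them to your data):
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // buffer the ID column so each row can be compared with the previous one
    IDs = List.Buffer(Source[Measurement ID]),
    // the set number increments whenever the current ID is less than the previous ID
    SetNumbers = List.Generate(
        () => [i = 0, set = 1],
        each [i] < List.Count(IDs),
        (prev) => [i = prev[i] + 1,
                   set = if prev[i] + 1 < List.Count(IDs) and IDs{prev[i] + 1} < IDs{prev[i]}
                         then prev[set] + 1
                         else prev[set]],
        each [set]),
    // append the generated list as a new "Set" column
    AddedSet = Table.FromColumns(Table.ToColumns(Source) & {SetNumbers}, Table.ColumnNames(Source) & {"Set"})
in
    AddedSet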
For instance, with the third option your table would perhaps resemble:
Set  ID  Result
1    1   a
1    2   b
1    3   c
1    4   null
2    1   d
2    2   e
2    3   f
2    4   g
3    1   h
3    2   i
3    3   j
3    4   null
There isn't enough information about your data, so the details and the correct solution need to be left to you.
Using Power Query we can split a column from non-digit to digit, which works great when you have a value such as "Lead 10" to split into "Lead" and "10". However, is there any way to split in the same way when the number is a decimal, e.g. "Lead 20.5"? Using split non-digit to digit splits this as "Lead 20." and "5".
I have the following example data I wish to split as follows:
Lead 20.5 --> `Lead` `20.5`
No Data --> `null`
Arsenic 10 --> `Arsenic` `10`
Gold 50.55 --> `Gold` `50.55`
1,4-Dioxane 21 --> `1,4-Dioxane` `21`
Previously I used split by right-most "", however this splits "No Data" into separate words.
Any ideas on how to achieve this would be great.
Update 1: Issue 1,4-Dioxane
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Split Column by Character Transition" = Table.SplitColumn(#"Changed Type", "Column1",
Splitter.SplitTextByCharacterTransition((c) => not List.Contains({"0".."9","."}, c), {"0".."9","."}), {"Column1.1", "Column1.2"})
in
#"Split Column by Character Transition"
How to do this depends on your data.
Edit
to account for the additional data sample with digits in the chemical name
Algorithm
Test the last word
if last word is NOT a number, then replace spaces with NBSP
Then split on the rightmost space.
I will use a custom function to check the last word and modify the string if the last word is not a number
Custom Function M Code:
enter as a blank query and rename it: fnConvString
Edited to improve computation
//Rename this query "fnConvString"
(string as text) =>
let
lastWord = Text.AfterDelimiter(string," ",{0,RelativePosition.FromEnd}),
lastIsNumber = try Value.Type(Number.FromText(lastWord)) = type number otherwise false,
replSpace = if lastIsNumber = false then Text.Replace(string," ",Character.FromNumber(160)) else string
in
replSpace
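As a quick check of the function's behaviour: fnConvString("Lead 20.5") is returned unchanged because its last word is numeric, while fnConvString("No Data") comes back with its space replaced by a non-breaking space, so the rightmost-space split in the next step keeps it together in a single column.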
Main M Code
Edited to simplify code with no added columns
let
Source = Excel.CurrentWorkbook(){[Name="Table29"]}[Content],
addNBSP = Table.TransformColumns(Source,{"Column1", each fnConvString(_)}),
#"Split Column by Delimiter" = Table.SplitColumn(addNBSP, "Column1",
Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
#"Changed Type" = Table.TransformColumnTypes(
#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type number}})
in
#"Changed Type"
Edit without custom function
If you would prefer to not use a custom function, you can incorporate that within the main code as a Transform Operation:
M Code without custom function
let
Source = Excel.CurrentWorkbook(){[Name="Table29"]}[Content],
addNBSP = Table.TransformColumns(Source,{"Column1", each
let
lastWord = Text.AfterDelimiter(_," ",{0,RelativePosition.FromEnd}),
lastIsNumber = try Value.Type(Number.FromText(lastWord)) = type number otherwise false,
replSpace = if lastIsNumber = false then Text.Replace(_," ",Character.FromNumber(160)) else _
in
replSpace
}),
#"Split Column by Delimiter" = Table.SplitColumn(addNBSP, "Column1",
Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
#"Changed Type" = Table.TransformColumnTypes(
#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type number}})
in
#"Changed Type"
In Power Query, based on the sample data, it looks like you could just split on the space character.
Right-click the column .. Split Column .. By Delimiter ... delimiter: space ... Split at: left-most delimiter
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Column1.1", "Column1.2"})
in #"Split Column by Delimiter"
If the data doesn't like that method, you could parse the numerical part from the alpha part.
Add a custom column with the formula
= Text.Remove([Column1],{"0".."9","."})
to get the text-only portion, and add a second custom column with the formula
=try Text.Remove([Column1],Text.ToList(Text.Remove([Column1],{"0".."9","."}))) otherwise null
to get the numerical portion
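For example, with "Gold 50.55" the first formula strips the digits and the decimal point to leave "Gold " (with a trailing space you can trim afterwards), and the second formula removes those remaining characters from the original string, leaving "50.55".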
Sample full code
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Text", each Text.Remove([Column1],{"0".."9","."})),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Numeric", each try Text.Remove([Column1],Text.ToList(Text.Remove([Column1],{"0".."9","."}))) otherwise null)
in #"Added Custom1"
I have a data set of basic housing data in the following format:
Existing data format:
That format is the same and repeats for hundreds of properties. I would like to transform it into a table format like the following example:
Property Type  Price   Location  Region      Additional info  Area
House          252000  London    Kensington                   4500 square meters
...
etc
In other words, I want to make the text before the ":" symbol the column name, with the text after it becoming the data that goes into the corresponding cell, and to repeat that for hundreds of sites. Usually Additional info is missing (no data), but sometimes it is there.
I am not sure which program is best for doing this. So far Excel comes to mind, but if there is an easier way I will be glad to use it.
As per my screenshot below (Excel 365), I have used the following formulas.
C2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,1,4)),": ","</s><s>")&"</s></t>","//s[last()]")
D2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,2,4)),": ","</s><s>")&"</s></t>","//s[last()]")
E2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[2]")
F2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[last()-1]")
H2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,4,4)),": ","</s><s>")&"</s></t>","//s[last()]")
If you are not in Excel 365, then you can try:
=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3),": ","</s><s>")&"</s></t>","//s[last()]")
Basically, =ROW(A1)+(ROW(A1)-1)*3 will generate a sequence of row numbers, and INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3) will return the value from column A at that position in the sequence. Then FILTERXML() will return the expected value specified in the xPath parameter.
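For example, as the formula is filled down it returns 1+(0)*3 = 1, then 2+(1)*3 = 5, then 3+(2)*3 = 9, and so on, i.e. every fourth row of column A, which matches the first line of each property block.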
To learn how FILTERXML() works, you can read this article from JvdV. It is a fantastic article for FILTERXML() lovers.
You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
Note: The fnPivotAll function is a custom function that enables a method of creating a non-aggregated pivot table where there are multiple values per pivot column. From the UI, you add this as a new query from Blank, and just paste that M code in place of what's there.
M-Code (for main query)
let
//Read in data
//Change table name in next line to your actual table name
Source = Excel.CurrentWorkbook(){[Name="Table1_2"]}[Content],
//Split by comma into new rows
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1",
Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),
let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
//Remove the blank rows
#"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([Column1] <> "" and [Column1] <> " ")),
//Split by the rightmost colon only into new columns
#"Split Column by Delimiter1" = Table.SplitColumn(#"Filtered Rows", "Column1",
Splitter.SplitTextByEachDelimiter({":"}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
//Split by the remaining colon into new rows
// So as to have empty rows under "Additional data"
//Then Trim the columns to remove leading/trailing spaces
#"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter1", {{"Column1.1", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1.1"),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Column1.1", type text}, {"Column1.2", type text}}),
#"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
//Create new column processing "Additional Data" to show a blank
// and Price to just show the numeric value, splitting from "EUR"
#"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each if [Column1.1] = "Additional data" then " "
else if [Column1.1] = "Price" then Text.Split([Column1.2]," "){1} else [Column1.2]),
//Remove unneeded column
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1.2"}),
//non-aggregated pivot
pivot = fnPivotAll(#"Removed Columns","Column1.1","Custom"),
//set data types (frequently a good idea in PQ)
#"Changed Type1" = Table.TransformColumnTypes(pivot,{
{"Property type", type text},
{"Location", type text},
{"region", type text},
{"Additional data", type text},
{"Area", type text},
{"Price", Currency.Type}})
in
#"Changed Type1"
M-Code (for custom function)
be sure to rename this query: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"
I have a data file with around 400 columns in it. I need to import this data into Power Pivot. In order to reduce my file size, I would like to use Power Query to create two different row totals, and then delete all my unneeded columns upon load.
While my first row total column (RowTotal1) would sum all 400 columns, I would also like a second row total (RowTotal2) that subtracts from RowTotal1 any column whose name contains the text "click".
Secondly, I would like to use the value in my Country column as a variable, to also subtract any column whose name matches that value, e.g.
Site  Country  Col1  Col2  ClickCol1  Col3  Germany  RowTotal1  RowTotal2
1a    USA      2     4     8          16    24       54         46
2a    Germany  2     4     8          16    24       54         22
RowTotal1 = 2 + 4 + 8 + 16 + 24
RowTotal2 (first row) = 54 - 8 (ClickCol1)
RowTotal2 (second row) = 54 - 24 (Germany) - 8 (ClickCol1)
Is this possible? (EDIT: Yes. See answer below)
REVISED QUESTION: Is there a more memory-efficient way to do this than trying to group 300+ million rows at once?
Code would look something like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Site", type text}, {"Country", type text}, {"Col1", Int64.Type}, {"Col2", Int64.Type}, {"ClickCol1", Int64.Type}, {"Col3", Int64.Type}, {"Germany", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Country", "Site"}, "Attribute", "Value"),
#"Added Conditional Column" = Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Country] = [Attribute] or [Attribute] = "ClickCol1" then 0 else [Value] ),
#"Grouped Rows" = Table.Group(#"Added Conditional Column", {"Site", "Country"}, {{"RowTotal1", each List.Sum([Value]), type number},{"RowTotal2", each List.Sum([Value2]), type number}})
in
#"Grouped Rows"
But since you have a lot of columns, I should explain the steps:
(Assuming you have these in an Excel file) Import them to Power Query
Select "Site" and "Country" columns (with Ctrl), right click > Unpivot Other Columns
Add Column with this formula (you might need to use Advanced Editor): Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Country] = [Attribute] or [Attribute] = "ClickCol1" then 0 else [Value])
Select Site and Country columns, Right Click > Group By
Make it look like this: group by Site and Country, with two aggregations, RowTotal1 as the sum of Value and RowTotal2 as the sum of Value2 (matching the Table.Group step in the code above).