I need to stack multiple columns into one using an if statement, as in the example below.
The original table looks like the following:
Type    | ID | Name | State | X  | Y
Pay     | 01 | Joe  | NY    | -5 | 0
Pay     | 02 | Ann  | FL    | -2 | -4
Receive | 03 | Lee  | TX    | 1  | 0
Pay     | 04 | Ken  | CA    | 0  | -1
Receive | 05 | John | NY    | 3  | 2
I would like to have the columns Type, ID, X and Y to be copied from sheet1 to sheet2 using the following conditions:
if Type = "Pay" and X <> 0 then copy columns "Type", "ID" and X * (-1)
if Type = "Pay" and Y <> 0 then copy columns "Type", "ID" and Y * (-1)
if Type = "Receive" and X <> 0 then copy columns "Type", "ID" and X
if Type = "Receive" and Y <> 0 then copy columns "Type", "ID" and Y
I would like the final result to look like the following:
Type    | ID | # |
Pay     | 01 | 5 | X
Pay     | 02 | 2 | X
Receive | 03 | 1 | X
Receive | 05 | 3 | X
Pay     | 02 | 4 | Y
Pay     | 04 | 1 | Y
Receive | 05 | 2 | Y
Please help me
Thanks Phil
As @ScottCraner implied, you can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel.
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
Note that your "conditions", when applied to the unpivoted table, are the same as:
Filter out the zero values
Multiply the "Pay" values by -1
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table33"]}[Content],
//delete unneeded columns
#"Removed Columns" = Table.RemoveColumns(Source,{"Name", "State"}),
//set data type
typeIt = Table.TransformColumnTypes(#"Removed Columns",{
{"Type", Text.Type},
{"ID", Text.Type},
{"X", Int64.Type},
{"Y", Int64.Type}
}),
//unPivot, then remove the rows with zeros
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(typeIt, {"Type", "ID"}, "Attribute", "Value"),
#"Filtered Rows" = Table.SelectRows(#"Unpivoted Other Columns", each ([Value] <> 0)),
//add column where Pay amount are multiplied by -1
//remove unneeded Value column
//Sort and reorder the columns
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "#", each if [Type]="Pay" then [Value] * -1 else [Value]),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Value"}),
#"Sorted Rows" = Table.Sort(#"Removed Columns1",{{"Attribute", Order.Ascending}, {"ID", Order.Ascending}}),
#"Reordered Columns" = Table.ReorderColumns(#"Sorted Rows",{"Type", "ID", "#", "Attribute"})
in
#"Reordered Columns"
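For readers who want to check the logic outside Power Query, here is a minimal plain-Python sketch of the same unpivot, filter, and sign-flip steps. It is not M code; the function name `stack_columns` is made up for illustration, and the rows are copied from the sample table in the question.

```python
# Rows from the question's sample table: (Type, ID, X, Y).
rows = [
    ("Pay", "01", -5, 0),
    ("Pay", "02", -2, -4),
    ("Receive", "03", 1, 0),
    ("Pay", "04", 0, -1),
    ("Receive", "05", 3, 2),
]


def stack_columns(rows):
    """Unpivot X and Y, drop zeros, and flip the sign of 'Pay' amounts."""
    out = []
    for type_, id_, x, y in rows:
        # Unpivot: one candidate output row per value column.
        for attribute, value in (("X", x), ("Y", y)):
            if value == 0:
                continue  # same as filtering out the zero values
            amount = -value if type_ == "Pay" else value
            out.append((type_, id_, amount, attribute))
    # Sort by source column first, then by ID, as in the M query.
    out.sort(key=lambda r: (r[3], r[1]))
    return out


result = stack_columns(rows)
for row in result:
    print(row)
```

The four conditions in the question collapse to two rules here because, once X and Y are unpivoted into a single value column, the test and the multiplication no longer depend on which column the value came from.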
Related
I have two Excel Tables with a 'lookup' column to merge against. I want to merge to a new table with all the lookup values expanded. If I were doing this in python or some such, the pseudo-code would be something like:
for unique day in Tbl1
    row1 = day
    row2 = ""
    for event in Tbl1 day
        v = event's lookup value in Tbl2
        row1 += event + (len(v) - 1) blank columns
        row2 += v
    print(row1)
    print(row2)
I'd like to avoid VBA and use the new dynamic array functions (preferred) or Power Query (if necessary), but I can't figure out how to get the repeat to happen. The Power Query merges I've tried aren't complete.
The original data (where I've used abbreviations for my real data), has a number of events per day. The 'lookup' column shows the different levels of that event for that day.
Tbl1
day | event | lookup
1   | Re    | eoni2
1   | Gr    | eoni1
1   | We    | eoni1
2   | Tn    | eoneonii2
2   | Ga    | eon1
2   | Gr    | eoni1
Tbl2
lookup    | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8
eeononii  | E  | E  | O  | N  | O  | N  | I  | I
eon1      | E  | O  | N  |    |    |    |    |
eoneonii2 | E  | O  | N  | E  | O  | N  | I  | I
eoni1     | E  | O  | N  | I  |    |    |    |
eoni2     | E  | E  | O  | O  | N  | N  | I  | I
Tbl1
Data will change: number of events per day, event value, what lookup value might be for an event.
The 'event' may or may not repeat from one day to the next, but will be unique within a day.
Order (top to bottom) should be maintained in resulting merge (left to right).
Max number of days = 3.
Tbl2
generally static and top to bottom order can be changed if needed.
may contain entries that are not used by Tbl1.
min of 3 and max of 8 values per row.
Tbl3 output
ideally, the 'event' name would not repeat, as shown below, but can if it keeps formula cleaner.
the number of columns for each day in output Tbl3 may not be the same, as shown, e.g. day 1 rows have 16 and day 2 rows have 15 here.
The output I want:
Tbl3
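day | e1 | e2 | e3 | e4 | e5 | e6 | e7 | e8 | e9 | e10 | e11 | e12 | e13 | e14 | e15 | e16
1   | Re |    |    |    |    |    |    |    | Gr |     |     |     | We  |     |     |
    | E  | E  | O  | O  | N  | N  | I  | I  | E  | O   | N   | I   | E   | O   | N   | I
2   | Tn |    |    |    |    |    |    |    | Ga |     |     | Gr  |     |     |     |
    | E  | O  | N  | E  | O  | N  | I  | I  | E  | O   | N   | E   | O   | N   | I   |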
Thanks much.
This can be accomplished using Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac).
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range or from within sheet
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
//Read in both tables
//Edit Source and Source1 lines to reflect your actual table names
Source = Excel.CurrentWorkbook(){[Name="Tbl_1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"day", Int64.Type}, {"event", type text}, {"lookup", type text}}),
Source1 = Excel.CurrentWorkbook(){[Name="Tbl_2"]}[Content],
#"Changed Type1" = Table.TransformColumnTypes(Source1,
List.Transform(Table.ColumnNames(Source1), each {_, type text})),
//Join the two tables based on the lookup column
//then remove that column
#"Join Tables" = Table.NestedJoin(#"Changed Type","lookup", #"Changed Type1","lookup", "joined"),
#"Removed Columns" = Table.RemoveColumns(#"Join Tables",{"lookup"}),
//Add index column to maintain original Event order
#"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1, Int64.Type),
//Expand the joined table and remove the Index column
#"Expanded joined" = Table.ExpandTableColumn(#"Added Index", "joined", {"c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"}, {"c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"}),
#"Removed Columns2" = Table.RemoveColumns(#"Expanded joined",{"Index"}),
//Unpivot all the "value" columns
//Then remove the "Attribute" column (the previous column Headers)
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Removed Columns2", {"day", "event"}, "Attribute", "Value"),
#"Removed Columns1" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
//Group by "day"
#"Grouped Rows" = Table.Group(#"Removed Columns1", {"day"}, {
{"event & value", (t)=> let
remDay = Table.RemoveColumns(t,"day"),
//replace all except first event of a type with null
nullEvents = List.Accumulate(t[event],{}, (state,current)=>
if state = {} then {current}
else if List.Contains(state,current) then state & {null}
else state & {current}),
//then create new table and Transpose to get final format
newTable = Table.Transpose(Table.FromColumns(
{nullEvents, t[Value]}
))
in
newTable}
}),
//Calculate number of columns for creating column names
numCols = List.Max(List.Transform(#"Grouped Rows"[#"event & value"], each Table.ColumnCount(_))),
//expand the grouped columns and set the appropriate names
#"Expanded event & value" = Table.ExpandTableColumn(#"Grouped Rows", "event & value",
List.Transform(List.Numbers(1,numCols), each "Column" & Text.From(_)),
List.Transform(List.Numbers(1,numCols), each "e" & Text.From(_))),
//Replace alternate "day" with null
replaceWithNulls = Table.FromColumns(
{List.Accumulate(#"Expanded event & value"[day], {}, (state,current)=>
if Number.IsOdd(List.Count(state))
then state & {null} else state & {current})} &
Table.ToColumns(Table.RemoveColumns(#"Expanded event & value","day")),
Table.ColumnNames(#"Expanded event & value")
),
//set the data types
typeit = Table.TransformColumnTypes(replaceWithNulls,
{{"day", Int64.Type}} & List.Transform(List.RemoveFirstN(Table.ColumnNames(replaceWithNulls),1), each {_, type text}))
in
typeit
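As a cross-check on the expected shape of Tbl3, the pseudocode from the question can be made runnable in plain Python. This is only a sketch of the target layout, not a translation of the M query; the function name `expand` and the use of `None` for blank cells are assumptions for illustration, and the data is copied from Tbl1 and Tbl2 above.

```python
# Tbl1 rows: (day, event, lookup), in their original top-to-bottom order.
tbl1 = [
    (1, "Re", "eoni2"),
    (1, "Gr", "eoni1"),
    (1, "We", "eoni1"),
    (2, "Tn", "eoneonii2"),
    (2, "Ga", "eon1"),
    (2, "Gr", "eoni1"),
]

# Tbl2: lookup key -> its non-blank values c1..c8.
tbl2 = {
    "eeononii": list("EEONONII"),
    "eon1": list("EON"),
    "eoneonii2": list("EONEONII"),
    "eoni1": list("EONI"),
    "eoni2": list("EEOONNII"),
}


def expand(tbl1, tbl2):
    """Return two output rows per day: event names, then looked-up values."""
    by_day = {}
    for day, event, lookup in tbl1:
        by_day.setdefault(day, []).append((event, lookup))

    rows = []
    for day, events in by_day.items():
        names = [day]    # first row starts with the day number
        values = [None]  # second row leaves the day cell blank
        for event, lookup in events:
            v = tbl2[lookup]
            # The event name is followed by blanks so that it spans
            # the same number of columns as its looked-up values.
            names += [event] + [None] * (len(v) - 1)
            values += v
        rows.append(names)
        rows.append(values)
    return rows


tbl3 = expand(tbl1, tbl2)
for row in tbl3:
    print(row)
```

Counting the leading day cell, the day 1 rows come out 17 cells wide and the day 2 rows 16 cells wide, which corresponds to the 16 and 15 value columns mentioned in the question.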
Original data:
I want to transform them like this:
I tried to pivot it in Power Query. But the order is not correct. The column with empty value would fill up:
Since your Measurement IDs are numeric and sequential within each series:
Add a 1-based index column.
Then add a custom column:
Formula = [Index]-[Measurement ID]
The result stays constant within a series; whenever the ID sequence is broken (a new series starts), the formula returns a different result.
If the Measurement IDs in your actual data do not fit that pattern, it should be relatively easy to create an equivalent index that does match that pattern, and then use the same algorithm.
Now, when you pivot, you will get your desired outcome.
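To see why Index minus Measurement ID works as a grouping key, here is a small plain-Python sketch. The sample measurements and the function name `pivot_by_run` are hypothetical, since the original data was posted as an image; the sketch assumes IDs that count up from 1 within each series.

```python
# Hypothetical data: (Measurement ID, Measurement Result), three series.
measurements = [
    (1, "a"), (2, "b"), (3, "c"),
    (1, "d"), (2, "e"), (3, "f"), (4, "g"),
    (1, "h"), (2, "i"), (3, "j"),
]


def pivot_by_run(measurements):
    """Group rows by (1-based index - ID), then pivot IDs into columns."""
    ids = sorted({mid for mid, _ in measurements})
    groups = {}
    for index, (mid, result) in enumerate(measurements, start=1):
        # Constant while the IDs increase by 1; changes when a series restarts.
        key = index - mid
        groups.setdefault(key, {})[mid] = result
    # One output row per series; missing IDs become None.
    return ids, [[g.get(mid) for mid in ids] for g in groups.values()]


columns, table = pivot_by_run(measurements)
print(columns)
for row in table:
    print(row)
```

Here the keys come out as 0, 3 and 7 for the three series, so the pivot keeps each series on its own row and leaves ID 4 empty where it was not measured.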
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Measurement ID", Int64.Type}, {"Measurement Result", type number}}),
#"Added Index" = Table.AddIndexColumn(
#"Changed Type", "Index", 1, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom",
each [Index]-[Measurement ID]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {
{"Measurement ID", type text}}, "en-US"),
List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {
{"Measurement ID", type text}}, "en-US")[#"Measurement ID"]), "Measurement ID", "Measurement Result"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Custom"})
in
#"Removed Columns1"
If your Measurement ID column does not follow that pattern:
I assume that each series starts with the first ID in the column.
To create our custom series we can then use (after inserting the Index column)
a formula that returns the Index number if the value in the ID column is the same as the first ID, and null otherwise.
Then 'Fill Down'.
#"Added Custom" = Table.AddColumn(#"Added Index", "sequence",
each if [Measurement ID] = #"Added Index"[Measurement ID]{0} then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"sequence"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
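The same idea in plain Python, as a rough sketch: a new series starts every time the first ID of the column reappears, which is what the null-then-fill-down column achieves. The function name `pivot_by_first_id` and the sample IDs are hypothetical.

```python
def pivot_by_first_id(measurements):
    """Start a new output row whenever the column's first ID reappears."""
    first_id = measurements[0][0]
    ids = []   # column order: IDs in order of first appearance
    rows = []  # one dict per series
    for mid, result in measurements:
        if mid not in ids:
            ids.append(mid)
        if mid == first_id:
            rows.append({})  # equivalent to a new 'sequence' value
        rows[-1][mid] = result
    return ids, [[r.get(i) for i in ids] for r in rows]


# Hypothetical, non-sequential IDs; the second series has an extra ID 20.
sample = [(10, "a"), (30, "b"), (10, "c"), (20, "d"), (30, "e")]
columns, table = pivot_by_first_id(sample)
print(columns)
print(table)
```

This relies on the stated assumption that every series begins with the same ID; if a series can skip its first measurement, the rows would be merged into the previous series.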
It looks like you expect Power Query to implicitly know that Measurement ID 4 belongs to a 2nd set of data?
It won't do that for you unless you specify whether each measurement belongs to a 1st, 2nd or 3rd set.
You could:
Write the set IDs in manually to a new column
Calculate them programmatically, e.g. with a new column whose value increments by 1 whenever the current measurement ID is less than the previous measurement ID
Go back to the source data and check if you can have Measurement ID 4 = null in the 1st and 3rd sets.
For instance, with the third option your table would perhaps resemble:
Set | ID | Result
1   | 1  | a
1   | 2  | b
1   | 3  | c
1   | 4  | null
2   | 1  | d
2   | 2  | e
2   | 3  | f
2   | 4  | g
3   | 1  | h
3   | 2  | i
3   | 3  | j
3   | 4  | null
There isn't enough information about your data, therefore the details & the correct solution need to be left to you.
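The second option (a set ID that increments whenever the measurement ID drops) could look like the following plain-Python sketch. The function name `assign_sets` is made up, and the ID sequence is an assumption based on the description, with ID 4 present only in the second set.

```python
def assign_sets(ids):
    """Give each measurement a set number, bumping it when the ID decreases."""
    sets = []
    current = 1
    previous = None
    for mid in ids:
        if previous is not None and mid < previous:
            current += 1  # the ID went back down, so a new set has started
        sets.append(current)
        previous = mid
    return sets


ids = [1, 2, 3, 1, 2, 3, 4, 1, 2, 3]
print(assign_sets(ids))
```

With a set column in place, pivoting on the measurement ID no longer collapses the gaps, because each set keeps its own row.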
Using Excel 365 Power Query.
I have two datasources table1 and table2 with the following entries:
table1:
ID | salary
===========
1 | 10
2 | 1000
table2:
ID | inclminval | exclmaxval | class | display
20 | 0 | 100 | P1 | Poor man
30 | 100 | 9999 | P9 | Wealthy
I would like to add a computed column to table1: for every table1.salary,
find the row of table2 whose range satisfies table2.inclminval <= table1.salary < table2.exclmaxval and
use that row's table2.class as the value of the new column in table1.
On table2: Add Column ... Custom Column ... with column name Custom and formula =1
On table1: Add Column ... Custom Column ... with column name Custom and formula =1
Home ... Merge Queries ...
Choose and match the Custom column in both tables using a Full Outer join
Use the arrows atop the new column to [x] expand the inclminval, exclmaxval, class and display columns
Add Column ... Custom Column with a formula similar to
= if [salary]>=[inclminval] and [salary]<[exclmaxval] then "keep" else "remove"
Use the arrow atop the new column to filter for [x] keep
Remove extra columns
Sample code for table1
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//constant key, so the join pairs every table1 row with every table2 row
#"Added Custom" = Table.AddColumn(Source, "Custom", each 1),
#"Merged Queries" = Table.NestedJoin(#"Added Custom",{"Custom"},Table2,{"Custom"},"Table2",JoinKind.FullOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"inclminval", "exclmaxval", "class", "display"}, {"inclminval", "exclmaxval", "class", "display"}),
//keep only the pairs where salary falls inside the range
#"Added Custom1" = Table.AddColumn(#"Expanded Table2", "Custom.1", each if [salary]>=[inclminval] and [salary]<[exclmaxval] then "keep" else "remove"),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Custom.1] = "keep")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom", "inclminval", "exclmaxval", "display", "Custom.1"})
in
#"Removed Columns"
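The cross-join-then-filter approach can be checked with a short plain-Python sketch. The data is taken from the question; the function name `add_class` is hypothetical, and the sketch assumes the ranges in table2 do not overlap.

```python
table1 = [
    {"ID": 1, "salary": 10},
    {"ID": 2, "salary": 1000},
]
table2 = [
    {"ID": 20, "inclminval": 0, "exclmaxval": 100, "class": "P1", "display": "Poor man"},
    {"ID": 30, "inclminval": 100, "exclmaxval": 9999, "class": "P9", "display": "Wealthy"},
]


def add_class(table1, table2):
    """Pair every salary row with every band and keep the matching pairs."""
    out = []
    for person in table1:
        for band in table2:
            # Inclusive lower bound, exclusive upper bound.
            if band["inclminval"] <= person["salary"] < band["exclmaxval"]:
                out.append({**person, "class": band["class"]})
    return out


result = add_class(table1, table2)
print(result)
```

As with the filter step in the query, a salary that falls outside every range simply disappears from the result, so it may be worth checking for unmatched rows if the bands do not cover all values.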
I have produced a table like the one below by using 'group' function in excel power query
score 1 score 2 score 3
A 6 25 50
B 8 30 20
C 15 15 30
D 20 0 10
I want to add a totals row (equivalent to "show totals for column" in a normal pivot table), so result would be like this
score 1 score 2 score 3
A 6 25 50
B 8 30 20
C 15 15 30
D 20 0 10
Total 49 70 110
Does anyone know if there is a simple way to do this? Thank you, RY
Another way:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
group = Table.Group(Source, {}, {{"letter", each "Total"},
{"score 1", each List.Sum([score 1])},
{"score 2", each List.Sum([score 2])},
{"score 3", each List.Sum([score 3])}}),
append = Table.Combine({Source, group})
in
append
Or:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
cols = Table.ColumnNames(Source),
group = Table.Group(Source, {}, List.Zip({cols, {each "Total"}&
List.Transform(List.Skip(cols),
(x)=>each List.Sum(Table.Column(_,x)))})),
append = Table.Combine({Source, group})
in
append
Or:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
group = Table.Group(Source, {}, List.TransformMany(Table.ColumnNames(Source),
(x)=>{each if x = "letter" then "Total"
else List.Sum(Table.Column(_,x))}, (x,y)=>{x,y})),
append = Table.Combine({Source, group})
in
append
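All three variants do the same thing: aggregate the whole table into one row and append it. A plain-Python sketch of that idea follows, using the sample scores from the question; the function name `with_totals` is an illustration only, and the label column is assumed to be the first column, as in the M code where it is called "letter".

```python
header = ["letter", "score 1", "score 2", "score 3"]
rows = [
    ["A", 6, 25, 50],
    ["B", 8, 30, 20],
    ["C", 15, 15, 30],
    ["D", 20, 0, 10],
]


def with_totals(rows):
    """Append one row holding 'Total' plus the sum of each numeric column."""
    columns = list(zip(*rows))  # transpose rows into columns
    totals = ["Total"] + [sum(col) for col in columns[1:]]
    return rows + [totals]


table = with_totals(rows)
for row in table:
    print(row)
```

Summing by position rather than by name is what the second and third M variants do, which avoids hard-coding the score column names.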
I don't know why you would do this in Power Query rather than a PivotTable, but the only way I can think of is to duplicate the table, unpivot the columns, and then re-pivot using sum as the aggregation. Then you can append the resulting table to your original query.
For the table you want to create the totals in, your code would look something like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {}, "Attribute", "Value"),
#"Pivoted Column" = Table.Pivot(#"Unpivoted Columns", List.Distinct(#"Unpivoted Columns"[Attribute]), "Attribute", "Value", List.Sum)
in
#"Pivoted Column"
Note that you may have to choose to aggregate by count at first, but you can then change the code to sum (on the #"Pivoted Column" line, change List.Count to List.Sum).
You will get an error for the A, B, C, D column; you can replace it with Total, if you like, by using the Replace Errors function.
If you want to add a total row to a Power Query result that was loaded to a table, just skip a row and add a total row. This seems to work for me. Just make sure you lock the first cell in your sum formulas, e.g. =SUM($C$2:C25)
I have a data file with around 400 columns in it. I need to import this data into PowerPivot. In order to reduce my file size, I would like to use Power Query to create 2 different row totals, and then delete all my unneeded columns upon load.
While my first row total column (RowTotal1) would sum all 400 columns, I would also like a second row total (RowTotal2) that subtracts from RowTotal1 any column whose name contains the text "click".
Secondly, I would like to use the value in my Country column as a variable, to also subtract any column whose name matches that value, e.g.
Site | Country | Col1 | Col2 | ClickCol1 | Col3 | Germany | RowTotal1 | RowTotal2
1a   | USA     | 2    | 4    | 8         | 16   | 24      | 54        | 46
2a   | Germany | 2    | 4    | 8         | 16   | 24      | 54        | 22
RowTotal1 = 2 + 4 + 8 + 16 + 24
RowTotal2 (first row) = 54 - 8 (ClickCol1)
RowTotal2 (second row) = 54 - 24 (Germany) - 8 (ClickCol1)
Is this possible? (EDIT: Yes. See answer below)
REVISED QUESTION: Is there a more memory efficient way to do than trying to group 300+ million rows at once?
Code would look something like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Site", type text}, {"Country", type text}, {"Col1", Int64.Type}, {"Col2", Int64.Type}, {"ClickCol1", Int64.Type}, {"Col3", Int64.Type}, {"Germany", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Country", "Site"}, "Attribute", "Value"),
#"Added Conditional Column" = Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Country] = [Attribute] or [Attribute] = "ClickCol1" then 0 else [Value] ),
#"Grouped Rows" = Table.Group(#"Added Conditional Column", {"Site", "Country"}, {{"RowTotal1", each List.Sum([Value]), type number},{"RowTotal2", each List.Sum([Value2]), type number}})
in
#"Grouped Rows"
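The arithmetic behind the two totals can be verified with a plain-Python sketch over the two sample rows. The function name `row_totals` is hypothetical. Note one deliberate difference from the M code: the sketch excludes any column whose name contains "click" (case-insensitive), as the question asks, whereas the query tests for the single name "ClickCol1".

```python
rows = [
    {"Site": "1a", "Country": "USA", "Col1": 2, "Col2": 4,
     "ClickCol1": 8, "Col3": 16, "Germany": 24},
    {"Site": "2a", "Country": "Germany", "Col1": 2, "Col2": 4,
     "ClickCol1": 8, "Col3": 16, "Germany": 24},
]


def row_totals(row):
    """Return (RowTotal1, RowTotal2) for one row."""
    # Everything except the two key columns counts as a value column.
    values = {k: v for k, v in row.items() if k not in ("Site", "Country")}
    total1 = sum(values.values())
    # Skip 'click' columns and the column named after the row's country.
    total2 = sum(
        v for k, v in values.items()
        if "click" not in k.lower() and k != row["Country"]
    )
    return total1, total2


totals = [row_totals(r) for r in rows]
print(totals)
```

Working row by row like this avoids unpivoting the whole table at once, which may matter for the memory concern raised in the revised question, though that is not something this sketch measures.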
But since you have a lot of columns, I should explain the steps:
(Assuming you have the data in an Excel file) Import it into Power Query
Select "Site" and "Country" columns (with Ctrl), right click > Unpivot Other Columns
Add Column with this formula (you might need to use Advanced Editor): Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Country] = [Attribute] or [Attribute] = "ClickCol1" then 0 else [Value])
Select Site and Country columns, Right Click > Group By
Make it look like this: