I have a column in Excel containing a long list similar to the following:
alfa.zulu#test.com
9v46by8
9016767312
TX961779
1DM90F4
bravo.zulu#test.com
B935536
24086942
9486388284
UAUG350583
0P47MB2
asd65f4
813asdg
357yvjy
jxvn97
iopu634
charlie.zulu#test.com
1DM90F4
0P47MB2
delta.zulu#test.com
9016767312
asd65f4
357yvjy
iopu634
echo.zulu#test.com
9v46by8
TX961779
B935536
I need to transpose the list, but every time I reach an email address, I need to drop down to the next row and start over, as in the following:
alfa.zulu#test.com 9v46by8 9016767312 TX961779 1DM90F4
bravo.zulu#test.com B935536 24086942 9486388284 UAUG350583 0P47MB2 asd65f4 813asdg 357yvjy jxvn97 iopu634
charlie.zulu#test.com 1DM90F4 0P47MB2
delta.zulu#test.com 9016767312 asd65f4 357yvjy iopu634
echo.zulu#test.com 9v46by8 TX961779 B935536
Is there any way to achieve this without using VBA?
Thanks in advance!
This can be done by combining the INDEX, AGGREGATE and SEARCH functions.
But there are some prerequisites:
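The SEARCH function looks for cells containing the # symbol, so # must appear only in the email addresses.
The # symbol must also be entered in the first blank cell after the end of the list, to mark where the last group ends.
Formula (enter it in C1, then fill right and down; it uses ROW() and COLUMN()-2 as counters):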
=IFERROR(INDEX(INDEX($A$1:$A$30,AGGREGATE(15,6,(1/ISNUMBER(SEARCH("#",$A$1:$A$30)))*ROW($A$1:$A$30),ROW())):INDEX($A$1:$A$30,AGGREGATE(15,6,(1/ISNUMBER(SEARCH("#",$A$1:$A$30)))*(ROW($A$1:$A$30)-1),ROW()+1)),COLUMN()-2),"")
If the list is very long, it may be better to follow Ron's advice.
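To see what the formula above is doing, here is a rough Python sketch of the same lookup. It is not Excel code, and the function name nth_group_item and the short sample list are made up for illustration. It finds the k-th cell containing #, takes everything up to the row before the next #, and returns the n-th item of that slice, which is also why the extra # is needed at the end of the list.

```python
def nth_group_item(cells, k, n, marker="#"):
    """Return the n-th item (1-based) of the k-th group (1-based), or "" if out of range.

    Assumes cells ends with a sentinel cell containing the marker.
    """
    # 0-based positions of every cell that contains the marker
    starts = [i for i, cell in enumerate(cells) if marker in cell]
    if k < 1 or k >= len(starts):  # the last marker is the sentinel, not a group
        return ""
    # from the k-th marker up to the row before the next marker
    group = cells[starts[k - 1]:starts[k]]
    return group[n - 1] if 1 <= n <= len(group) else ""


# hypothetical, shortened sample data with the sentinel "#" at the end
cells = ["alfa.zulu#test.com", "9v46by8", "9016767312",
         "bravo.zulu#test.com", "B935536",
         "#"]
rows = [[nth_group_item(cells, k, n) for n in range(1, 4)] for k in range(1, 4)]
```

Each formula cell corresponds to one call: k plays the role of ROW() and n the role of COLUMN()-2. Cells past the end of a group come back empty, like the IFERROR fallback.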
With Power Query:
Make the column data type = text
Test whether an entry is an email address, using the # here, though the test could be more sophisticated
Add an Index column
Add another column which contains a unique number each time there is an email in column 1
Fill down with the unique numbers so each "group" will have the same number
Group the rows on the unique numbers column
Extract the data from each row into a delimited list
Add some logic to allow for a varying number of columns; otherwise Power Query will not adapt on refresh.
Split the list of data into new columns based on the delimiter
Along the way, we delete extraneous columns
Paste the code below into the Power Query Editor
Change the table name in line 2 of the code to reflect the real table name in your worksheet.
Double click on the statements in the Applied Steps window to explore what is being done at each step
A refresh is all that should be required if your data table changes.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "isEmail", each Text.Contains([Column1],"#")),
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 0, 1, Int64.Type),
#"Added Custom1" = Table.AddColumn(#"Added Index", "Grouper", each if [isEmail] then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Grouper"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"isEmail", "Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Grouper"}, {{"Grouped", each _, type table [Column1=nullable text, Grouper=number]}}),
#"Added Custom2" = Table.AddColumn(#"Grouped Rows", "Value", each Table.Column([Grouped],"Column1")),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom2",{"Grouper", "Grouped"}),
#"Added Custom3" = Table.AddColumn(#"Removed Columns2", "numSplits", each List.Count([Value])),
//Make column splitting dynamic for each refresh, in case maximum number of columns changes
splits = List.Max(Table.Column(#"Added Custom3","numSplits")),
newColList = List.Zip({List.Repeat({"Value"},splits),List.Generate(() => 1, each _ <= splits, each _ +1)}),
#"Converted to Table" = Table.FromList(newColList, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
newColNamesTbl = Table.TransformColumns(#"Converted to Table", {"Column1", each Text.Combine(List.Transform(_, Text.From)), type text}),
newColNamesList = Table.Column(newColNamesTbl,"Column1"),
#"Extracted Values" = Table.TransformColumns(#"Added Custom3", {"Value", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Removed Columns1" = Table.RemoveColumns(#"Extracted Values",{"numSplits"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns1", "Value", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), newColNamesList)
in
#"Split Column by Delimiter"
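The grouping idea behind the query (index, grouper, fill down, group, pad to the widest row) can be sketched outside Power Query as well. The following is only an illustrative Python version under the same assumption that # marks an email; the function name transpose_by_marker is hypothetical.

```python
def transpose_by_marker(values, marker="#"):
    # Index + Grouper + Fill Down: every row gets the index of the
    # most recent row that contains the marker.
    grouper, current = [], None
    for index, value in enumerate(values):
        if marker in value:
            current = index
        grouper.append(current)

    # Group Rows: collect the values that share a grouper number, in order.
    groups = {}
    for key, value in zip(grouper, values):
        groups.setdefault(key, []).append(value)

    # Pad every group to the widest one, so the column count can vary per refresh.
    width = max((len(g) for g in groups.values()), default=0)
    return [g + [None] * (width - len(g)) for g in groups.values()]
```

For example, transpose_by_marker(["alfa.zulu#test.com", "9v46by8", "bravo.zulu#test.com"]) gives one row per email, with the shorter row padded with None.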
I need to merge these project description rows into a single row so that every record has the same number of rows and I can transpose them into proper columns with Power Query (see image). I understand how to do a transpose with Power Query when the number of rows is consistent across records, but I cannot figure out how to do it when the number of rows differs. The data comes from a badly formatted PDF that breaks the Project Description information into separate rows. That is the key problem; apart from that, the rest is cake. See the snippet for what I mean.
Each transposed record will have seven columns:
Director Analysis
Address
Project
Area
Notice Date
Project Description
Appeal
I can get everything I need including the headers. I just can't figure out how to merge the rows under Project Description so that I can proceed w/ the transpose.
This is a kludge, but it seems to work. It assumes the column we want to operate on is named a in Power Query.
It looks for the rows between the row containing "Project Description" and the row containing "Appeals must be":
Create a shifted column, so each row can see what is on the row above
Add an index
Use custom columns to determine which rows need filtering out, and which rows are the start and end rows to combine, based on the first column and the shifted first column
Merge the text together based on that info, merge that back into the original table, then remove the extra rows
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// create shifted row
shiftedList = {null} & List.RemoveLastN(Source[a],1),
custom3 = Table.ToColumns(Source) & {shiftedList},
custom4 = Table.FromColumns(custom3,Table.ColumnNames(Source) & {"Next Row Header"}),
#"Added Index" = Table.AddIndexColumn(custom4, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try if Text.Contains([Next Row Header],"Project Description" ) then [Index] else if Text.Contains([a],"Appeals must be") then [Index] else null otherwise 0),
#"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
#"Added Custom1" = Table.AddColumn(#"Filled Down", "Custom.1", each try if Text.Contains([Next Row Header],"Project Desc") then "remove" else if Text.Contains([a],"Appeals must be") then "keep" else null otherwise "keep"),
#"Filled Down1" = Table.FillDown(#"Added Custom1",{"Custom.1"}),
#"Filtered Rows1" = Table.SelectRows(#"Filled Down1", each ([Custom.1] = "remove")),
#"Grouped Rows1" = Table.Group(#"Filtered Rows1", {"Custom"}, {{"Count", each Text.Combine(List.Transform([a], Text.From), ","), type text}}),
#"Merged Queries" = Table.NestedJoin(#"Filled Down1", {"Index"}, #"Grouped Rows1", {"Custom"}, "Table2", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Count"}, {"Count"}),
#"SwapValue"= Table.ReplaceValue( #"Expanded Table2", each [Custom.1], each if [Count] = null then [Custom.1] else "keep", Replacer.ReplaceValue,{"Custom.1"}),
#"Final Swap"=Table.ReplaceValue(#"SwapValue",each [a], each if [Count]=null then [a] else [Count] , Replacer.ReplaceValue,{"a"}),
#"Filtered Rows" = Table.SelectRows(#"Final Swap", each ([Custom.1] = "keep")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Next Row Header", "Index", "Custom", "Custom.1", "Count"})
in #"Removed Columns"
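Stripped of the Power Query plumbing, the query above joins every row that sits between a "Project Description" row and an "Appeals must be" row into one row. A minimal Python sketch of that merge follows. The function name merge_between is hypothetical; the marker texts and the comma separator are taken from the query.

```python
def merge_between(rows, start="Project Description", end="Appeals must be", sep=","):
    merged, buffer, collecting = [], [], False
    for row in rows:
        if collecting and end in row:
            # reached the end marker: emit the collected fragments as one row
            if buffer:
                merged.append(sep.join(buffer))
            buffer, collecting = [], False
        if collecting:
            buffer.append(row)           # a fragment of the description
        else:
            merged.append(row)           # ordinary row, kept as-is
            collecting = start in row    # start collecting after the header row
    if buffer:                           # no end marker found: keep what was collected
        merged.append(sep.join(buffer))
    return merged
```

After this step every record has one description row, so the number of rows per record is consistent and the usual transpose applies.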
Hi all, I have a column of text like the one in the picture. How can I split this type of column into multiple columns, one for each occurrence of the " |-Starting " substring?
Create a "grouper" column to group the different sets of rows
Then group
Split each subgroup into columns
Transpose the results
For example:
let
Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
//create a "grouper" column
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "grouper", each if [Column1] = "|-Starting" then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"grouper"}),
//group the rows creating a delimiter separated string
// and a counter to obtain the number of columns for the "Split"
#"Grouped Rows" = Table.Group(#"Filled Down", {"grouper"}, {
{"group", each Text.Combine([Column1],";"),type text},
{"numInGroup", each Table.RowCount(_)}
}),
//maximum number of columns in the result
numCols=List.Max(#"Grouped Rows"[numInGroup]),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"grouper","numInGroup"}),
//split; then transpose
#"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "group",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv),numCols),
#"Transposed Table" = Table.Transpose(#"Split Column by Delimiter")
in
#"Transposed Table"
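The group, split and transpose sequence can be illustrated with a short Python sketch. This is not the query itself; the function name is made up, and like the query it assumes a marker row is exactly "|-Starting".

```python
from itertools import zip_longest


def split_at_marker_into_columns(values, marker="|-Starting"):
    groups = []
    for value in values:
        if value == marker or not groups:
            groups.append([])  # a marker row (or the very first row) opens a new group
        groups[-1].append(value)
    # Transposing turns each group into a column; shorter groups are padded with None.
    return [list(row) for row in zip_longest(*groups)]
```

Each inner list of the result is one output row, and column i holds the i-th block that started at a marker.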
Background:
I'm trying to create a query that incorporates dynamic column names, so that if a user changes the name of a column within a table, the logic still holds for any transformations.
The problem is quite straightforward but difficult to explain in just a few sentences. I have tried to break down everything I've done to easily show the problem I am having. If obvious please just skip to the problem at the end.
Demo of Dynamic Columns:
To begin making the column names dynamic, I use:
DynamicHeaderNames = Table.ColumnNames(Source)
This creates a list of the column names from the original table. If a column name is altered in the spreadsheet, this list is dynamically updated. This list can be used to refer to columns indirectly within your M code.
As an example, here I have changed Column1 in Table1 to Hello, and after refreshing the data the output table updates the column to read Hello accordingly.
This works by referring to the DynamicHeaderNames list and indexing for the desired column. The code also demonstrates simple transformations that can be achieved this way: changing the text to uppercase and reordering the columns.
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
DynamicHeaderNames =Table.ColumnNames(Source),
#"Uppercased Text" = Table.TransformColumns(#"Source",{{DynamicHeaderNames{0}, Text.Upper, type text}, {DynamicHeaderNames{1}, Text.Upper, type text}}),
#"Reordered Columns" = Table.ReorderColumns(#"Uppercased Text",{DynamicHeaderNames{1}, DynamicHeaderNames{0}})
in
#"Reordered Columns"
This is just an easy example to show how I'm trying to integrate Dynamic Columns in this way.
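For readers who find it easier to follow outside M, the same idea (address columns by position in the header list, never by a hard-coded name) looks roughly like this in Python. The table is modelled as a list of dicts and the function name is invented for this sketch.

```python
def upper_and_swap_first_two(table):
    """table is a list of dicts; column order is the dict key order."""
    if not table:
        return []
    names = list(table[0])              # plays the role of Table.ColumnNames(Source)
    first, second = names[0], names[1]  # like DynamicHeaderNames{0} and {1}
    return [
        {second: row[second].upper(), first: row[first].upper(),
         **{name: row[name] for name in names[2:]}}
        for row in table
    ]


# the first column has been renamed to "Hello"; the code does not need to change
renamed = [{"Hello": "a", "Column2": "b"}]
result = upper_and_swap_first_two(renamed)
```

Because the names are read from the data on every run, renaming a header changes the output header but not the logic.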
PROBLEM
Here is a simple version of the actual data, sorted in the output table so that the data in Column1 is prioritised over values in Column2. The numbers in Column3 just correspond to values in Column2.
Working M code to achieve this output:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//combine the columns
#"Added Custom" = Table.AddColumn(Source, "Custom", each
let
L1 = if [Column2] = "-" then {[Column1]}
else List.Combine({{[Column1]},Text.Split([Column2],"#(lf)")}),
L2 = if [Column2] = "-" then {[Column3]}
else List.Combine({{"-"},Text.Split([Column3],"#(lf)")})
in
List.Zip({L1,L2})),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1", "Column2", "Column3"}),
//split the combined columns
#"Expanded Custom" = Table.ExpandListColumn(#"Removed Columns1", "Custom"),
#"Extracted Values" = Table.TransformColumns(#"Expanded Custom",
{"Custom", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values",
"Custom", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Column1", "Column3"})
in
#"Split Column by Delimiter"
I wish to make this code dynamic as in the simplified example; however, I am running into lots of syntax issues when I add the Custom Column step.
In essence I wish to reference columns 1, 2 and 3 indirectly by indexing the DynamicHeaderNames list as before. Is this possible? Importantly, this isn't just to allow the column names to be altered by the user, but so that transformations to the data can also refer to the relevant columns dynamically. This Custom Column transformation is pretty much the only step proving difficult, because it uses [], which doesn't appear to be compatible with DynamicHeaderNames{x}.
I hope this explanation is clear enough to understand what I am trying to achieve and if anyone has any solutions to this problem it would be really appreciated.
Your posted M Code doesn't really return the results you show, but here is an example of how to use undefined column headers in your custom column.
It involves adding an Index column to sort things out.
Note that I also replaced nulls which were in my sample data, as the Text.Split function will fail with a null
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
allCols=Table.ColumnNames(Source),
col1 = allCols{0},
col2 = allCols{1},
col3 = allCols{2},
IDX = Table.AddIndexColumn(Source,"IDX",0,1),
#"Replaced Value" = Table.ReplaceValue(IDX,null,"",Replacer.ReplaceValue,allCols),
//look up each cell by column name and row index, then split on linefeeds
#"Added Custom" = Table.AddColumn(#"Replaced Value", "lists", each let
col2List = Text.Split(
Table.Column(#"Replaced Value",col2){[IDX]},
"#(lf)"),
col3List = Text.Split(
Text.From(Table.Column(#"Replaced Value",col3){[IDX]}),
"#(lf)")
in
List.Zip({col2List,col3List}))
in
#"Added Custom"
Edit
Here is an example of M-Code that is insensitive to the actual column names, and will produce the output you show from the input you show:
let
Source = Excel.CurrentWorkbook(){[Name="Table10"]}[Content],
allCols=Table.ColumnNames(Source),
col1 = allCols{0},
col2 = allCols{1},
col3 = allCols{2},
IDX = Table.AddIndexColumn(Source,"IDX",0,1),
#"Added Custom" = Table.AddColumn(IDX, "Custom", each if Table.Column(IDX,col2){[IDX]} = "-"
then
List.Zip({{Table.Column(IDX,col1){[IDX]}},{Text.From(Table.Column(IDX,col3){[IDX]})}})
else
List.InsertRange(
List.Zip({
Text.Split(Table.Column(IDX,col2){[IDX]},"#(lf)"),
Text.Split(Table.Column(IDX,col3){[IDX]},"#(lf)")}),
0,{{Table.Column(IDX,col1){[IDX]},"-"}})),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",List.Combine({{"IDX"}, allCols})),
#"Expanded Custom" = Table.ExpandListColumn(#"Removed Columns", "Custom"),
#"Extracted Values" = Table.TransformColumns(#"Expanded Custom",
{"Custom", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "Custom",
Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {col1, col3}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{col1, type text}, {col3, type text}})
in
#"Changed Type"
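As a cross-check of what the name-insensitive query is meant to produce, here is a hedged Python sketch of the same row logic. The table is a list of dicts, the three columns are picked by position, and the function name and sample headers are hypothetical.

```python
def expand_rows(table):
    if not table:
        return []
    col1, col2, col3 = list(table[0])[:3]  # column names by position, whatever they are
    out = []
    for row in table:
        if row[col2] == "-":
            # nothing to split: keep the first and third columns as one row
            out.append({col1: row[col1], col3: str(row[col3])})
        else:
            # first the parent row with "-", then one row per linefeed-separated pair
            out.append({col1: row[col1], col3: "-"})
            subs = row[col2].split("\n")
            nums = str(row[col3]).split("\n")
            out.extend({col1: s, col3: n} for s, n in zip(subs, nums))
    return out
```

Nothing in the function mentions a concrete header, which is the property the question asks for.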
Similar to a beginner's question I posted: Split values in cell into columns and rows
When trying to achieve the same effect for multiple columns, the Power Query Editor can split one column as desired, but for the other column it copies all of the values to be split into each new row (as in the image). This makes sense; however, I'm wondering if it's possible to split the data as shown in the desired outcome.
I have found a workaround by repeating the Power Query Editor exercise once for each column to be split and then moving the output columns so that they are adjacent. However, this seems like an inefficient way to achieve this. Can Power Query split both columns as desired without having to do this twice?
I would suggest first combining the columns; then doing the split.
But when you combine the columns, you need to do this on a row-by-row basis to keep things together on the same line.
A list of each cell contents can be created with the Text.Split function.
Then the two lists can be combined using the List.Zip function.
Finally, we just split them up.
I use a Custom Column to create the joined lists. You can see the formula by clicking on the Added Custom applied step.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Material", type text}, {"Sub", type text}, {"CAS", type text}}),
//combine the two columns
#"Added Custom" = Table.AddColumn(#"Changed Type", "list", each List.Zip({
Text.Split([Sub],"#(lf)"),
Text.Split([CAS],"#(lf)")
})),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Sub", "CAS"}),
//Expand the list and split into rows
#"Expanded list" = Table.ExpandListColumn(#"Removed Columns", "list"),
#"Extracted Values" = Table.TransformColumns(#"Expanded list", {"list", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "list", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"list.1", "list.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"list.1", type text}, {"list.2", type text}}),
//Rename the split columns
renamed = Table.RenameColumns(#"Changed Type1",List.Zip({Table.ColumnNames(#"Changed Type1"),Table.ColumnNames(Source)}))
in
renamed
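The split-then-zip idea is easy to try outside Power Query. Below is a small Python sketch of it, assuming every cell to be split is non-null text. The function name is invented, and the column names in the usage note are borrowed from the query above.

```python
from itertools import zip_longest


def split_cells_to_rows(rows, split_cols, sep="\n"):
    out = []
    for row in rows:
        # one list per column, built row by row so the items stay aligned
        parts = [row[col].split(sep) for col in split_cols]
        # zip the lists position by position; uneven lists are padded with None
        for values in zip_longest(*parts):
            new_row = dict(row)
            new_row.update(zip(split_cols, values))
            out.append(new_row)
    return out
```

Calling split_cells_to_rows(rows, ["Sub", "CAS"]) keeps the first Sub item next to the first CAS item, the second next to the second, and so on, instead of repeating the whole unsplit cell on every new row.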
Try the code below.
The key is in the added custom columns that split on linefeed into lists, and then combine those lists into a table that can be expanded into rows. To make null handling easier, I converted nulls to the placeholder text [null], then back to null at the end.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Sub", type text}, {"CAS", type text}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type",null,"[null]",Replacer.ReplaceValue,{"Sub", "CAS"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each Text.Split([Sub],"#(lf)")),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each Text.Split([CAS],"#(lf)")),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom.2", each Table.FromColumns({[Custom],[Custom.1]})),
#"Expanded Custom.2" = Table.ExpandTableColumn(#"Added Custom2", "Custom.2", {"Column1", "Column2"}, {"Column1", "Column2"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom.2",{"Sub", "CAS", "Custom", "Custom.1"}),
#"Replaced Value1" = Table.ReplaceValue(#"Removed Columns","[null]",null,Replacer.ReplaceValue,{"Column1", "Column2"})
in #"Replaced Value1"
Maybe this is a very simple question, but I'm trying to figure out how to do this, as I have hundreds of columns and the idea of doing it by hand, splitting them into separate queries and then append them doesn't seem to be very practical.
I've been working on a query and it returns me values in the following format:
Date | Time | Value | Time | Value...
A | B | C | D | E...
But I need to transform it to look like:
Date | Time | Value
A | B | C
A | D | E
Thanks for the help!
Using no custom code:
Load data into Power Query using Data ... From Table/Range...
Right-click the Date column, choose Unpivot Other Columns
Add column... Index column... use the default column name Index
Add column... Custom Column... with formula =Number.Mod([Index],2) and default name Custom
This converts the index column into alternating 0s and 1s
(Assuming your 2nd column is named Value.1) Add column... Custom Column... with formula =#"Added Custom"{[Index]+1}[Value.1] and default name Custom.1
That will place the value from the row below the current one into the current row
Remove the alternating rows by clicking the arrow next to the Custom column and removing the [x] next to the 1
Click-select the Attribute, Index and Custom columns, right-click Remove Columns
Close & Load
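The steps above boil down to: unpivot, number the rows, and pair every even-numbered row with the row right below it. A minimal Python sketch of that logic follows, with a made-up function name, assuming the first cell of each row is the date and the rest are alternating Time/Value cells.

```python
def unpivot_pairs(rows):
    # Unpivot: one (date, cell) entry per non-date cell, in column order.
    unpivoted = [(row[0], cell) for row in rows for cell in row[1:]]
    result = []
    for index, (date, value) in enumerate(unpivoted):
        # keep rows where index mod 2 is 0 and pull the value from the row below
        if index % 2 == 0 and index + 1 < len(unpivoted):
            result.append((date, value, unpivoted[index + 1][1]))
    return result
```

With the sample from the question, unpivot_pairs([["A", "B", "C", "D", "E"]]) gives ("A", "B", "C") and ("A", "D", "E"). The pairing only lines up if every row has an even number of non-date cells.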
Assuming your data is loaded as range Table1 you could use this code, pasted into Home...Advanced...
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Date"}, "Attribute", "Value.1"),
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each Number.Mod([Index],2)),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each #"Added Custom"{[Index]+1}[Value.1]),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Custom] = 0)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Attribute", "Index", "Custom"})
in #"Removed Columns"
If you are willing to use some custom code, this creates two tables, one table with odd columns and one with even columns, unpivots each of them, adds an index to both, then merges them back on that index. Works for any number of columns, might be faster than above for larger data sets.
Assuming your data is loaded as range Table1 you could use this code, pasted into Home...Advanced...
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
OddUnpivot= Table.AddIndexColumn(Table.UnpivotOtherColumns(Table.RemoveColumns(Source,List.RemoveFirstN(List.Alternate(Table.ColumnNames(Source),1,1,1),1)), {"Date"}, "Attribute", "Value"), "Index", 0, 1),
EvenUnpivot= Table.AddIndexColumn(Table.UnpivotOtherColumns(Table.RemoveColumns(Source,List.Alternate(Table.ColumnNames(Source),1,1)), {"Date"}, "Attribute", "Value"), "Index", 0, 1),
#"Merged Queries" = Table.NestedJoin(OddUnpivot,{"Index"},EvenUnpivot,{"Index"},"Table2",JoinKind.LeftOuter),
#"Expanded Table" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Value"}, {"Value.1"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table",{"Attribute", "Index"})
in #"Removed Columns"
LATER UPDATE:
More generically, I've decided I like this method better
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// 1 base columns, then groups of 2 columns, stack them
Combo = List.Transform(List.Split(List.Skip(Table.ColumnNames(Source),1),2), each List.FirstN(Table.ColumnNames(Source),1) & _),
#"Added Custom" =List.Accumulate(
Combo,
#table({"Column1"}, {}),
(state,current)=> state & Table.Skip(Table.DemoteHeaders(Table.SelectColumns(Source, current)),1)
)
in #"Added Custom"
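The later method keeps the base column, cuts the remaining columns into groups of two, and stacks the resulting narrow tables on top of each other. A rough Python equivalent is shown below. The function name and the base and width parameters are assumptions for illustration, and each row is a plain list.

```python
def stack_column_pairs(columns, rows, base=1, width=2):
    # positions of the non-base columns, cut into chunks of `width`
    rest = list(range(base, len(columns)))
    chunks = [rest[i:i + width] for i in range(0, len(rest), width)]
    stacked = []
    for chunk in chunks:          # one narrow table per chunk...
        for row in rows:          # ...appended below the previous one
            stacked.append(row[:base] + [row[i] for i in chunk])
    return stacked
```

Note the ordering: all rows for the first Time/Value pair come first, then all rows for the second pair, which matches appending whole tables rather than interleaving by source row.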