I have a spreadsheet that looks like this:
I would like it to look like this at the end:
I know that I can use PowerQuery to delimit and expand into rows, but that would only work on the first column. The second time doing that would introduce lots of duplicates.
Any help?
I'd just do them separately and combine
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Removed Columns" = Table.RemoveColumns(Source,{"People"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Emails] <> null)),
SCBD = Table.ExpandListColumn(Table.TransformColumns(#"Filtered Rows", {{"Emails", Splitter.SplitTextByDelimiter(";", QuoteStyle.None)}}), "Emails"),
#"Removed Columns1" = Table.RemoveColumns(Source,{"Emails"}),
#"Filtered Rows1" = Table.SelectRows(#"Removed Columns1", each ([People] <> null)),
SCBD1 = Table.ExpandListColumn(Table.TransformColumns(#"Filtered Rows1", {{"People", Splitter.SplitTextByDelimiter(";", QuoteStyle.None)}}), "People"),
Combined = SCBD & SCBD1,
#"Sorted Rows" = Table.Sort(Combined,{{"Name", Order.Ascending}, {"Emails", Order.Descending}})
in #"Sorted Rows"
You lose original order with this approach (you could attempt to re-sort at the end if you wanted).
If at some point you get more columns that need to be similarly split, you can just add to the columnsToSplit list.
let
dataFromSheet = Table.FromColumns({{"Cereal Killers", "Acme Products", "Arkham Asylum"}, {"123 Sugar Way", "345 Whoville Place", "Gotham City"}, {"abc#def.com; xyz#abc.com; jkl#mno.com", null, "the.scarecrow#nolan.com; jokesonyou#joker.com"}, {"Tony Tiger; Toucan Sam; Lucky Leprauchan", "W. Coyote; R. Runner; R. Rabbit", null}}, type table [Name=text, Address=text, Emails=nullable text, People=nullable text]),
columnsToSplit = {"Emails","People"},
loopOverColumnsToSplit = List.Accumulate(columnsToSplit, #table({}, {}), (tableState, currentColumn) =>
let
reduceColumns = Table.SelectColumns(dataFromSheet, {"Name", "Address"} & {currentColumn}),
dropNullRows = Table.SelectRows(reduceColumns, each Record.Field(_, currentColumn) <> null),
splitIntoList = Table.TransformColumns(dropNullRows, {{currentColumn, each Text.Split(_, "; "), type list}}),
expandList = Table.ExpandListColumn(splitIntoList, currentColumn),
appendToAccumulatedTable = tableState & expandList
in appendToAccumulatedTable
)
in
loopOverColumnsToSplit
If preserving order is important, then maybe try approach below (which might take a bit longer as it has a few extra steps).
let
dataFromSheet = Table.FromColumns({{"Cereal Killers", "Acme Products", "Arkham Asylum"}, {"123 Sugar Way", "345 Whoville Place", "Gotham City"}, {"abc#def.com; xyz#abc.com; jkl#mno.com", null, "the.scarecrow#nolan.com; jokesonyou#joker.com"}, {"Tony Tiger; Toucan Sam; Lucky Leprauchan", "W. Coyote; R. Runner; R. Rabbit", null}}, type table [Name=text, Address=text, Emails=nullable text, People=nullable text]),
columnsToSplit = {"Emails","People"},
numberOfColumnsToSplit = List.Count(columnsToSplit),
loopOverColumnsToSplit = List.Accumulate(List.Positions(columnsToSplit), #table({}, {}), (tableState, currentIndex) =>
let
currentColumn = columnsToSplit{currentIndex},
reduceColumns = Table.SelectColumns(dataFromSheet, {"Name", "Address"} & {currentColumn}),
dropNullRows = Table.SelectRows(reduceColumns, each Record.Field(_, currentColumn) <> null),
addIndex = Table.AddIndexColumn(dropNullRows, "toSortBy", currentIndex, numberOfColumnsToSplit),
splitIntoList = Table.TransformColumns(addIndex, {{currentColumn, each Text.Split(_, "; "), type list}}),
expandList = Table.ExpandListColumn(splitIntoList, currentColumn),
appendToAccumulatedTable = tableState & expandList
in appendToAccumulatedTable
),
sorted = Table.Sort(loopOverColumnsToSplit, {"toSortBy", Order.Ascending}),
dropHelperColumn = Table.RemoveColumns(sorted, {"toSortBy"})
in
dropHelperColumn
Just to clarify, if you have a row where the values in Emails and People columns are both null, then that row will not be present in the output table.
Related
I am attempting to extract section heading numbers from a column in excel using power query.
I have already achieved this by matching with an existing list. However, I wonder if there is a better way to achieve this in fewer steps.
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
#"Trimmed Text1" = Table.TransformColumns(Source,{{"Column1", PowerTrim, type text}}),
SectionNumbers = Table.AddColumn(#"Trimmed Text1", "SectionNumber", (x) => Text.Combine(Table.SelectRows(SectionNumbers, each Text.Contains(x[Column1],[SectionNumbers], Comparer.OrdinalIgnoreCase))[SectionNumbers],", ")),
#"Split Column by Delimiter2" = Table.SplitColumn(SectionNumbers, "SectionNumber", Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.None, true), {"SectionNumber.1", "SectionNumber.2"}),
#"Added Custom1" = Table.AddColumn(#"Split Column by Delimiter2", "Custom", each if [SectionNumber.2] = null then [SectionNumber.1] else [SectionNumber.2]),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom1",{"Column1", "Custom"})
in
#"Removed Other Columns"
The Section numbers being matched to can be generated using:
SectionNumbers
let
Source = {1..16},
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each {1..9}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Custom.1", each "."),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Added Custom1", {{"Custom", type text}}, "en-GB"),{"Custom", "Custom.1"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Merged Columns1" = Table.CombineColumns(Table.TransformColumnTypes(#"Merged Columns", {{"Column1", type text}}, "en-GB"),{"Column1", "Merged"},Combiner.CombineTextByDelimiter(".", QuoteStyle.None),"SectionNumbers")
in
#"Merged Columns1"
Essentially I would like a way of extracting any decimal at the start of a cell, either 15.0. or 15.0, or even 15.0.1 etc.
I have considered using regex i.e. \d+\.\d+[.] which should work however I have many rows and find that regex sometimes is computationally intensive in this case, so it takes much longer to load than the Above M Code.
Another power query method
Since you know your section numbers you could:
Generate a (buffered) list of all the section numbers
see if the first space-separated part of the string in column 1 exists in the Section Number list.
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
//create list of all serial numbers
SerialNumbers = List.Buffer(
let
Part1 = List.Transform({1..16}, each Text.From(_) & "."),
Part2 =List.Transform({1..10}, each Text.From(_) & "."),
sn = List.Accumulate(Part1,{}, (state, current)=> state &
List.Generate(
()=>[s=current & Part2{0}, idx=0],
each [idx] < List.Count(Part2),
each [s=current & Part2{[idx]+1}, idx=[idx]+1],
each [s]))
in
sn),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each
let
x = Text.Split([Column1]," "){0}
in
if List.Contains(SerialNumbers,x) then x else null, type text)
in
#"Added Custom"
How about
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each if Text.PositionOfAny([Column1], {"0".."9"})>=0 then Text.BeforeDelimiter(Text.From([Column1])," ") else null)
in #"Added Custom"
Attempting to split decimal numbers in batch using a prevo=ious formula provided on here however the result is an error stating that null or "" or "x" (where is a number) cant be converted to the type list.
The formula:
=try Text.Remove([Column1],Text.ToList(Text.Remove([Column1],{"0".."9","."}))) otherwise null works when applied to a single column however when trying to create a create a table from these columns I get the followings errors:
Desired Output:
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table19"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each Table.FromColumns({
(try Text.Remove([Column1],Text.ToList(Text.Remove([Column1],{"0".."9","."}))) otherwise null),
(try Text.Remove([Column2],Text.ToList(Text.Remove([Column2],{"0".."9","."}))) otherwise null)
}))
in
#"Added Custom"
I would like to be able to generate a Table.FromColumns, for n columns which I can then expand. This is just an example and in reality, the number of columns can vary quite a lot.
Update
To better visualise what I am trying to do in power query I wish to create this scenario:
Such that this table can be expanded to:
Probably something obvious but any help appreciated.
I would just
split the columns based on character transition, including the decimal in the character list.
Then Trim the resultant columns to remove any leading/following spaces
Note: Code edited to allow for any number of columns to be split in two. Column names can be dynamic also
let
Source = Excel.CurrentWorkbook(){[Name="Table21"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
//Generate new table from all the columns
//create List of columns
colList = Table.ToColumns(#"Changed Type"),
colNames = Table.ColumnNames(#"Changed Type"),
//convert each column
splitCols = List.Generate(
()=>[colPair=
List.Transform(colList{0},(li)=>
Splitter.SplitTextByCharacterTransition(
{"0".."9","."}, (c) => not List.Contains({"0".."9","."}, c))
(li)),
cn = colNames{0},
idx=0],
each [idx] < List.Count(colList),
each [colPair=
List.Transform(colList{[idx]+1},(li)=>
Splitter.SplitTextByCharacterTransition(
{"0".."9","."}, (c) => not List.Contains({"0".."9","."}, c))
(li)),
cn=colNames{[idx]+1},
idx=[idx]+1],
each List.Zip([colPair]) & {List.Transform({1..2}, (n)=> [cn] & "." & Text.From(n))}),
newCols = List.Combine(List.Transform(splitCols, each List.RemoveLastN(_,1))),
newColNames = List.Combine(List.Transform(splitCols, each List.Last(_))),
newTable = Table.FromColumns(newCols,newColNames),
//trim the excess spaces
trimOps = List.Transform(Table.ColumnNames(newTable), each {_, Text.Trim}),
trimAll = Table.TransformColumns(newTable, trimOps)
in
trimAll
Example with three columns
Again, if you want to retain the original columns in your result table, you need to change three lines in the code:
...
newCols = Table.ToColumns(#"Changed Type") & List.Combine(List.Transform(splitCols, each List.RemoveLastN(_,1))),
newColNames = Table.ColumnNames(#"Changed Type") & List.Combine(List.Transform(splitCols, each List.Last(_))),
newTable = Table.FromColumns(newCols,newColNames),
...
Edited to be usable for multiple columns
let Source =Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Value", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Value.1", "Value.2"}),
#"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"Value.2"}),
#"rename1" = Table.TransformColumns(#"Removed Columns",{{"Attribute", each _&"a", type text}}),
#"Pivoted Column" = Table.RemoveColumns(Table.Pivot(#"rename1", List.Distinct(#"Lowercased Text"[Attribute]), "Attribute", "Value.1"),{"Index"}),
#"Removed Columns2" = Table.RemoveColumns(#"Split Column by Delimiter",{"Value.1"}),
rename = Table.TransformColumns(#"Removed Columns2",{{"Attribute", each _ & "b", type text}}),
#"Pivoted Column1" = Table.RemoveColumns(Table.Pivot(rename, List.Distinct(rename[Attribute]), "Attribute", "Value.2"),{"Index"}),
TFC = Table.FromColumns(Table.ToColumns(Source)&Table.ToColumns(#"Pivoted Column")&Table.ToColumns(#"Pivoted Column1"),Table.ColumnNames(Source)&Table.ColumnNames(#"Pivoted Column")&Table.ColumnNames(#"Pivoted Column1"))
in TFC
I would just duplicate the two original columns (Add Column > Duplicate column) and then split the resulting columns on the left most " " delimiter. No M code needed.
What I am trying to achieve is to obtain "matches/pairs" from two tables. One (source 1)is data table with Date/Time and Pressure value columns and the other (source 2) is like Date/Time and Info value Columns. Second table has so called "pairs" , start and stop in certain time. I want to get exact matches when is found in source 1 or approximate match when is not exact as in source 1 (seconds can be a problem).
Lets say you are matching/lookup two tables, give me everything that falls between for instance 15.01.2022 06:00:00 and 15.01.2022 09:15:29.
Where I have a problem is more likely exact match and seconds. It is skipping or cant find any pair if the seconds are not matching. So my question is how to make if not seconds then lookup for next availablee match, can be a minute too as long as they are in the given range (start stop instances).
That is a reason I am getting this Expression error. Or is there a way to skip that error and proceed with Query??
Link to download the data:
https://docs.google.com/spreadsheets/d/1Jv5j7htAaEFktN0ntwOZCV9jesF43tEP/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true
On the code below is what I am trying to do:
let
//Be sure to change the table names in the Source= and Source2= lines to be the actual table names from your workbook
Source = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = Excel.CurrentWorkbook(){[Name="Log_Original"]}[Content],
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "NextLineStart", each if Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) = true
and Text.Contains(#"Added Index"[Info]{[Index]+1},"start",Comparer.OrdinalIgnoreCase) = true
then "delete"
else null),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([NextLineStart] = null)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Index", "NextLineStart"}),
//create a list of all the relevant start/stop times
filterTimes = List.Combine(
List.Generate(
()=> [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{0},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{1}-#"Removed Columns1"[#"Date/Time"]{0})+1,
#duration(0,0,0,1)), IDX = 0],
each [IDX] < Table.RowCount(#"Removed Columns1"),
each [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{[IDX]+2},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{[IDX]+3}-#"Removed Columns1"[#"Date/Time"]{[IDX]+2})+1,
#duration(0,0,0,1)), IDX = [IDX]+2],
each [times]
)
),
//filter the table using the list
filterTimesCol = Table.FromList(filterTimes,Splitter.SplitByNothing()),
filteredTable = Table.Join(#"Changed Type","Date/Time",filterTimesCol,"Column1",JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(filteredTable,{"Column1"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each DateTime.ToText([#"Date/Time"],"dd-MMM-yy")),
#"Filtered Rows2" = Table.SelectRows(#"Added Custom1", each [#"Date/Time"] > #datetime(2019, 01, 01, 0, 0, 0)),
#"Sorted Rows" = Table.Sort(#"Filtered Rows2",{{"Date/Time", Order.Ascending}})
in
#"Sorted Rows"
I set up the below to return a sorted table with all results between the start and ending date/times. You can then select the first or middle or bottom row of each table if you want from this point. Its hard to tell from your question if you are looking for the value closest to the start value, closest to the end value or something inbetween. You can wrap my Table.Sort with a Table.FirstN or Table.LastN to pick up the first or last row.
I left most of your starting code alone
let Source = Table.Buffer(T1),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = T2,
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
// shift Info up one row for comparison
shiftedList = List.RemoveFirstN( #"Added Index"[Info],1),
custom1 = Table.ToColumns( #"Added Index") & {shiftedList},
custom2 = Table.FromColumns(custom1,Table.ColumnNames( #"Added Index") & {"NextInfo"}),
#"Filtered Rows2" = Table.SelectRows(custom2, each not (Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) and Text.Contains([NextInfo],"start", Comparer.OrdinalIgnoreCase))),
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Type", each if Text.Contains(Text.Lower([Info]),"start") then "start" else if Text.Contains(Text.Lower([Info]),"finished") then "finished" else null),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom3",{"Info", "NextInfo"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns2", "Custom", each if [Type]="start" then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Type]), "Type", "Date/Time"),
#"Added Custom2" = Table.AddColumn(#"Pivoted Column","Table",(i)=>Table.Sort(Table.SelectRows(T1, each [#"Date/Time"]>=i[start] and [#"Date/Time"]<=i[finished]),{{"Date/Time", Order.Ascending}}) , type table )
in #"Added Custom2"
I need to duplicate multiple columns with PQ, and yes I can do it manually having either 10 steps or 1 entangled step with 10 duplicate column commands.
while I had similar problem with adding columns I used:
#"Added Custom" =
Table.AddColumn(#"Changed Type", "record", each [
Custom = "DTO",
#"Custom.1" = Date.ToText([Accounting Date],"YYYYMMDD"),
#"Custom.2" = "SA",
#"Custom.3" = "DTO " & Date.ToText([Accounting Date], "yyyy.MM"),
#"Custom.4" = null,
#"Custom.5" = null,
#"Custom.6" = null,
#"Custom.7" = null,
#"Custom.8" = null,
#"Custom.9" = if Text.StartsWith([General Ledger Code],"5") then [Cost Center] else null,
#"Custom.10" = null,
#"Custom.11" = null,
#"Custom.12" = null,
#"Custom.13" = if Text.StartsWith([General Ledger Code], "5") then "IF" else null
]),
#"Expanded record" = Table.ExpandRecordColumn(#"Added Custom", "record", {"Custom", "Custom.1", "Custom.2", "Custom.3", "Custom.4", "Custom.5", "Custom.6", "Custom.7", "Custom.8", "Custom.9", "Custom.10", "Custom.11", "Custom.12", "Custom.13"}, {"Custom", "Custom.1", "Custom.2", "Custom.3", "Custom.4", "Custom.5", "Custom.6", "Custom.7", "Custom.8", "Custom.9", "Custom.10", "Custom.11", "Custom.12", "Custom.13"}),
and it worked for adding columns
however I failed while trying to duplicate this with:
#"Duplicated test" = Table.AddColumn(#"Expanded record", "duplicated", each [
#"Custom.14" = Table.DuplicateColumn(#"Expanded record", "Custom.1", "Custom.1 - Copy"),
#"Custom.15" = Table.DuplicateColumn(#"Expanded record", "Custom.3", "Custom.3 - Copy")
]),
#"Expanded duplicated" = Table.ExpandRecordColumn(#"Duplicated test", "duplicated", {"Custom.14", "Custom.15"}, {"Custom.14", "Custom.15"}),
After expanding columns i received my whole table instead of just 2 additional columns.
Is there a way to simplify duplicating columns like adding columns?
Yes, you can simplify duplicating the columns by just using the Table.AddColumn function as you did in the first step, no need for the Table.DuplicateColumn function:
#"Added Custom1" = Table.AddColumn(#"Expanded record", "duplicated", each [
Custom.14 = [Custom.1],
Custom.15 = [Custom.3]
]),
#"Expanded duplicated" = Table.ExpandRecordColumn(#"Added Custom1", "duplicated", {"Custom.14", "Custom.15"}, {"Custom.14", "Custom.15"})
Alternatively, you can use List.Accumulate to make it like you are passing a simple list of column names to the native Table.DuplicateColumn function like this:
table_source = Table.FromRecords({
[CustomerID = 1, Name = "Bob", Phone = "123-4567"],
[CustomerID = 2, Name = "Jim", Phone = "987-6543"],
[CustomerID = 3, Name = "Paul", Phone = "543-7890"],
[CustomerID = 4, Name = "Ringo", Phone = "232-1550"]
}),
table_duplicated = List.Accumulate({"Name", "Phone"}, table_source, (state as table, current as text) =>
Table.DuplicateColumn(state, current, current & " - Copy")
)
Here's what the output looks like:
I need to fit all the values of a column in Power Query into a 1-cell string separated by commas, as the example below:
To do this, I have the following piece of code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Transposed Table" = Table.Transpose(Source),
#"Merged Columns" = Table.CombineColumns(#"Transposed Table",{"Column1", "Column2", "Column3"},Combiner.CombineTextByDelimiter(",", QuoteStyle.None),"Merged"),
#"KeepString" = #"Merged Columns"[Merged]{0}
in
#"KeepString"
The problem with this code is that it assumes there will always be 3 columns, which is not always the case. How can I merge all columns (regardless of how many there are) into one?
You can do this with List.Accumulate:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
KeepString = List.Accumulate(Source[User], "", (state, current) => if state = "" then current else state & "," & current)
in
KeepString
You can also use Table.ColumnNames to get the list of all the column names. You can pass this into Table.CombineColumns, so your modified solution would be:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Transposed Table" = Table.Transpose(Source),
#"Merged Columns" = Table.CombineColumns(#"Transposed Table", Table.ColumnNames(#"Transposed Table"),Combiner.CombineTextByDelimiter(",", QuoteStyle.None),"Merged"),
#"KeepString" = #"Merged Columns"[Merged]{0}
in
#"KeepString"
You can also use a shorter code, like this:
let
Source=Excel.CurrentWorkbook( {[Name="Table1"]}[Content],
Result = Text.Combine(Source[User], ",")
in
Result