Power Query - Variable Column and Header location from 30+ workbooks - excel

I am trying to combine many workbooks, each with multiple sheets. The issue is that sheet 1 has a large information header above the data I need to extract. There are also many merged cells, which return a large number of nulls and push data into different columns depending on the date and version of the source workbook.
Currently, sorting and promoting headers lets me match up the first two columns of information, but subsequent values are pushed right into other fields.
Is there a way to delete the nulls and shift the data left to match the fields? Or, better yet, to identify dynamic header changes and return data matching the selected headers?
Below is an outline of the issue. Unfortunately, cleaning the data by hand across this number of sheets and workbooks isn't really an option. I'm fairly new to Power Query and can't seem to figure this one out.
c1 c2 c3 c4 c5 c6 c7
A B Null C D Null E
a b c D Null E Null
A B C Null D G E
Need A-B-C-D-E only.
= () => let
Source = Folder.Files("C:\Users\XXXXXXXX\Desktop\Log"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from Log", each #"Transform File from Log"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from Log"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from Log", Table.ColumnNames(#"Transform File from Log"(#"Sample File"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Name", type text}, {"Data", type any}, {"Item", type text}, {"Kind", type text}, {"Hidden", type logical}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Data", "Name", "Source.Name"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ([Name] = "page 1" or [Name] = "page 2" or [Name] = "page 2 +" or [Name] = "page 3 +" or [Name] = "page 4 +" or [Name] = "page 5 +" or [Name] = "page 6 +" or [Name] = "page 7 +" or [Name] = "page 8 +")),
#"Reordered Columns" = Table.ReorderColumns(#"Filtered Rows",{"Source.Name", "Name", "Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Reordered Columns", "Data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30", "Column31", "Column32", "Column33", "Column34", "Column35", "Column36", "Column37", "Column38", "Column39", "Column40", "Column41", "Column42", "Column43", "Column44", "Column45", "Column46", "Column47", "Column48", "Column49", "Column50", "Column51", "Column52", "Column53", "Column54", "Column55", "Column56", "Column57", "Column58", "Column59", "Column60", "Column61", "Column62", "Column63", "Column64", "Column65", "Column66", "Column67", "Column68", "Column69", "Column70", "Column71", "Column72", "Column73", "Column74", "Column75", "Column76", "Column77", "Column78", "Column79", "Column80", "Column81", "Column82", "Column83", "Column84"}, {"Data.Column1", "Data.Column2", "Data.Column3", "Data.Column4", "Data.Column5", "Data.Column6", "Data.Column7", "Data.Column8", "Data.Column9", "Data.Column10", "Data.Column11", "Data.Column12", "Data.Column13", "Data.Column14", "Data.Column15", "Data.Column16", "Data.Column17", "Data.Column18", "Data.Column19", "Data.Column20", "Data.Column21", "Data.Column22", "Data.Column23", "Data.Column24", "Data.Column25", "Data.Column26", "Data.Column27", "Data.Column28", "Data.Column29", "Data.Column30", "Data.Column31", "Data.Column32", "Data.Column33", "Data.Column34", "Data.Column35", "Data.Column36", "Data.Column37", "Data.Column38", "Data.Column39", "Data.Column40", "Data.Column41", "Data.Column42", "Data.Column43", "Data.Column44", "Data.Column45", "Data.Column46", "Data.Column47", "Data.Column48", "Data.Column49", "Data.Column50", "Data.Column51", "Data.Column52", "Data.Column53", "Data.Column54", 
"Data.Column55", "Data.Column56", "Data.Column57", "Data.Column58", "Data.Column59", "Data.Column60", "Data.Column61", "Data.Column62", "Data.Column63", "Data.Column64", "Data.Column65", "Data.Column66", "Data.Column67", "Data.Column68", "Data.Column69", "Data.Column70", "Data.Column71", "Data.Column72", "Data.Column73", "Data.Column74", "Data.Column75", "Data.Column76", "Data.Column77", "Data.Column78", "Data.Column79", "Data.Column80", "Data.Column81", "Data.Column82", "Data.Column83", "Data.Column84"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Data", each ([Data.Column2] <> null and [Data.Column2] <> 16 and [Data.Column2] <> "16" and [Data.Column2] <> "LOCATION")),
#"Promoted Headers" = Table.PromoteHeaders(#"Filtered Rows1", [PromoteAllScalars=true])
in
#"Promoted Headers"

To get rid of the nulls and slide everything to the left:
Add column .. Index column
Right-click the Index column and choose Unpivot Other Columns
Right-click the Attribute column and remove it
Group on Index, then add another index within each group by editing the generated code so the step ends with
each Table.AddIndexColumn(_, "Index2", 1, 1), type table}})
Expand the column using the arrows at the top, ticking the [x] Value and [x] Index2 fields
Select the Index2 column and choose Transform .. Pivot Column, with Value as the values column and, under Advanced, Don't Aggregate
Sample code for transforming the BEFORE table above into the AFTER table:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Index"}, {{"GRP", each Table.AddIndexColumn(_, "Index2", 1, 1), type table}}),
#"Expanded GRP" = Table.ExpandTableColumn(#"Grouped Rows", "GRP", {"Value", "Index2"}, {"Value", "Index2"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Expanded GRP", {{"Index2", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Expanded GRP", {{"Index2", type text}}, "en-US")[Index2]), "Index2", "Value"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Index"})
in #"Removed Columns1"
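As a rough alternative to the unpivot/pivot route above, the same "drop nulls and slide left" idea can be sketched row by row. This is only a sketch: the sample table is made up to mirror the outline in the question, it assumes the blanks are real nulls rather than the text "Null", and the output column names are placeholders.

```powerquery
let
    // Hypothetical sample mirroring the outline in the question
    Source = Table.FromRows(
        {
            {"A", "B", null, "C", "D", null, "E"},
            {"a", "b", "c", "D", null, "E", null},
            {"A", "B", "C", null, "D", "G", "E"}
        },
        {"c1", "c2", "c3", "c4", "c5", "c6", "c7"}
    ),
    // Remove the nulls from every row so the remaining values slide left
    Compacted = List.Transform(Table.ToRows(Source), List.RemoveNulls),
    // The longest compacted row decides how many columns are needed
    Width = List.Max(List.Transform(Compacted, List.Count)),
    // Pad shorter rows with trailing nulls so all rows have equal length
    Padded = List.Transform(Compacted, each _ & List.Repeat({null}, Width - List.Count(_))),
    // Placeholder column names Column1..ColumnN
    Result = Table.FromRows(Padded, List.Transform({1 .. Width}, each "Column" & Text.From(_)))
in
    Result
```

Note that sliding left only removes gaps. A stray extra value such as the G in the third sample row still occupies a column, so rows like that would need a header-based match instead.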

Related

A better way to extract Subheading numbers using power query

I am attempting to extract section heading numbers from a column in excel using power query.
I have already achieved this by matching with an existing list. However, I wonder if there is a better way to achieve this in fewer steps.
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
#"Trimmed Text1" = Table.TransformColumns(Source,{{"Column1", PowerTrim, type text}}),
SectionNumbers = Table.AddColumn(#"Trimmed Text1", "SectionNumber", (x) => Text.Combine(Table.SelectRows(SectionNumbers, each Text.Contains(x[Column1],[SectionNumbers], Comparer.OrdinalIgnoreCase))[SectionNumbers],", ")),
#"Split Column by Delimiter2" = Table.SplitColumn(SectionNumbers, "SectionNumber", Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.None, true), {"SectionNumber.1", "SectionNumber.2"}),
#"Added Custom1" = Table.AddColumn(#"Split Column by Delimiter2", "Custom", each if [SectionNumber.2] = null then [SectionNumber.1] else [SectionNumber.2]),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom1",{"Column1", "Custom"})
in
#"Removed Other Columns"
The Section numbers being matched to can be generated using:
SectionNumbers
let
Source = {1..16},
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each {1..9}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Custom.1", each "."),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Added Custom1", {{"Custom", type text}}, "en-GB"),{"Custom", "Custom.1"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Merged Columns1" = Table.CombineColumns(Table.TransformColumnTypes(#"Merged Columns", {{"Column1", type text}}, "en-GB"),{"Column1", "Merged"},Combiner.CombineTextByDelimiter(".", QuoteStyle.None),"SectionNumbers")
in
#"Merged Columns1"
Essentially, I would like a way to extract any decimal section number at the start of a cell, whether 15.0., 15.0, or even 15.0.1, etc.
I have considered using regex, e.g. \d+\.\d+[.], which should work. However, I have many rows and find that regex can be computationally intensive in this case, so it takes much longer to load than the M code above.
Another power query method
Since you know your section numbers, you could:
Generate a (buffered) list of all the section numbers
Check whether the first space-separated part of the string in Column1 exists in that list.
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
//create list of all serial numbers
SerialNumbers = List.Buffer(
let
Part1 = List.Transform({1..16}, each Text.From(_) & "."),
Part2 =List.Transform({1..10}, each Text.From(_) & "."),
sn = List.Accumulate(Part1,{}, (state, current)=> state &
List.Generate(
()=>[s=current & Part2{0}, idx=0],
each [idx] < List.Count(Part2),
each [s=current & Part2{[idx]+1}, idx=[idx]+1],
each [s]))
in
sn),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each
let
x = Text.Split([Column1]," "){0}
in
if List.Contains(SerialNumbers,x) then x else null, type text)
in
#"Added Custom"
How about
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each if Text.PositionOfAny([Column1], {"0".."9"})>=0 then Text.BeforeDelimiter(Text.From([Column1])," ") else null)
in #"Added Custom"
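A slightly stricter variant of the same idea, offered as a sketch rather than a tested solution: take the first space-separated token and keep it only if it starts with a digit, contains a dot, and consists solely of digits and dots. The sample rows and the helper name ExtractSection are made up for illustration.

```powerquery
let
    // Hypothetical sample rows
    Source = Table.FromRows(
        {{"15.0. Scope"}, {"15.0.1 Details"}, {"Overview"}, {"3 items listed"}},
        {"Column1"}
    ),
    // Hypothetical helper: returns the leading section number, or null
    ExtractSection = (txt as nullable text) as nullable text =>
        let
            // First space-separated token (the whole text if there is no space)
            token = if txt = null then null else Text.BeforeDelimiter(Text.Trim(txt), " "),
            isSection =
                token <> null
                and token <> ""
                and List.Contains({"0" .. "9"}, Text.Start(token, 1))
                and Text.Contains(token, ".")
                // Nothing may remain once digits and dots are stripped out
                and Text.Remove(token, {"0" .. "9", "."}) = ""
        in
            if isSection then token else null,
    Result = Table.AddColumn(Source, "SectionNumber", each ExtractSection([Column1]), type nullable text)
in
    Result
```

This avoids building a lookup list, at the cost of accepting any digits-and-dots token (for example 99.99), so it only fits if the text never starts with other decimal numbers.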

Power Query: Expression.Error: There weren't enough elements in the enumeration to complete the operation. (LIST)

What I am trying to achieve is to obtain matches/pairs from two tables. One (source 1) is a data table with Date/Time and Pressure value columns; the other (source 2) has Date/Time and Info value columns. The second table holds so-called "pairs": a start and a stop at certain times. I want an exact match when the time is found in source 1, or an approximate match when it is not (seconds can be a problem).
Say you are matching two tables: give me everything that falls between, for instance, 15.01.2022 06:00:00 and 15.01.2022 09:15:29.
My problem is most likely with exact matching on seconds. The query skips or can't find a pair if the seconds do not match. So my question is: if the seconds don't match, how do I look up the next available match? It can be a minute off too, as long as it is within the given range (start/stop instances).
I think that is the reason I am getting this Expression error. Or is there a way to skip that error and proceed with the query?
Link to download the data:
https://docs.google.com/spreadsheets/d/1Jv5j7htAaEFktN0ntwOZCV9jesF43tEP/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true
The code below is what I am trying to do:
let
//Be sure to change the table names in the Source= and Source2= lines to be the actual table names from your workbook
Source = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = Excel.CurrentWorkbook(){[Name="Log_Original"]}[Content],
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "NextLineStart", each if Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) = true
and Text.Contains(#"Added Index"[Info]{[Index]+1},"start",Comparer.OrdinalIgnoreCase) = true
then "delete"
else null),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([NextLineStart] = null)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Index", "NextLineStart"}),
//create a list of all the relevant start/stop times
filterTimes = List.Combine(
List.Generate(
()=> [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{0},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{1}-#"Removed Columns1"[#"Date/Time"]{0})+1,
#duration(0,0,0,1)), IDX = 0],
each [IDX] < Table.RowCount(#"Removed Columns1"),
each [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{[IDX]+2},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{[IDX]+3}-#"Removed Columns1"[#"Date/Time"]{[IDX]+2})+1,
#duration(0,0,0,1)), IDX = [IDX]+2],
each [times]
)
),
//filter the table using the list
filterTimesCol = Table.FromList(filterTimes,Splitter.SplitByNothing()),
filteredTable = Table.Join(#"Changed Type","Date/Time",filterTimesCol,"Column1",JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(filteredTable,{"Column1"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each DateTime.ToText([#"Date/Time"],"dd-MMM-yy")),
#"Filtered Rows2" = Table.SelectRows(#"Added Custom1", each [#"Date/Time"] > #datetime(2019, 01, 01, 0, 0, 0)),
#"Sorted Rows" = Table.Sort(#"Filtered Rows2",{{"Date/Time", Order.Ascending}})
in
#"Sorted Rows"
I set up the query below to return a sorted table of all results between each start and end date/time. From there you can select the first, middle, or last row of each table as needed. It's hard to tell from your question whether you want the value closest to the start, closest to the end, or something in between. You can wrap my Table.Sort in Table.FirstN or Table.LastN to pick up the first or last row.
I left most of your starting code alone.
let Source = Table.Buffer(T1),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = T2,
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
// shift Info up one row for comparison
shiftedList = List.RemoveFirstN( #"Added Index"[Info],1),
custom1 = Table.ToColumns( #"Added Index") & {shiftedList},
custom2 = Table.FromColumns(custom1,Table.ColumnNames( #"Added Index") & {"NextInfo"}),
#"Filtered Rows2" = Table.SelectRows(custom2, each not (Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) and Text.Contains([NextInfo],"start", Comparer.OrdinalIgnoreCase))),
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Type", each if Text.Contains(Text.Lower([Info]),"start") then "start" else if Text.Contains(Text.Lower([Info]),"finished") then "finished" else null),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom3",{"Info", "NextInfo"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns2", "Custom", each if [Type]="start" then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Type]), "Type", "Date/Time"),
#"Added Custom2" = Table.AddColumn(#"Pivoted Column","Table",(i)=>Table.Sort(Table.SelectRows(T1, each [#"Date/Time"]>=i[start] and [#"Date/Time"]<=i[finished]),{{"Date/Time", Order.Ascending}}) , type table )
in #"Added Custom2"
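On the error message itself: "There weren't enough elements in the enumeration" is what M raises when positional access such as list{n} asks for an item past the end of a list. In the question's query, lookups like {[Index]+1} and {[IDX]+3} look like plausible places for that to happen on the last rows, though that is a guess without seeing the data. A minimal sketch, using made-up values, of how the optional form {n}? returns null instead of raising:

```powerquery
let
    // Hypothetical two-item list standing in for a column of start/stop times
    Times = {#datetime(2022, 1, 15, 6, 0, 0), #datetime(2022, 1, 15, 9, 15, 29)},
    // Plain positional access past the end raises an error; try catches it here
    Strict = try Times{2} otherwise "out of range",
    // Optional positional access yields null for a missing position
    Lenient = Times{2}?,
    Result = [PlainAccess = Strict, OptionalAccess = Lenient]
in
    Result
```

With null coming back instead of an error, the unmatched case still has to be handled explicitly, for example by filtering out null rows afterwards.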

Power Query - Group based on two filters

I'm working on a simplification of some of my reports by using Power Query.
I have a table with this structure:
In the first column we can see the Sales Order Number.
In the second column we can see the Message type, i.e. whether it is an XR or a PR message.
And in the last column we can find the Key for the status:
A = Active
B = Active
C = Closed
With this logic I would like to get this result table:
This way I can see as a cross table the type and the state per sales order.
How is this possible by using Power Query?
Are you sure about your results for Sales Order 103?
Try the following M code, amending where necessary (Source step, for example):
let
Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],
XRCol = Table.AddColumn(
Source,
"XR",
each Text.BeforeDelimiter([Message], "-") = "XR",
type logical
),
PRCol = Table.AddColumn(
XRCol,
"PR",
each Text.BeforeDelimiter([Message], "-") = "PR",
type logical
),
XRActCol = Table.AddColumn(PRCol, "XR Active", each [XR] and [Key] <> "C", type logical),
PRActCol = Table.AddColumn(XRActCol, "PR Active", each [PR] and [Key] <> "C", type logical),
RemoveCols = Table.RemoveColumns(PRActCol, {"Message", "Key"}),
Group = Table.Group(
RemoveCols,
{"Sales Order"},
{
{"XR", each List.Max([XR]), type logical},
{"PR", each List.Max([PR]), type logical},
{"XR Active", each List.Max([XR Active]), type logical},
{"PR Active", each List.Max([PR Active]), type logical}
}
)
in
Group
If I see it correctly, the PR and XR outputs are mixed up for Sales Order 103.
Try if this works for you:
let
Source = Excel.Workbook(File.Contents("C:\Users\maitting\Documents\Mappe1.xlsx"), null, true),
Tabelle1_Sheet = Source{[Item="Tabelle1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Tabelle1_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Sales Order", Int64.Type}, {"Message", type text}, {"Key", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Message", Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv), {"Message"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Message", type text}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type1","A","Active",Replacer.ReplaceText,{"Key"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","B","Active",Replacer.ReplaceText,{"Key"}),
#"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1","C","Closed",Replacer.ReplaceText,{"Key"}),
#"Merged Columns" = Table.CombineColumns(#"Replaced Value2",{"Message", "Key"},Combiner.CombineTextByDelimiter(" - ", QuoteStyle.None),"Merged"),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Merged Columns", {"Sales Order"}, "Attribute", "Value"),
#"Pivoted Column" = Table.Pivot(#"Unpivoted Columns", List.Distinct(#"Unpivoted Columns"[Value]), "Value", "Attribute", List.Count),
#"Added Conditional Column" = Table.AddColumn(#"Pivoted Column", "XR",
each if [#"XR - Active"] = 1 then 1
else if [#"XR - Closed"] = 1 then 1
else 0),
#"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "PR",
each if [#"PR - Active"] = 1 then 1
else if [#"PR - Closed"] = 1 then 1
else 0),
#"Removed Columns" = Table.RemoveColumns(#"Added Conditional Column1",{"XR - Closed", "PR - Closed"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Sales Order", "XR", "PR", "XR - Active", "PR - Active"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Reordered Columns",{{"XR", type logical}, {"PR", type logical}, {"XR - Active", type logical}, {"PR - Active", type logical}})
in
#"Changed Type2"
Output:
I've just added Sales order 104 with PR Closed to handle all possible cases.
My version
let Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],
RemoveEndingColumn = Table.TransformColumns(Source,{{"Message", each Text.BeforeDelimiter(_, "-"), type text}}),
#"Create MessageKey" = Table.AddColumn(RemoveEndingColumn, "MessageKey", each [Message]&" "&[Key]),
StackColumns = Table.RenameColumns(Table.SelectColumns(#"Create MessageKey",{"Sales Order", "Message"}),{{"Message", "Column"}})& Table.RenameColumns(Table.SelectColumns(#"Create MessageKey",{"Sales Order", "MessageKey"}),{{"MessageKey", "Column"}}),
#"Added Custom1" = Table.AddColumn(StackColumns, "TrueFalse", each true, type logical),
#"Pivoted Column" = Table.Pivot(#"Added Custom1", List.Distinct(#"Added Custom1"[Column]), "Column", "TrueFalse")
in #"Pivoted Column"

Excel Power Query - Count number of matching multiple columns

I have a datasource from an external Excel file that I have added to an Excel worksheet. I need to add new custom columns that compare the data to a table ("My_Table") in another worksheet that is manually updated. I used the Power Query Editor and created a new column that checks if there is a matching entry in My_Table based on matching 3 columns and gives a True/False result (ie for each row of the datasource, if the acctName, projectName, and boardName match a corresponding row in My_Table, then it returns true):
#"Added Custom" = Table.AddColumn(#"Reordered Columns", "Tracked", each Table.Contains( My_Table, [Customer=[acctName], Project=[projectName], Board=[boardName]]))
What I would like to do now is the exact same thing, but counting how many times those three columns match in "My_Table". I thought Table.RowCount would work, but I'm not sure that's the right way to do it, as I get either an error or a zero result.
dolomike, Here's another shot at it...
I started with this as Table1:
...and this as My_Table:
...and used this M code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Merged Queries" = Table.NestedJoin(Source, {"acctName", "projectName", "boardName"}, My_Table, {"Customer", "Project", "Board"}, "My_Table", JoinKind.LeftOuter),
#"Expanded My_Table" = Table.ExpandTableColumn(#"Merged Queries", "My_Table", {"Customer", "Project", "Board"}, {"My_Table.Customer", "My_Table.Project", "My_Table.Board"}),
#"Grouped Rows" = Table.Group(#"Expanded My_Table", {"My_Table.Customer", "My_Table.Project", "My_Table.Board"}, {{"Count", each Table.RowCount(_), type number}, {"AllData", each _, type table [acctName=text, projectName=text, boardName=text, My_Table.Customer=text, My_Table.Project=text, My_Table.Board=text]}}),
Custom2 = Table.TransformColumns(#"Grouped Rows",{"Count", each if _ = List.Max(#"Grouped Rows"[Count]) then 0 else _}),
#"Removed Other Columns" = Table.SelectColumns(Custom2,{"Count", "AllData"}),
#"Expanded AllData" = Table.ExpandTableColumn(#"Removed Other Columns", "AllData", {"acctName", "projectName", "boardName", "My_Table.Customer", "My_Table.Project", "My_Table.Board"}, {"acctName", "projectName", "boardName", "My_Table.Customer", "My_Table.Project", "My_Table.Board"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Expanded AllData",{"Count", "acctName", "projectName", "boardName"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns1",{"acctName", "projectName", "boardName", "Count"}),
#"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"acctName", "Customer"}, {"projectName", "Project"}, {"boardName", "Board"}}),
#"Removed Duplicates" = Table.Distinct(#"Renamed Columns")
in
#"Removed Duplicates"
...to get this result:
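For comparison, a more direct sketch that stays close to the Table.Contains column from the question: count the rows of My_Table that match the current row on all three columns. The two small tables are made-up stand-ins (in the real workbook My_Table would be its own query), and the column name MatchCount is a placeholder.

```powerquery
let
    // Hypothetical stand-ins for the two tables in the question
    My_Table = Table.FromRecords({
        [Customer = "Acme", Project = "P1", Board = "B1"],
        [Customer = "Acme", Project = "P1", Board = "B1"],
        [Customer = "Zeta", Project = "P2", Board = "B9"]
    }),
    Source = Table.FromRecords({
        [acctName = "Acme", projectName = "P1", boardName = "B1"],
        [acctName = "Omni", projectName = "P3", boardName = "B2"]
    }),
    // For each source row, count the My_Table rows matching all three columns
    Counted = Table.AddColumn(
        Source,
        "MatchCount",
        (row) => Table.RowCount(
            Table.SelectRows(
                My_Table,
                each [Customer] = row[acctName]
                    and [Project] = row[projectName]
                    and [Board] = row[boardName]
            )
        ),
        Int64.Type
    )
in
    Counted
```

This scans My_Table once per source row, so on large tables the merge-and-group approach above, or wrapping My_Table in Table.Buffer, may be the better choice.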

Feed cell value into excel query web browser URL

My problem:
Through New Query -> From Other Sources -> From Web, I entered a static URL that allowed me to load approximately 60k "IDs" from a webpage in JSON format.
I believe each of these IDs corresponds to an item.
So they're all loaded and organised in a column, with one ID per line, inside a Query tab.
For the moment, no problem.
Now I need to import information from a dynamic URL that depends on the ID.
So I need to import from URL in this form:
http://www.example.com/xxx/xxxx/ID
This imports the following for each ID:
name of the corresponding item,
average price,
supply,
demand,
etc.
After research I came to the conclusion that I had to use the "Advanced Editor" inside the query editor to reference the ID query tab.
However I have no idea how to put together the static part with the ID, and how to repeat that over the 60k lines.
I tried this:
let
Source = Json.Document(Web.Contents("https://example.com/xx/xxxx/" & ID)),
name1 = Source[name]
in
name1
This returns an error.
I think it's because I can't add a string and a column.
Question: How do I reference the value of the cell I'm interested in and add it to my string ?
Question: Is what I'm doing viable?
Question: How is Excel going to handle loading 60k queries?
Each query is only a few words to import.
Question: Is it possible to load information from 60k different URLs with one query?
EDIT: Thank you very much for the answer, Alexis, it was very helpful. To avoid copying what you posted, I did it without the function (tell me what you think of it):
let
Source = Json.Document(Web.Contents("https://example.com/all-ID.json")),
items1 = Source[items],
#"Converted to Table" = Table.FromList(items1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "ID"}}),
#"Inserted Merged Column" = Table.AddColumn(#"Renamed Columns", "URL", each Text.Combine({"http://example.com/api/item/", Text.From([ID], "fr-FR")}), type text),
#"Added Custom" = Table.AddColumn(#"Inserted Merged Column", "Item", each Json.Document(Web.Contents([URL]))),
#"Expanded Item" = Table.ExpandRecordColumn(#"Added Custom", "Item", {"name"}, {"Item.name"})
in
#"Expanded Item"
Now the problem I have is that it takes ages to load up all the information I need from all the URLs.
As it turns out, it's possible to extract multiple IDs at once using this format: http://example.com/api/item/ID1,ID2,ID3,ID4,...,IDN
I presume that loading from a URL containing all of the IDs at once would not work, because the URL would contain far too many characters to handle.
So to speed things up, what I'm trying to do now is concatenate every N rows into one cell, for example with N=3:
205
651
320165
63156
4645
31
6351
561
561
31
35
would become :
205, 651, 320165
63156, 4645, 31
6351, 561, 561
31, 35
The "Group By" functionality doesn't seem to be what I'm looking for, and I'm not sure how to automate that through Power Query.
EDIT 2
After a lot of testing I found a solution, even though it might not be the most elegant or optimal:
I created an index with a step of 1
I created another custom column that assigns the same number to every N rows, increasing by N for each group
I used "Group By" -> "All Rows" to create a "Count" column
I created a custom column "[Count][ID]"
Finally, I extracted the values from that column with a "," separator
Here's the code for N = 10 000 :
let
Source = Json.Document(Web.Contents("https://example.com/items.json")),
items1 = Source[items],
#"Converted to Table" = Table.FromList(items1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "ID"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"ID", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
#"Added Conditional Column" = Table.AddColumn(#"Added Index", "Custom", each if Number.RoundDown([Index]/10000) = [Index]/10000 then [Index] else Number.IntegerDivide([Index],10000)*10000),
#"Reordered Columns" = Table.ReorderColumns(#"Added Conditional Column",{"Index", "ID", "Custom"}),
#"Grouped Rows" = Table.Group(#"Reordered Columns", {"Custom"}, {{"Count", each _, type table}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom.1", each [Count][ID]),
#"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Custom.1", each Text.Combine(List.Transform(_, Text.From), ","), type text})
in
#"Extracted Values"
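The batching steps above can also be sketched more compactly with List.Split, which cuts a list into chunks of a given size. This is an illustrative sketch using the N=3 sample IDs from earlier in the post, not a drop-in replacement for the query above.

```powerquery
let
    // Sample IDs from the N=3 example
    IDs = {205, 651, 320165, 63156, 4645, 31, 6351, 561, 561, 31, 35},
    N = 3,
    // Cut the list into chunks of N items; the last chunk may be shorter
    Batches = List.Split(IDs, N),
    // Join each chunk into a single comma-separated text value
    Joined = List.Transform(Batches, each Text.Combine(List.Transform(_, Text.From), ", "))
in
    Joined
```

For building URLs, the separator would presumably be "," without the space, and Table.FromList(Joined, Splitter.SplitByNothing()) turns the result back into a one-column table.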
I think what you want to do here is create a custom function that you invoke with each of your ID values.
Let me give a similar example that should point you in the right direction.
Let's say I have a table named ListIDs which looks like this:
ID
----
1
2
3
4
5
6
7
8
9
10
and for each ID I want to pull some information from Wikipedia (e.g. for ID = 6 I want to lookup https://en.wikipedia.org/wiki/6 and return the Cardinal, Ordinal, Factorization, and Divisors of 6).
To get this for just one ID value my query would look like this (using 6 again):
let
Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/6")),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Column2] = "Cardinal" or [Column2] = "Divisors" or [Column2] = "Factorization" or [Column2] = "Ordinal")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Column1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "Property"}, {"Column3", "Value"}}),
#"Pivoted Column" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Property]), "Property", "Value")
in
#"Pivoted Column"
Now we want to convert this into a function so that we can use it as many times as we want without creating a bunch of queries. (Note: I've named this query/function WikiLookUp as well.) To do this, change it to the following:
let
WikiLookUp = (ID as text) =>
let
Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/" & ID)),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Column2] = "Cardinal" or [Column2] = "Divisors" or [Column2] = "Factorization" or [Column2] = "Ordinal")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Column1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "Property"}, {"Column3", "Value"}}),
#"Pivoted Column" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Property]), "Property", "Value")
in
#"Pivoted Column"
in
WikiLookUp
Notice that all we did was wrap it in another let...in and define the parameter ID as text, which gets substituted at the end of the Source line. The function should appear like this:
Now we can go back to our table which we've imported into the query editor and invoke our newly created function in a custom column. (Note: Make sure you convert your ID values to text type first since they're being appended to a URL.)
Add a custom column with the following definition (or use the Invoke Custom Function button)
= WikiLookUp([ID])
Expand that column to bring in all the columns you want and you're done!
Here's what that query's M code looks like:
let
Source = Excel.CurrentWorkbook(){[Name="ListIDs"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each WikiLookUp([ID])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Cardinal", "Ordinal", "Factorization", "Divisors"}, {"Cardinal", "Ordinal", "Factorization", "Divisors"})
in
#"Expanded Custom"
The query should look like this:

Resources