Trying to pull data from a SODA API into Excel - excel

The API call looks like this:
https://data.edmonton.ca/resource/3pdp-qp95.json?house_number=10008&street_name=103%20STREET%20NW
and returns data in json:
[{"account_number":"3070208","garage":"N","house_number":"10008","latitude":"53.539158992619","longitude":"-113.497760691896","neighbourhood":"DOWNTOWN","street_name":"103 STREET NW","tax_class":"Non Residential","total_asmt":"1717000"}]
I have an excel table with specific house_number and street_name pairs and I want to capture the total_asmt column for each pair.
I've been able to create a power query which pulls the very first data point into a new sheet:
let
Parameter = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Removed Other Columns" = Table.SelectColumns(Parameter,{"house_number", "street_name"}),
X = #"Removed Other Columns"[house_number]{0},
Y = #"Removed Other Columns"[street_name]{0},
Source = Json.Document(Web.Contents("https://data.edmonton.ca/resource/3pdp-qp95.json?house_number="& X &"&street_name=" & Y))
in
Source
I can't figure out how to iterate through all the values I have in X and Y, or how to capture specific rows from the JSON data. Any help would be appreciated!
Thanks,
Aaleem

I think your best bet is to not do it.
Why are you wasting your time scraping this data one address at a time when you could have the entire city's data in under a minute?
JSON: https://data.edmonton.ca/resource/3pdp-qp95.json
CSV: https://data.edmonton.ca/api/views/q7d6-ambg/rows.csv?accessType=DOWNLOAD
XML: https://data.edmonton.ca/api/views/q7d6-ambg/rows.xml?accessType=DOWNLOAD
...among others. Heck, they even have !
And when you're done with that one, they have a few hundred other interesting datasets.

The trick was to create a function inside Power Query and then call that function from the query built on my table. Create the function as below, then on the Data tab load your table with "From Table/Range"; from there it is pretty straightforward.
let a_value= (x as number,y as text)=> //this creates the function
let //this is essentially the query I wanted with some minor changes from above
x_text = Number.ToText(x, "D", ""),
Source = Json.Document(Web.Contents("https://data.edmonton.ca/resource/3pdp-qp95.json?house_number="&x_text&"&street_name="&y)),
Source1 = Source{0},
total_asmt = Source1[total_asmt]
in
total_asmt
in a_value //closes the function
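For anyone who wants the per-row call spelled out, here is a minimal sketch of how the function could be applied to every row of Table1. It assumes Table1 has a numeric house_number column and a text street_name column. The Query option and the empty-result guard are my additions, not part of the original function.

```powerquery
let
    // Same lookup as a_value above; parameters go through the Query option
    // so that Web.Contents URL-encodes the spaces in street_name
    a_value = (x as number, y as text) =>
        let
            Source = Json.Document(Web.Contents(
                "https://data.edmonton.ca/resource/3pdp-qp95.json",
                [Query = [house_number = Number.ToText(x, "D", ""), street_name = y]])),
            // An address with no match returns an empty list, so guard the {0} lookup
            total_asmt = if List.IsEmpty(Source) then null else Source{0}[total_asmt]
        in
            total_asmt,
    // Assumption: Table1 holds the house_number / street_name pairs
    Table1 = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],
    // One API call per row; the result lands in a new total_asmt column
    Result = Table.AddColumn(Table1, "total_asmt", each a_value([house_number], [street_name]))
in
    Result
```

If house_number is stored as text in your table, change the first parameter to text and drop the Number.ToText call.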

Related

Is there a way to use a loop to reiterate a table in Power Query?

I need a way to reiterate a table N amount of times. I have 1 main table (MainTable) which needs to be processed by each item in a List (ListofItems). It is processed in a function (Funct1) where it takes in 1 item in the ListofItems and transforms the MainTable into a new table (NewTable). But I want to iterate the NewTable through Funct1 again for the next item in the ListofItems.
For Example, the flow would be like this:
total N items in ListofItems,
MainTable ->Funct1(Item1) -> NewTable1
NewTable1 ->Funct1(Item2) -> NewTable2
...
NewTable(N-1) ->Funct1(ItemN) -> FinalTable
I have tried using List.Generate but I don't think this can produce a table, so far I could only manage lists.
The furthest I have gotten is with List.Accumulate, it is able to pass each list of items into the function. But I don't know how to make it pass the NewTable back into itself.
let
MainTable = #"MainTable",
Source2 = #"TableofList",
TotalRow = Table.RowCount(Source2),
ListofItem = Source2[Column1],
output =
List.Accumulate(
{0..TotalRow},
[Item=0,NewTable=MainTable],
(result, count) =>
(NewTable as table)=>
let
Item=ListofItem{count},
NewTable = Funct1(Item,NewTable)
in
NewTable
)
in
output
Please help, I feel like I'm so close to getting it right but I'm lost on what else I can do.
I have managed to solve my problem. I was already really close, I should have used NewTable as the result, and the MainTable as the seed. I'll leave it here in case anyone else needs something similar.
let
MainTable = #"MainTable",
Source2 = #"TableofList",
TotalRow = Table.RowCount(Source2),
ListofItem = Source2[Column1],
output =
List.Accumulate(
{0..(TotalRow-1)},
MainTable,
(NewTable, count) =>
let
Item=ListofItem{count},
NewTable = Funct1(Item,NewTable)
in
NewTable
)
in
output
For further information, Funct1 is actually a function to inner join a column on the main table to a definition table. As I have a few columns to join, I wanted a loop function. I wanted it to be dynamic as the columns to join change from project to project. But the structure is all the same. Each Item in the ListofItem would be the name of the column to join to a separate definition table of the same name.
I'm not sure if my approach is a good way to do it, but it's the way I could think of. I'm open to suggestions if there is a better way to do this.
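As a possible simplification, List.Accumulate can walk the list of items directly, so the row count and the index lookup are not needed. The sketch below is self-contained: MainTable, ListofItem and Funct1 are made-up stand-ins (the stand-in Funct1 just adds an empty column named after the item), so swap in the real ones.

```powerquery
let
    // Made-up stand-ins for the real MainTable, ListofItem and Funct1
    MainTable = #table({"Value"}, {{1}, {2}}),
    ListofItem = {"A", "B", "C"},
    Funct1 = (item as text, tbl as table) as table =>
        Table.AddColumn(tbl, item, each null),
    // state is the table returned by the previous step (MainTable on the first pass);
    // item is the current element of ListofItem
    output = List.Accumulate(ListofItem, MainTable, (state, item) => Funct1(item, state))
in
    output
```

Using a different name for the accumulator parameter (state) also avoids reusing NewTable for both the incoming and the outgoing table.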

Power Query (M) _ Dynamically update a column list for List.Sum function

I'm not sure if this is even possible, but the goal is to dynamically update a query based on the user selecting a date. I have a table in my Excel file which updates a value that feeds the PeriodString variable (below).
/*Parameter name = PeriodString */
let
Source = Excel.CurrentWorkbook(){[Name="PeriodString"]}[Content],
StrPeriod = Source[Value]{0}
in
StrPeriod
The part of the code I want to update is the [ ..months selected ].
=List.Sum({[FYOpening],[January],[February],[March],[April],[May]})
With the below variable
=List.Sum({PeriodStr})
I tried using Table.Column, as I realize I have to convert the value to a list of selectable columns, but I can't get it to work.
=List.Sum({Table.Column(PeriodString{0},PeriodString[0])})
Expression.Error: We cannot convert the value "[FY Opening],[Januar..." to type List.
Details:
Value=[FY Opening],[January],[February],[March],[April],[May]
Type=[Type]
Let me know if possible / alternatives.
If you need PeriodString to hold exactly a value like "[Col1],[Col2],[Col3]", then use code like this:
let
Source = #table({"a".."e"},{{1..5}, {6..10}}),
PeriodString = "[b],[d],[e]",
sum = Table.AddColumn(Source, "sum", each List.Sum(Expression.Evaluate("{"&PeriodString&"}", [_=_])))
in
sum
I'd prefer to use PQ list instead:
let
Source = #table({"a".."e"},{{1..5}, {6..10}}),
list = {"b","d","e"},
sum = Table.AddColumn(Source, "sum", each List.Sum(Record.ToList(Record.SelectFields(_, list))))
in
sum
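If the workbook cell has to keep the bracketed text form, one way to bridge the two approaches is to turn that text into a list of column names first. This is only a sketch: the sample table and its values are made up, and it assumes column names contain no commas.

```powerquery
let
    // Made-up sample data using column names from the question
    Source = #table({"FYOpening", "January", "February"}, {{10, 1, 2}, {20, 3, 4}}),
    PeriodString = "[FYOpening],[January]",
    // Split on commas, then strip brackets and stray spaces from each name
    ColumnNames = List.Transform(Text.Split(PeriodString, ","), each Text.Trim(_, {"[", "]", " "})),
    // Sum only the selected columns for each row (11 and 23 for the sample rows)
    sum = Table.AddColumn(Source, "sum", each List.Sum(Record.ToList(Record.SelectFields(_, ColumnNames))))
in
    sum
```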

Use value from a column as parameter for json request and combine the table

I am using Power Query to load some JSON data into a table (matches). I want to use a specific part of that data (fixture_id) as a parameter for another JSON request in another query (predictions), and then combine that output with my main (matches) table. Can anyone point me in the right direction on how to do this?
So here is my matches table:
And then in my fixtures query I have something like:
apiKey = Excel.CurrentWorkbook(){[Name="ApiKey"]}[Content]{0}[Column1],
fixtureID = "?",
Source = Json.Document(Web.Contents("https://v2.api-football.com/predictions/" & fixtureID, [Headers=[#"X-RapidAPI-Key"=apiKey]])),
If I hardcode the fixtureID, I get this output:
But I want to calculate it dynamically, and then merge the output into the matches table.
The first step is to turn your request into a function that accepts parameters. Put your request on a new blank query:
let
fnGetData = (fixtureID as text) =>
let
apiKey = Excel.CurrentWorkbook(){[Name="ApiKey"]}[Content]{0}[Column1],
// fixtureID now comes from the function parameter instead of being hardcoded
Source = Json.Document(Web.Contents("https://v2.api-football.com/predictions/"
& fixtureID, [Headers=[#"X-RapidAPI-Key"=apiKey]]))
in
Source
in
fnGetData
Rename it to fnGetData.
Then go to your table and click Add Column > Invoke Custom Function. Select fnGetData and choose your fixture_id column as the input parameter. If that column is numeric, change its type to text first, since the function expects text. This makes one request per row, and you'll just have to expand the results in the new column.
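In M, the same steps might look roughly like the sketch below. To keep it runnable without an API key, matches and fnGetData are replaced with made-up stand-ins, and the field name advice is purely illustrative, not taken from the real API response.

```powerquery
let
    // Made-up stand-in for the matches query
    matches = #table({"fixture_id", "home"}, {{101, "A"}, {102, "B"}}),
    // Offline stand-in for fnGetData; the real one calls the predictions endpoint
    fnGetData = (fixtureID as text) as record => [fixture = fixtureID, advice = "n/a"],
    // One call per row; Text.From converts a numeric id to the text the function expects
    WithPredictions = Table.AddColumn(matches, "predictions", each fnGetData(Text.From([fixture_id]))),
    // Pull the wanted fields out of the returned record into regular columns
    Expanded = Table.ExpandRecordColumn(WithPredictions, "predictions", {"advice"}, {"advice"})
in
    Expanded
```

With the real API the new column may hold nested lists or records, so the expand step will depend on the shape of the response.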

Power Query: Function to search a column for a list of keywords and return only rows with at least one match

I am making a simple Google-like search function in Power Query.
Let's say I have a column called Description in a table called Database. The user then inputs some search queries like "dog, cat, animals". I want to filter Database for rows that contain at least one of these keywords. The keywords can change each time, depending on what the user types in a named range in Excel.
I know you can filter a column in Power Query for multiple keywords, like this:
FilterRows = Table.SelectRows(LastStep, each Text.Contains([English], "dog") or Text.Contains([English], "cat")),
but those keywords are static, and the column is also static. I want to be able to control both the keywords and the column name as variables. I think I need to write a function but I am not sure how to start.
Your question requires several moving parts.
First, I would get the keywords from a named range "Keywords" into a table like this:
{KeywordTbl}
let
GetKeywords = if Excel.CurrentWorkbook(){[Name="Keywords"]}[Content]{0}[Column1] = null then null else Text.Split(Excel.CurrentWorkbook(){[Name="Keywords"]}[Content]{0}[Column1], ", "),
ConvertToTable = Table.FromList(GetKeywords,null,{"Keywords"})
in
ConvertToTable
Secondly, store the column name where you want to search in an Excel named range called "ColName". Then pull the named range into Power Query like this:
{ColName}
let
GetColName = Excel.CurrentWorkbook(){[Name="ColName"]}[Content]{0}[Column1]
in
GetColName
Then I would write a function that takes 4 variables, the table and column you want to look in, and the table and column containing the keywords:
{SearchColForKeywords}
(LookInTbl as table, KeywordTbl as table, LookInCol as text, KeywordCol as text) =>
let
RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin",
(Earlier) => Table.SelectRows(KeywordTbl,
each Text.Contains(Record.Field(Earlier, LookInCol), Record.Field(_, KeywordCol), Comparer.OrdinalIgnoreCase))),
ExpandRelativeJoin = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {KeywordCol}, {"Keywords found"}),
FilterRows = Table.SelectRows(ExpandRelativeJoin, each [Keywords found] <> null and [Keywords found] <> ""),
// Concatenate multiple found keywords into one line
GroupAllData = Table.Group(FilterRows, {"Word ID"}, {{"AllData", each _, type table [First column=text, Second column=text, ... your other columns=text]}}),
AddCol = Table.AddColumn(GroupAllData, "Keywords found", each [AllData][Keywords found]),
ExtractValues = Table.TransformColumns(AddCol, {"Keywords found", each Text.Combine(List.Transform(_, Text.From), ", "), type text}),
DeleteAllData = Table.RemoveColumns(ExtractValues,{"AllData"}),
MergeQueries = Table.NestedJoin(DeleteAllData, {"Word ID"}, FilterRows, {"Word ID"}, "DeleteAllData", JoinKind.LeftOuter),
ExpandCols = Table.ExpandTableColumn(MergeQueries, "DeleteAllData", {"First Col name", "Second col name", ... "Your Other column names here"}),
DeleteKeywordsFound = Table.RemoveColumns(ExpandCols,{"Keywords found"})
in
DeleteKeywordsFound
FYI, half of this function was developed by a user named lmkeF on the Power BI community. The full discussion is here. I merely improved on that solution.
Finally, I will use that function in another query like this:
StepName = SearchColForKeywords(MainTbl, KeywordTbl, ColName, "Keywords"),
You may customize the 4 variable names.
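If you only need the filtering and not the "Keywords found" column, a shorter variant is possible. Note that this is a different technique from the function above: it tests each row against the keyword list with List.AnyTrue instead of joining and grouping. Database, Keywords and ColName are made-up inline stand-ins for the queries described earlier.

```powerquery
let
    // Made-up stand-ins for the Database table, the keyword list and the column name
    Database = #table({"ID", "Description"}, {{1, "My dog barks"}, {2, "Blue sky"}, {3, "CAT food"}}),
    Keywords = {"dog", "cat", "animals"},
    ColName = "Description",
    // Keep a row when at least one keyword occurs in the chosen column (case-insensitive)
    FilterRows = Table.SelectRows(Database, (row) =>
        List.AnyTrue(List.Transform(Keywords, (kw) =>
            Text.Contains(Record.Field(row, ColName), kw, Comparer.OrdinalIgnoreCase))))
in
    FilterRows
```

For the sample data this keeps the rows with ID 1 and 3.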

Excel Power Query: Variables for Table Name

I'm trying to achieve something that seems like it should be fairly simple but I can't find an answer for... replace the name of a table or power query with a variable.
Currently trying to do this with a merge query so it would look something like this:
Table.NestedJoin(VARIABLE1,key1,VARIABLE2,key2,"Append",JoinKind.Inner)
Currently getting all sorts of errors no matter what I try...
Thank you!
// Edit:
Not really looking to do a function. I want this to be as easy as possible for users: they update a named table in the workbook, refresh, and get a table as output. Here is my current code; hopefully that'll help. My Region code replacements worked fine, but the Days replacements don't. I need each day (Monday-Thursday) to be replaced with my day variables (StartDay, Day2, etc.). Each of those is a separate text query that reads the Excel workbook inputs, and each should pull up a query based on that text (e.g. StartDay = Monday should pull the Monday query). This is the error I get; I assume it is reading the value as the text "Monday" and not as the query Monday.
Expression.Error: We cannot convert the value "Monday" to type Table.
Details:
Value=Monday
Type=Type
let
ANDOriginCode = OriginRegion,
ANDDestinationCode = DestinationRegion,
ANDStartDay = StartDay,
ANDDay2 = Day2,
ANDDay3 = Day3,
ANDDay4 = Day4,
ANDDay5 = Day5,
Source = Table.NestedJoin(Monday,{"Tuesday Destination Region Code"},Tuesday,{"Tuesday Origin Region Code"},"Append1 (3)",JoinKind.Inner),
#"Filtered Rows1" = Table.SelectRows(Source, each [Monday Origin Region Code] = OriginRegion),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows1",{"ID", "Pickup Day of Week", "Delivery Day of Week"}),
#"Expanded Append1 (3)" = Table.ExpandTableColumn(#"Removed Columns", "Append1 (3)", {"Tuesday Origin Region Code", "Wednesday Destination Region Code", "Tuesday Projected Number of Loads"}, {"Tuesday Origin Region Code", "Wednesday Destination Region Code", "Tuesday Projected Number of Loads"}),
#"Merged Queries" = Table.NestedJoin(#"Expanded Append1 (3)",{"Wednesday Destination Region Code"},Wednesday,{"Wednesday Origin Region Code"},"Append1 (4)",JoinKind.Inner),
#"Expanded Append1 (4)" = Table.ExpandTableColumn(#"Merged Queries", "Append1 (4)", {"Wednesday Origin Region Code", "Thursday Destination Region Code", "Wednesday Projected Number of Loads"}, {"Wednesday Origin Region Code", "Thursday Destination Region Code", "Wednesday Projected Number of Loads"}),
#"Merged Queries1" = Table.NestedJoin(#"Expanded Append1 (4)",{"Thursday Destination Region Code"},Thursday,{"Thursday Origin Region Code"},"Append1 (5)",JoinKind.Inner)
in
#"Merged Queries1"
This might help:
let
Source = (VARIABLE1 as table, VARIABLE2 as table) => Table.NestedJoin(VARIABLE1, Key1, VARIABLE2, Key2, "Append", JoinKind.Inner)
in
Source
You can use parameters for Key1 and Key2. The function will prompt you to select your tables.
You can invoke it from any other query with:
Function.Invoke(Merge,{Table1,Table2})
Replace Merge with whatever you named the first query above and replace Table1 and Table2 with your target tables.
In case you're thinking of it, I have not been able to figure out how to pass tables from parameters. When you do that, the value you enter is recognized as text--for instance, "Table" versus Table--so it won't work. I could not find any information on how to pass a table value, like Table, in a variable. Anyhow, I hope this helps at least a little.
I was searching for this, too!
I finally found it, thanks to Chris Webb at https://blog.crossjoin.co.uk/2015/02/06/expression-evaluate-in-power-querym/
The key is using Expression.Evaluate with #shared as the second argument.
If you define Query1 as
let
Source = 1 + 1
in
Source
Query2 as
let
Source = 15 * 10
in
Source
define pIndex as a parameter that is "1" or "2", and
define QuerySwitch as
Expression.Evaluate("Query" & pIndex, #shared)
then QuerySwitch will return
2 when pIndex is "1"
150 when pIndex is "2"
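Applied to the day-of-week problem in the question, the same idea might look like the sketch below. To keep it runnable on its own, it passes an explicit environment record instead of #shared. MondayQuery, TuesdayQuery and the sample values are made up.

```powerquery
let
    // Made-up stand-ins for the Monday and Tuesday queries
    MondayQuery = #table({"Origin Region Code", "Loads"}, {{"A", 1}}),
    TuesdayQuery = #table({"Origin Region Code", "Loads"}, {{"B", 2}}),
    // Text value as it would arrive from the workbook input
    StartDay = "Tuesday",
    // The environment record maps the names the text may contain to the tables
    Environment = [Monday = MondayQuery, Tuesday = TuesdayQuery],
    // Evaluates the text "Tuesday" as an identifier and returns TuesdayQuery
    Selected = Expression.Evaluate(StartDay, Environment)
in
    Selected
```

In a workbook where Monday and Tuesday are real queries, Expression.Evaluate(StartDay, #shared) would resolve the name against them instead. Record.Field(Environment, StartDay) gives the same result here without evaluating text as code.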
My example:
I have a query QueryThatTakesFiveMinutes that
other queries use, and
writes to an Excel table (also named "QueryThatTakesFiveMinutes")
If I define a query "QueryThatTakesFiveMinutes Cached" by moving my cursor to the output QueryThatTakesFiveMinutes table in Excel and creating a new query from that table, then when I'm testing I can change all the queries that use QueryThatTakesFiveMinutes to use #"QueryThatTakesFiveMinutes Cached" instead and test downstream computation without waiting five minutes every time. Then I just need to remember to change it back when I'm ready.
But that was annoying.
I created a named range in Excel called "ProductionMode" that pointed to a specific cell that holds a value of either TRUE or FALSE
In Power Query, I defined a very handy function called fGetNamedCellValue as
(rangeName as text) => Excel.CurrentWorkbook(){[Name=rangeName ]}[Content]{0}[Column1]
so that I can define a "ProductionMode" query as
fGetNamedCellValue("ProductionMode")
I use this in a way that's similar to the Index parameter above, but this way I can edit it via Excel.
When I defined "modeQueryThatTakesFiveMinutes" as
if ProductionMode then QueryThatTakesFiveMinutes else #"QueryThatTakesFiveMinutes Cached"
and changed all queries that use QueryThatTakesFiveMinutes to use modeQueryThatTakesFiveMinutes instead, I was very surprised to find that both QueryThatTakesFiveMinutes and #"QueryThatTakesFiveMinutes Cached" were evaluated and it didn't save any time at all!
So then after searching, being overjoyed to find your question only to realize it wasn't answered, then finding Chris Webb's article, I tried redefining modeQueryThatTakesFiveMinutes as
Expression.Evaluate(
if ProductionMode then
"QueryThatTakesFiveMinutes"
else
"#""QueryThatTakesFiveMinutes Cached""",
#shared
)
Unfortunately, instead of working, I got an error of
Formula.Firewall: Query 'modeQueryThatTakesFiveMinutes' references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.
However, I found a way around this, too, by putting the offending code within a function that the consuming query executes.
Deleting ProductionMode and defining a new query fProductionMode of
() => fGetNamedCellValue("ProductionMode") as logical
now doesn't return true or false, it returns a function that will return true or false when evaluated. Why is one legal and the other isn't? I don't know, but it is! Change the definition of modeQueryThatTakesFiveMinutes to
Expression.Evaluate(
if fProductionMode() then
"QueryThatTakesFiveMinutes"
else
"#""QueryThatTakesFiveMinutes Cached""",
#shared
)
and it works!

Resources