Problem Summary
I have a column in my Power Query table that contains a custom linked data type. I don't want to create a custom linked data type whose fields are all null. Instead, if every value contained in the custom data type is null, I would like the value in the column to be null.
Background
I have a table which holds API response JSON text. This JSON text contains a list of search results (also in JSON), representing movies which match search criteria delivered in the request. There can be any number of search results, including zero. Using Power Query M, I parse these JSON texts with the built-in parser, which generates a list containing one record per search result. I then extract the first record in the list, expand that record into new columns, and combine those new columns into a custom data type.
Example
Here is an example query simulating only the problem area of my query. This example is self-contained and reproduces my issue exactly.
let
// These two variables hold the API response JSON text obtained from calls to Web.Contents().
// I've eliminated the actual calls in this example because that part of my query works fine.
Search_Fast_and_Furious_Response =
"{ ""total-results"":""2"", ""results"":[
{ ""title"":""Fast & Furious"", ""year"":""2009"" },
{ ""title"":""The Fast and the Furious"", ""year"":""2001"" } ] }",
Search_mmmmm_Response =
"{ ""total-results"":""0"", ""results"":[] }",
// Create the table to hold the response text.
Source = Table.FromRecords( { [#"API Response"=Search_Fast_and_Furious_Response],
[#"API Response"=Search_mmmmm_Response] }),
// Parse the JSON and put the output (a record) in a new column.
#"Insert Parsed JSON" = Table.AddColumn(Source, "JSON", each Json.Document([API Response])),
// Expand the record in the parsed JSON column. Each field in the record becomes a new column.
#"Expand JSON" = Table.ExpandRecordColumn(#"Insert Parsed JSON", "JSON",
{"total-results", "results"}, {"Result Count", "Results List"}),
// Add a new column to hold the first search result in the response's results list.
// This is also a record, like the parsed JSON two steps ago.
#"Add Result #1 Column" = Table.AddColumn(#"Expand JSON", "Result #1", each
try _[Results List]{0}
otherwise null), // In case the list is empty
// Expand the record in the Result #1 column.
#"Expand Result #1" = Table.ExpandRecordColumn(#"Add Result #1 Column", "Result #1",
{"title", "year"}, {"Title", "Year"}),
// Combine the newly expanded columns into a single column.
// Make the Display Name be the value in the Title field/column,
// and make the Type Name be "Excel.DataType".
// This is what creates the custom linked data type.
#"Combine Result #1" = Table.CombineColumnsToRecord(#"Expand Result #1", "Result #1",
{"Title", "Year"}, [ DisplayNameColumn = "Title", TypeName="Excel.DataType" ])
in
#"Combine Result #1"
The record in the very last line before the in statement, i.e. the fourth parameter to the Table.CombineColumnsToRecord function, allows the combined record to be used as a custom data type with Excel's new linked data feature. I'm not certain, but I believe Power Query/Excel stores these as records with additional metadata, such as DisplayNameColumn and TypeName (the latter of which I'm sure is the most important part).
Problem and Goal
Here is the resulting table created by the example query. The bottom-right cell is selected, and its contents are shown at the bottom of the image. The cell itself contains a value: a record with all of its fields set to null. Because the Title field is null, the record's display text is "null."
This next picture shows my desired output. Notice the bottom-right cell again. This time, the cell is empty. It no longer contains a record whose fields are all null; it contains nothing, so this view displays null, italicized to indicate an actual null value as opposed to the word "null." (Note: I've been unable to change the "null" cell in the first image to a literal null value, so to demonstrate, I simply added a new column of null values.)
Unfortunately, because of the otherwise clause after the try, the "Result #1" column may contain null if the API returned zero search results. If this value is null in any row, then all of the new columns created by #"Expand Result #1" will contain null in that row as well. Finally, when all those null values are combined in the last step, I'm left with a record like [Title = null, Year = null]. What I hope to achieve instead is a single null value (of type null) in that cell.
Efforts So Far
I have tried the Table.ReplaceValue function, passing null as the new value and many different values as the old value (the one to be replaced), such as a new record with all null fields. All those attempts have either been syntactically incorrect or produced unwanted results. I have also tried the "Replace Values" option in the Power Query GUI, with the same outcome. In case Table.ReplaceValue didn't like nulls, I've also tried using a different value in the otherwise clause, such as the text value "N/A", and then doing the replacement on that value instead. This yielded the same result.
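For illustration, a typical attempt looked something like this (a minimal sketch of one variant; the record literal passed as the old value is just one of several values I tried):
// Attempted after the #"Combine Result #1" step; this did not yield the desired null.
#"Replace Attempt" = Table.ReplaceValue(
    #"Combine Result #1",
    [Title = null, Year = null], // old value: a record with all null fields
    null,                        // new value
    Replacer.ReplaceValue,
    {"Result #1"}
)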
Conclusion
Is there any way I can replace a record (one filled with null values, stored in a column of records) with a single null value? The linked data type feature is a high priority in this situation, so I would prefer a solution that retains it (though of course all solutions are welcome).
I have "solved" my problem. While not technically a solution to the question I posted, I've achieved the desired result using a workaround.
Instead of dealing with the record full of null fields, I ensure that record is never converted to the custom data type in the first place. I do this by splitting the table immediately after extracting the first list item from the Results List column, before expanding that extracted item: rows where the extracted item is null go into a new table (which I call the Null Table), and those rows are removed from the original (which I call the Non-Null Table). I perform the regular operations on the Non-Null Table to create the custom linked data type for only those rows that were not null. Afterward, I merge the two tables back together.
The full code containing the solution with my representative example is below, with new steps "highlighted" with non-indented comments.
let
// These two variables hold the API response JSON text obtained from calls to Web.Contents().
// I've eliminated the actual calls in this example because that part of my query works fine.
Search_Fast_and_Furious_Response =
"{ ""total-results"":""2"", ""results"":[
{ ""title"":""Fast & Furious"", ""year"":""2009"" },
{ ""title"":""The Fast and the Furious"", ""year"":""2001"" } ] }",
Search_mmmmm_Response =
"{ ""total-results"":""0"", ""results"":[] }",
// Create the table to hold the response text.
Source = Table.FromRecords( { [#"API Response"=Search_Fast_and_Furious_Response],
[#"API Response"=Search_mmmmm_Response] }),
// Parse the JSON and put the output (a record) in a new column.
#"Insert Parsed JSON" = Table.AddColumn(Source, "JSON", each Json.Document([API Response])),
// Expand the record in the parsed JSON column. Each field in the record becomes a new column.
#"Expand JSON" = Table.ExpandRecordColumn(#"Insert Parsed JSON", "JSON",
{"total-results", "results"}, {"Result Count", "Results List"}),
// Add a new column to hold the first search result in the response's results list.
// This is also a record, like the parsed JSON two steps ago.
#"Add Result #1 Column" = Table.AddColumn(#"Expand JSON", "Result #1", each
try _[Results List]{0}
otherwise null), // In case the list is empty
// New step
// Filter down to only rows with null in the new column. Save this new table for later.
#"Filter In Null" = Table.SelectRows(#"Add Result #1 Column", each _[#"Result #1"] = null),
// New step
// Filter down to only rows with NOT null in the new column.
#"Filter Out Null" = Table.SelectRows(#"Add Result #1 Column", each _[#"Result #1"] <> null),
// Expand the record in the Result #1 column.
#"Expand Result #1" = Table.ExpandRecordColumn(#"Filter Out Null", "Result #1",
{"title", "year"}, {"Title", "Year"}),
// Combine the newly expanded columns into a single column.
// Make the Display Name be the value in the Title field/column,
// and make the Type Name be "Excel.DataType".
// This is what creates the custom linked data type.
#"Combine Result #1" = Table.CombineColumnsToRecord(#"Expand Result #1", "Result #1",
{"Title", "Year"}, [ DisplayNameColumn = "Title", TypeName="Excel.DataType" ]),
// New step
// Convert the Null Table into a list of records.
#"Convert Table" = Table.ToRecords(#"Filter In Null"),
// New step
// Append the Null Table from earlier to the main table.
#"Combine Tables" = Table.InsertRows(#"Combine Result #1", Table.RowCount(#"Combine Result #1"),
#"Convert Table")
in
#"Combine Tables"
Is there a way to get the index of the results within an AQL query?
Something like
FOR user IN Users SORT user.age DESC RETURN { id: user._id, order: {index?} }
If you want to enumerate the result set and store these numbers in an attribute order, then this is possible with the following AQL query:
LET sorted_ids = (
FOR user IN Users
SORT user.age DESC
RETURN user._key
)
FOR i IN 0..LENGTH(sorted_ids)-1
UPDATE sorted_ids[i] WITH { order: i+1 } IN Users
RETURN NEW
A subquery is used to sort users by age and return an array of document keys. Then a loop over a numeric range from the first to the last index of that array is used to iterate over its elements, which gives you the desired order value (minus 1) as the variable i. The current array element is a document key, which is used to update the user document with an order attribute.
The above query can be useful for a one-off computation of an order attribute. If your data changes a lot, however, the stored values will quickly become stale, and you may want to move this logic to the client side.
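If you only need the position within the query result, without persisting it, the same pattern works with a plain RETURN in place of the UPDATE (a sketch):
LET sorted_ids = (
    FOR user IN Users
        SORT user.age DESC
        RETURN user._id
)
FOR i IN 0..LENGTH(sorted_ids)-1
    RETURN { id: sorted_ids[i], order: i + 1 }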
For a related discussion see AQL: Counter / enumerator
If I understand your question correctly (and feel free to correct me), this is what you're looking for:
FOR user IN Users
SORT user.age DESC
RETURN {
id: user._id,
order: user._key
}
The _key is the primary key in ArangoDB.
If, however, you're looking for the order in which data was entered (chronological order), then you will have to set the key on your inserts and/or create a date/time attribute and filter using that.
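For the date/time approach, a query along these lines should work (a sketch; the createdAt attribute is hypothetical and assumes you store a timestamp with every insert, with @from and @to as bind parameters):
FOR user IN Users
    FILTER user.createdAt >= @from AND user.createdAt < @to
    SORT user.createdAt ASC
    RETURN { id: user._id, created: user.createdAt }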
Edit:
Upon doing some research, I believe this link might be of use to you for auto-incrementing the keys: https://www.arangodb.com/2013/03/auto-increment-values-in-arangodb/
Given a CouchDB view that emits keys of the following format:
[ "part1", { "property": "part2" } ]
How can you find all documents with a given value for part1?
If part2 were a simple string rather than an object, startkey=["part1"]&endkey=["part1",{}] would work. The CouchDB docs state the following:
The query startkey=["foo"]&endkey=["foo",{}] will match most array keys with "foo" in the first element, such as ["foo","bar"] and ["foo",["bar","baz"]]. However it will not match ["foo",{"an":"object"}]
Unfortunately, the documentation doesn't offer any suggestion on how to deal with such keys.
The second element of your endkey value needs to be an object that collates after any possible value of the second element of your key. Objects are compared property by property (for example, {"a":1} < {"a":2} < {"b":1}), so the best way to do this is to set the first property name in your endkey to a very large value:
startkey=["part1"]&endkey=["part1", { "\uFFF0": false }]
The property name \uFFF0 should collate after any other property name in the second key element, and this works even when the second element is an empty object or has more than one property.
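When sending the query over HTTP, both keys must be JSON-encoded and then URL-encoded. In JavaScript that might look like this (a sketch; the view URL is illustrative):
var startkey = encodeURIComponent(JSON.stringify(["part1"]));
var endkey = encodeURIComponent(JSON.stringify(["part1", { "\uFFF0": false }]));
var query = "?startkey=" + startkey + "&endkey=" + endkey;
// append `query` to the view URL, e.g. /db/_design/app/_view/by_part + query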
$queryBuilder
->add('select', 'd.item')
->add('from', 'Entities:TypeDetail d')
->add('where', $queryBuilder->expr()->andx(
$queryBuilder->expr()->gt('d.dateValue', $dates['start']),
$queryBuilder->expr()->lt('d.dateValue', $dates['end']))
);
TypeDetail (the table) has the following fields:
id, dateValue, itemId
And the model in Symfony is:
id, dateValue, item (object)
What I want to do is get a result containing only the item objects. I don't want the item id or any of the date values (while I do need them to filter the query, I don't actually care about them coming back in the response).
Is that possible? Obviously d.item as the select is not working!
Cheers
I have a Couchdb database with documents of the form: { Name, Timestamp, Value }
I have a view that shows a summary grouped by name with the sum of the values. This is a straightforward reduce function.
Now I want to filter the view to only take into account documents whose timestamp falls in a given range.
AFAIK this means I have to include the timestamp in the emitted key of the map function, e.g. emit([doc.Timestamp, doc.Name], doc).
But as soon as I do that, the reduce function no longer sees the rows grouped together to calculate the sum. If I put the name first, I can group at level 1 only, but how do I filter at level 2?
Is there a way to do this?
I don't think this is possible with only one HTTP fetch and/or without additional logic in your own code.
If you emit([time, name]), you would be able to query startkey=[timeA]&endkey=[timeB]&group_level=2 to get items between timeA and timeB, grouped where their timestamp and name are identical. You could then post-process this to add up the values whenever the names match, but the initial result set might be larger than you want to handle.
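That post-processing might look like this (a JavaScript sketch; rows stands for the rows array of the view response):
var totals = {};
rows.forEach(function (row) {
    var name = row.key[1]; // the emitted key is [time, name]
    totals[name] = (totals[name] || 0) + row.value;
});
// totals now maps each name to its sum over the requested time range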
An alternative would be to emit([name, time]). Then you could first query with group_level=1 to get a list of names (if your application doesn't already know what they'll be). Then for each one of those you would query startkey=[nameN, timeA]&endkey=[nameN, timeB]&group_level=1 to get the summary for that name restricted to the time range.
(Note that in my query examples I've left the JSON start/end keys unencoded to make them more human-readable, but in actual use you'll need to apply your language's equivalent of JavaScript's encodeURIComponent to them.)
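For example, building the per-name query URL from the second approach might look like this (a sketch; the database, design document, and view names are hypothetical):
var name = "alice";              // one of the names from the group_level=1 query
var timeA = 1000, timeB = 2000;  // illustrative timestamp bounds
var url = "http://localhost:5984/mydb/_design/app/_view/summary" +
    "?group_level=1" +
    "&startkey=" + encodeURIComponent(JSON.stringify([name, timeA])) +
    "&endkey=" + encodeURIComponent(JSON.stringify([name, timeB]));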
You cannot make a view onto a view. You would need to write another map-reduce view that applies the filtering in the map step and does the grouping by name. Something like:
map:
function (doc) {
    // start and end would have to be hard-coded into the view code,
    // which is why this only works as an ad-hoc query.
    if (doc.Timestamp > start && doc.Timestamp < end) {
        emit(doc.Name, doc.Value);
    }
}
reduce:
function (keys, values, rereduce) {
    return sum(values);
}
I suppose you cannot store this view permanently, and will have to issue it as an ad-hoc (temporary) query from your application.